MUSHclient documentation: dialog IDD_LUA

Global Replace

This is for doing a powerful "global replace" using Lua's string.gsub function.

The selected text (or all text in the window if none is selected) will be sent to the string.gsub function, with the specified "find" and "replacement" text being used for the function, like this:


new_selection = string.gsub (old_selection, find_text, replacement_text)

If there is an error raised by string.gsub (eg. bad regular expression) then no text is replaced.
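
As a rough illustration of the call above and of the error rule, the following sketch wraps string.gsub in pcall. The helper name safe_replace is hypothetical and is not part of MUSHclient; it only mimics the behaviour described here, where a bad pattern leaves the text untouched.

```lua
-- Hypothetical helper (an assumption, not MUSHclient's own code):
-- apply string.gsub, but keep the original text if the pattern is rejected.
local function safe_replace (old_selection, find_text, replacement_text)
  local ok, new_selection, count =
    pcall (string.gsub, old_selection, find_text, replacement_text)
  if not ok then
    -- on failure, the second value returned by pcall is the error message
    return old_selection, 0, new_selection
  end
  return new_selection, count
end

-- a valid pattern: good is "hell0 w0rld", good_count is 2
local good, good_count = safe_replace ("hello world", "o", "0")

-- a malformed pattern (missing "]"): the text comes back unchanged
local bad, bad_count, bad_error = safe_replace ("hello world", "[a-z", "x")
```

Note that string.gsub returns two values, the new string and the number of replacements made.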

See the documentation for string.gsub in Lua for the exact workings of this function. Also see the forum postings:


http://www.gammon.com.au/forum/?id=6034
http://www.gammon.com.au/forum/?id=6138

The first one describes string.gsub in detail. The second one expands on the uses to which you can put this dialog box.

Find Pattern

What to search for. This is a Lua pattern (Lua's own style of regular expression), which differs in places from normal MUSHclient regular expressions.

The standard patterns you can search for are:


 . --- (a dot) represents all characters. 
%a --- all letters. 
%c --- all control characters. 
%d --- all digits. 
%l --- all lowercase letters. 
%p --- all punctuation characters. 
%s --- all space characters. 
%u --- all uppercase letters. 
%w --- all alphanumeric characters. 
%x --- all hexadecimal digits. 
%% --- a single '%' character.
%1 --- captured pattern 1.
%2 --- captured pattern 2 (and so on).
%f[s]  transition from not in set 's' to in set 's'.
%b()   balanced pair ( ... )

Important - the uppercase versions of the above represent the complement of the class. eg. %U represents everything except uppercase letters, %D represents everything except digits.
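
To make the classes and their complements concrete, here is a small sketch in plain Lua. The sample text and variable names are made up for illustration.

```lua
local line = "Gold: 1500, HP 42/100"

-- %d+ matches each run of digits
local masked = line:gsub ("%d+", "#")        -- "Gold: #, HP #/#"

-- %D is the complement of %d, so this deletes everything except digits
local digits_only = line:gsub ("%D", "")     -- "150042100"

-- %U is the complement of %u, so only the upper-case letters survive
local capitals = line:gsub ("%U", "")        -- "GHP"
```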

There are some "magic characters" (such as %) that have special meanings. These are:


^ $ ( ) % . [ ] * + - ?

If you want to use those in a pattern (as themselves) you must precede them by a % symbol.

eg. %% would match a single %
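
If the text you want to find contains several magic characters, escaping them by hand is error-prone. The sketch below shows one way to do it in Lua. The helper name escape_magic is hypothetical, and %0 in the replacement is standard Lua for "the whole match".

```lua
-- Hypothetical helper: put a % in front of every magic character so the
-- text can be used as a literal "find" pattern.
local function escape_magic (text)
  return (text:gsub ("[%^%$%(%)%%%.%[%]%*%+%-%?]", "%%%0"))
end

local subject = "Cost: 3.50 (50% off), was 3x50"

-- literal is "3%.50 %(50%% off%)"
local literal = escape_magic ("3.50 (50% off)")

-- escaped: only the exact text is replaced
local escaped_result = subject:gsub (literal, "SALE")

-- unescaped: the dot matches any character, so "3x50" is replaced as well
local unescaped_result = subject:gsub ("3.50", "N")
```

Here escaped_result is "Cost: SALE, was 3x50", whereas unescaped_result is "Cost: N (50% off), was N".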

As with normal MUSHclient regular expressions you can build your own pattern classes by using square brackets, eg.


[abc] ---> matches a, b or c
[a-z] ---> matches lowercase letters (same as %l)
[^abc] ---> matches anything except a, b or c
[%a%d] ---> matches all letters and digits
[%a%d_] ---> matches all letters, digits and underscore
[%[%]] ---> matches square brackets (had to escape them with %)
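
A short sketch of the sets listed above, using an invented line of text. In the replacement, %0 is standard Lua for "the whole match".

```lua
local text = "set max_hp=100 [done]"

-- [%a%d_]+ : runs of letters, digits and underscores
local tagged = text:gsub ("[%a%d_]+", "<%0>")

-- [%[%]] : square brackets (escaped with %)
local no_brackets = text:gsub ("[%[%]]", "")

-- [^%a%d_]+ : runs of anything that is NOT a letter, digit or underscore
local only_words = text:gsub ("[^%a%d_]+", " ")
```

This gives "<set> <max_hp>=<100> [<done>]", "set max_hp=100 done" and "set max_hp 100 done " (with a trailing space) respectively.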

The repetition characters are:


+  ---> 1 or more repetitions (greedy)
*  ---> 0 or more repetitions (greedy)
-  ---> 0 or more repetitions (non greedy)
?  ---> 0 or 1 repetition only
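
The difference between the greedy and non-greedy forms is easiest to see side by side. The following is a minimal sketch with made-up sample text.

```lua
local text = "<b>bold</b> and <i>italic</i>"

-- greedy: .* runs from the first "<" to the LAST ">", so one big match
local greedy, greedy_count = text:gsub ("<.*>", "")

-- non-greedy: .- stops at the FIRST ">", so each tag is matched separately
local lazy, lazy_count = text:gsub ("<.->", "")

-- ? makes the preceding character optional
local spelling = ("color colour"):gsub ("colou?r", "hue")
```

Here greedy is the empty string (1 replacement), lazy is "bold and italic" (4 replacements), and spelling is "hue hue".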

The standard "anchor" characters apply:


^  ---> anchor to start of subject string
$  ---> anchor to end of subject string
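
As an example of the anchors, this sketch trims leading and trailing spaces, and shows that an anchored pattern matches at most once. The sample strings are invented.

```lua
local raw = "   look north   "

-- ^ ties the match to the start of the subject string
local no_leading = raw:gsub ("^%s+", "")       -- "look north   "

-- $ ties the match to the end of the subject string
local trimmed = no_leading:gsub ("%s+$", "")   -- "look north"

-- without an anchor every occurrence is replaced ...
local everywhere = ("go go go"):gsub ("go", "run")    -- "run run run"

-- ... with ^ only the occurrence at the very start is replaced
local first_only = ("go go go"):gsub ("^go", "run")   -- "run go go"
```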

You can also use round brackets to specify "captures", similar to normal MUSHclient regular expressions:


You see (.*) here

Here, whatever matches (.*) becomes the first capture. This can be referred to in the replacement box as %1.

You can also refer back to an earlier capture inside the Find pattern itself. For example:


You see (.*) and %1 here

This would match "You see fish and fish here", but not "You see fish and chips here".
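
The two uses of a capture described above can be tried out in plain Lua as follows. The item text is invented for the example.

```lua
-- capture used in the replacement: %1 is whatever (.*) matched
local item = ("You see a rusty sword here"):gsub ("You see (.*) here", "Item: %1")

-- capture referred to inside the pattern: %1 must repeat the first capture
local pattern = "You see (.*) and %1 here"
local same = ("You see fish and fish here"):match (pattern)        -- "fish"
local different = ("You see fish and chips here"):match (pattern)  -- nil
```

Here item is "Item: a rusty sword".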

Note that carriage-return characters (\r) are removed from the text before it is submitted for find-and-replace, so they will never appear in the text being matched. They are put back before the text is redisplayed in the notepad window.

(...) (edit find text)

Click on the button marked "..." to open a larger dialog box for inputting the "find" regular expression more easily.

Replacement

This is the replacement text for each match, or in the case of "Call Function" (described below), the function name to be called.

For every match of the regular expression in the Find Pattern box, the replacement text will be substituted. If you use %1, %2, etc. they will be replaced by the contents of the first/second capture pattern etc.

To replace with % literally, enter %%.
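
A minimal sketch of how %1, %2 and %% behave in the replacement text, using invented sample strings:

```lua
-- %1 and %2 stand for the first and second captures
local swapped = ("north south"):gsub ("(%a+) (%a+)", "%2 %1")   -- "south north"

-- %% in the replacement produces a single literal %
local percent = ("50 percent"):gsub ("(%d+) percent", "%1%%")   -- "50%"
```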

(...) (edit replacement text)

Click on the button marked "..." to open a larger dialog box for inputting the "replacement" text more easily.

Line By Line

If checked, each individual line of the selection is sent to string.gsub as an individual item. If unchecked, the entire selection is processed as a batch.

The main difference will be the processing of line breaks.

If you check "Line By Line":

There will never be a newline character in the text to be matched (as that is the linebreak character)
The symbol ^ represents the start of each line
The symbol $ represents the end of each line
You cannot write a find/replace that will join lines together
It will be slower to execute (particularly on large blocks of text) because the string.gsub function has to be called for each line

If you uncheck "Line By Line":

There may be newline characters in the text to be matched (as that is the linebreak character). You can test for these with \n, if you check "Escape Sequences"
The symbol ^ represents the start of the entire selection
The symbol $ represents the end of the entire selection
You join lines together by searching for \n and replacing it with something else
It will be faster to execute
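
The difference between the two modes can be imitated in plain Lua. This is only a sketch: replace_line_by_line is a hypothetical stand-in for what the checkbox does, not MUSHclient's actual code. In Lua source the "\n" is interpreted by Lua itself, whereas in the dialog you would need "Escape Sequences" checked.

```lua
-- Hypothetical stand-in for "Line By Line": split on \n, run string.gsub
-- on each line separately, then join the lines again.
local function replace_line_by_line (text, find, repl)
  local lines = {}
  -- the extra "\n" makes sure the last line is picked up too
  for line in (text .. "\n"):gmatch ("(.-)\n") do
    lines [#lines + 1] = (line:gsub (find, repl))
  end
  return table.concat (lines, "\n")
end

local text = "a1\na2\na3"

-- unchecked: ^ anchors to the start of the whole selection
local whole = text:gsub ("^a", "b")                          -- "b1\na2\na3"

-- checked: ^ anchors to the start of every line
local per_line = replace_line_by_line (text, "^a", "b")      -- "b1\nb2\nb3"

-- joining lines only works on the whole selection
local joined = text:gsub ("\n", ", ")                        -- "a1, a2, a3"
local not_joined = replace_line_by_line (text, "\n", ", ")   -- unchanged
```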

Escape Sequences

You can enter non-printable characters in the Find and Replace boxes using the special codes documented for script function FixupEscapeSequences, eg. \n for newline, \t for tab.

Call Function

If checked, the function (which you can edit by clicking on the "Script" button) is applied to each match, rather than a simple text replacement. In this case the "Replacement" box should contain the name of the required function. This is the alternative behaviour which is documented for string.gsub - supplying a function as the replacement rather than a string.

Supplying a function lets you make more powerful replacements, such as looking up a replacement word in a table, or doing things like replacing lower-case with upper-case, and so on.

The following libraries are available to your function: table, io, string, math, debug, rex, bits, compress, and utils.

You can also enter the name of a function that already exists in the installed libraries directly into the Replacement box, if this will do what you need without any extra work. For example: string.upper, string.lower, utils.hash, utils.base64encode.
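
The following sketch shows, in plain Lua, what happens when a function is supplied as the replacement. The function and table names (capitalise, expand, directions) are invented for the example; only standard Lua library functions are used, since utils is available only inside MUSHclient.

```lua
-- an existing library function: it receives each match as its argument
local shouted = ("hello world"):gsub ("%a+", string.upper)       -- "HELLO WORLD"

-- if the pattern has captures, the function receives the captures instead
local function capitalise (first, rest)
  return first:upper () .. rest
end
local titled = ("hello world"):gsub ("(%a)(%a*)", capitalise)    -- "Hello World"

-- looking up a replacement word in a table; returning nil (or false)
-- leaves the original match in place
local directions = { n = "north", s = "south" }
local function expand (word)
  return directions [word]
end
local expanded = ("go n then s then up"):gsub ("%a+", expand)
```

Here expanded is "go north then south then up".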

Script

Click this button to enter your replacement function. This is a function written in the Lua language. If you click this, and enter some text, the "Call Function" checkbox is automatically checked.

An example of doing this would be:


Find:  [<>&]
Replace: f
Line by Line: no
Call Function: yes
Script:

replacements = { 
   ["<"] = "&lt;",
   [">"] = "&gt;",
   ["&"] = "&amp;",
   }

function f (str)
  return replacements [str] or str
  end

This would make an "HTML fixup" replacement, which replaces the special HTML characters <, > and & by the entities shown in the table above.
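
To check what the script above does outside the dialog, the same table and function can be run in plain Lua. The sample text is invented, and the names are declared local here only because this is a standalone sketch.

```lua
local replacements = {
  ["<"] = "&lt;",
  [">"] = "&gt;",
  ["&"] = "&amp;",
}

local function f (str)
  -- fall back to the matched text if there is no table entry
  return replacements [str] or str
end

local fixed, fixed_count = ("if a < b && b > c"):gsub ("[<>&]", f)

-- In standalone Lua, string.gsub also accepts the table itself as the
-- replacement, which does the same lookup without a function
local via_table = ("x < y"):gsub ("[<>&]", replacements)
```

Here fixed is "if a &lt; b &amp;&amp; b &gt; c" (4 replacements) and via_table is "x &lt; y".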

Another example of using a function:


Find: %f[%a]%u+%f[%A]
Replace: string.lower
Line by Line: no
Call Function: yes

This searches for words that are in all upper-case (by searching for %u, which is upper-case letters, surrounded by something that is not a letter). When found, the (inbuilt) function "string.lower" is called, which converts the word to lower-case.

To explain this a bit further, if we had simply searched for %u+ then we would also have matched on upper-case letters inside words which also had lower-case in them. I initially had the Find pattern as: "%A%u+%A". This matches one non-letter, followed by one or more upper-case letters, followed by one non-letter. Whilst this successfully found all upper-case words, it failed on the text boundary (that is, it would not convert an all upper-case word at the start or end of the selection). However, using the little-known Lua "frontier" pattern, we detect the transition from non-letters to letters with %f[%a], which is also the start of the pattern, and finish off with %f[%A], which is the transition from letters to non-letters.
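
The three patterns discussed above can be compared directly in plain Lua. The sample text is made up so that it has an upper-case word at each end and a mixed-case word in the middle.

```lua
local text = "HELLO there McDonald, SHOUT now END"

-- %u+ alone also lowers the capitals inside "McDonald"
local naive = text:gsub ("%u+", string.lower)

-- %A%u+%A needs a non-letter on both sides, so the words at the very
-- start and end of the text are missed
local padded = text:gsub ("%A%u+%A", string.lower)

-- the frontier pattern handles both the boundaries and mixed-case words
local frontier = text:gsub ("%f[%a]%u+%f[%A]", string.lower)
```

The results are "hello there mcdonald, shout now end", "HELLO there McDonald, shout now END" and "hello there McDonald, shout now end" respectively.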

OK

Click OK to perform the find-and-replace on the selected text. If no text is selected the operation is performed on the entire notepad window.

If you are not happy with the results, immediately perform an Undo (Edit Menu > Undo) or press Ctrl+Z.

When you click OK the contents of the dialog box are remembered for this session of MUSHclient, over all instances of Notepad windows. Thus, if the results are not quite right, you can Undo them, re-open the dialog box, make amendments, and try again.

X characters selected. (N line breaks)

This field, next to the OK button, shows how much text is selected. If there is no selection, or all text is selected, it will read "All text selected." instead. The count of line breaks shows how many newline characters have been found in the selection. Thus, a selection of one line, and part of the next line, will show "1 line break" even though 2 lines are visibly selected.
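
How the dialog counts line breaks internally is not stated here, but the same figure can be reproduced in plain Lua with the second return value of string.gsub. This is only a sketch, with an invented selection:

```lua
-- one full line plus part of the next: two visible lines, one newline
local selection = "first line\nsecond"

-- replacing each newline with itself leaves the text alone and
-- returns the number of newlines as the second result
local _, line_breaks = selection:gsub ("\n", "\n")
```

Here line_breaks is 1, matching the "1 line break" case described above.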

See Also ...

Topic

Notepad