Multi-line match with tricky symbols

Posted by Larkin   (278 posts)
Date Sun 10 Jul 2005 01:58 PM (UTC)

Amended on Sun 10 Jul 2005 02:00 PM (UTC) by Larkin

I'm attempting to put together a good channel capture script for IRE games. I started with the very helpful thread where Nick gave triggers and VBScript for parsing the channels. I converted the VBScript to a couple of lines of Lua, and it works great for converting the newlines to spaces.

My problem isn't so much with Lua, I guess. I need a regular expression that can pick up the right strings to capture. I'll provide a few examples to illustrate my problem.

3519h, 4158m, 16495e, 17800w ex-
(Hashan): Person says, "Hm.. Weird, isn't mending salve supposed to mend both 
legs at once?"
3519h, 4158m, 16495e, 17800w ex-
(The Hashanite Legion): Person says, "Person tells you, "Go away now!""
3519h, 4158m, 16495e, 17800w ex-

The first matches with Nick's pattern. The second won't match using Nick's pattern because of the extra set of quotes. People quote other people all the time, so this is a common problem.

What I'm looking for is a pattern that can match everything until the last quote and "end-of-buffer" character, I think. My matches either pick up nothing or far too much.

    match="\((.+?)\)\: (\w+) says?\, &quot;([^&quot;]+)&quot;\Z"
    sequence="100" />

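function OnChat(name, line, wildcards)
  -- Join wrapped lines: newlines become spaces, then double spaces collapse to one.
  local text = string.gsub(string.gsub(wildcards[3], "\n", " "), "  ", " ")
  local msg = wildcards[1] .. " <" .. wildcards[2] .. "> - " .. text .. "\r\n"
  AppendToNotepad("Channels", msg)
end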

Here, wildcard 1 is the channel name, wildcard 2 is the name of the person talking, and wildcard 3 is what they said. I tried changing the [^&quot;]+ to something like (?:.|\n)+, but it doesn't do what I want.
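
To make the failure concrete, here is a minimal sketch that runs the pattern above (with the &quot; entities decoded) against the two sample lines. It uses Python's re module as a stand-in for the client's regexp engine, which is an assumption: the two engines agree on these constructs, but Python's \Z matches only at the very end of the string. The helper names parse and format_chat are hypothetical; format_chat mirrors what OnChat does.

```python
import re

# The trigger pattern with XML entities decoded. [^"]+ forbids any quote
# inside the captured message.
STRICT = re.compile(r'\((.+?)\): (\w+) says?, "([^"]+)"\Z')

plain = ('(Hashan): Person says, "Hm.. Weird, isn\'t mending salve '
         'supposed to mend both \nlegs at once?"')
nested = '(The Hashanite Legion): Person says, "Person tells you, "Go away now!""'


def parse(pattern, text):
    """Return the three wildcards as a tuple, or None when nothing matches."""
    m = pattern.search(text)
    return m.groups() if m else None


def format_chat(groups):
    """Mirror OnChat: join wrapped lines, then build 'channel <name> - text'."""
    channel, speaker, message = groups
    message = message.replace("\n", " ").replace("  ", " ")
    return f"{channel} <{speaker}> - {message}"


print(format_chat(parse(STRICT, plain)))
print(parse(STRICT, nested))  # None: the inner quotes stop [^"]+ too early
```

The negated class happily crosses the newline in the first sample, so the only thing that breaks the second sample is the quoted text inside the message.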
Posted by Flannel   USA  (1,230 posts)
Date Reply #1 on Sun 10 Jul 2005 07:53 PM (UTC)

Amended on Sun 10 Jul 2005 07:55 PM (UTC) by Flannel

Using ([\d\D]+) works. You just need something that will match 'anything else' to take care of things like newlines (we can't use . since we'd need /s as part of our regexp, which we can't do). Likewise, any negative match will work (well, except it won't match whatever you included in your negative match). But [\d\D] does indeed work (and is sanctioned by Mr. Schwartz, unlike [\w\W], although I don't think he'd have any complaints).

Hmm, perhaps Nick could include the /s modifier with the multiline triggers? Or at least as a further option, since I'm sure there's a good reason he hasn't done so on all of them already.
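
As a rough illustration of the difference, the sketch below compares a plain . capture, the [\d\D] class, and the same pattern with "dot matches newline" switched on. It is written for Python's re module, where re.DOTALL plays the role of /s; the sample text is made up, and whether the client's engine behaves identically is an assumption.

```python
import re

# A quoted message that both contains quotes and wraps onto a second line.
text = '(Hashan): Person says, "Person tells you, "Go away\nnow!""'

DOT = re.compile(r'\((.+?)\): (\w+) says?, "(.+)"\Z')        # . stops at newlines
ANY = re.compile(r'\((.+?)\): (\w+) says?, "([\d\D]+)"\Z')   # digit or non-digit: everything
DOTALL = re.compile(r'\((.+?)\): (\w+) says?, "(.+)"\Z', re.DOTALL)  # the /s equivalent

print(DOT.search(text))              # None: . cannot cross the line break
print(ANY.search(text).group(3))     # the whole message, newline included
print(DOTALL.search(text).group(3))  # same capture as [\d\D]+
```

Because [\d\D] is a class rather than a modifier, it only changes the one capture where it is used; every other . in the pattern still stops at a newline.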


Messiah of Rose
Eternity's Trials.

Clones are people two.
Posted by tobiassjosten   Sweden  (79 posts)
Date Reply #2 on Mon 11 Jul 2005 10:56 AM (UTC)

Amended on Mon 11 Jul 2005 10:57 AM (UTC) by tobiassjosten

I use the prompt to determine when a paragraph of text is complete; this works for all the IRE MUDs. Here's what I use for capturing and coloring tells/channels.

    <alias match="^_prompt" enabled="y" regexp="y" send_to="12" keep_evaluating="y" omit_from_output="y" omit_from_log="y">

    <trigger enabled="y" keep_evaluating="y" match="^(\(|\&lt;\&lt;)(.+)(\)|\&gt;\&gt;)\: (\w+) says?\, \&quot;(.+)" regexp="y" omit_from_output="y" script="ChatCap" match_text_colour="y" text_colour="11" sequence="40"></trigger>
    <trigger enabled="y" keep_evaluating="y" match="^(\w+) tells? (.+)\, \&quot;(.+)" regexp="y" omit_from_output="y" script="ChatCap" match_text_colour="y" text_colour="11" sequence="40"></trigger>
    <trigger name="chatcap" enabled="n" keep_evaluating="y" match=".+" regexp="y" script="ChatCap" custom_colour="17" match_text_colour="y" text_colour="11" other_text_colour="silver" sequence="30"></trigger>

capline = false -- text captured so far; false while nothing is being captured

function ChatCap(name, line, wildcards)
    -- First line of a message: the catch-all "chatcap" trigger is not enabled yet.
    if not GetTriggerInfo("chatcap", 8) and wildcards[2] then
        local colour = "seashell"
        if wildcards[2] == "Congregation" then colour = "lime" end
        if wildcards[2] == "Guild1" or wildcards[2] == "Guild2" then colour = "magenta" end
        if wildcards[2] == "Novice1" or wildcards[2] == "Novice2" then colour = "violet" end
        if wildcards[2] == "City1" or wildcards[2] == "City2" then colour = "lavender" end
        if wildcards[2] == "Market" then colour = "goldenrod" end
        if wildcards[5] then -- Is it a channel?
            ColourNote(colour, "", wildcards[1] .. wildcards[2] .. wildcards[3] .. ": ", "silver", "", wildcards[4] .. " say, \"" .. wildcards[5])
        else -- Or a tell?
            ColourNote("yellow", "", wildcards[1], "silver", "", " tell ", "yellow", "", wildcards[2], "silver", "", ", \"" .. wildcards[3])
        end
    end
    -- Append this line to the buffer and keep capturing the lines that follow.
    if not capline then capline = line
    else capline = capline .. " " .. line end
    EnableTrigger ("chatcap", true)
end

-- Stop capturing and flush the buffer (presumably called from the _prompt alias).
function ChanCapStop(name, line, wildcards)
    EnableTrigger ("chatcap", false)
    if capline then AppendToNotepad("ChatCap", "\r\n" .. capline .. "\r\n") end
    capline = false
end
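
The same idea can be tried outside the client. The sketch below is a simplified Python simulation of prompt-delimited capture, not the triggers themselves: a matching first line starts a buffer, every following line is appended, and a prompt line flushes it. The PROMPT and START patterns and the capture function are assumptions, modelled on the sample prompt "3519h, 4158m, 16495e, 17800w ex-" from the first post.

```python
import re

# Assumed prompt shape, taken from the sample output earlier in the thread.
PROMPT = re.compile(r'^\d+h, \d+m, \d+e, \d+w \S*-$')
# First line of a channel message or a tell (simplified from the triggers).
START = re.compile(r'^(?:\(.+\)|<<.+>>): \w+ says?, "|^\w+ tells? .+, "')


def capture(lines):
    """Collect each chat message as one string, using prompts as terminators."""
    messages = []
    current = None  # plays the role of capline
    for line in lines:
        if PROMPT.match(line):
            if current is not None:      # prompt reached: flush the buffer
                messages.append(current)
                current = None
        elif current is not None:        # continuation line of a wrapped message
            current += " " + line.strip()
        elif START.match(line):          # a new message begins
            current = line.strip()
    return messages


lines = [
    '3519h, 4158m, 16495e, 17800w ex-',
    '(Hashan): Person says, "Hm.. Weird, isn\'t mending salve supposed to mend both ',
    'legs at once?"',
    '3519h, 4158m, 16495e, 17800w ex-',
    '(The Hashanite Legion): Person says, "Person tells you, "Go away now!""',
    '3519h, 4158m, 16495e, 17800w ex-',
]
for message in capture(lines):
    print(message)
```

Since the end of a message is decided by the prompt rather than by a closing quote, quotes inside the message never matter with this approach. The trade-off is that anything else printed before the next prompt gets appended too.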

Simplicity is Divine | http://nogfx.org/
Posted by Nick Gammon   Australia  (21,322 posts)   Forum Administrator
Date Reply #3 on Sat 29 Oct 2005 02:52 AM (UTC)
How about a simpler pattern?

\((.+?)\)\: (\w+) says?\, "(.+)"\Z

Your pattern is matching:

(someone) says, "(anything except a quote)"(end of subject)

Your problem is that the things people say may include quotes.

I tried a smarter version, which checks for quotes inside the quotes, but that really relies on them doing "balanced" quotes, and would be thrown out if someone said something like:

Nick"s dog

My pattern is:

(someone) says, "(anything at all)"(end of subject)

The fact that you are still looking for the trailing quote and \Z should stop extraneous matches.
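
A quick way to check that claim is to run the pattern against a few lines. This is a sketch using Python's re module rather than the client, so treat the match as indicative only: in Python \Z matches only at the very end of the string, whereas PCRE's \Z also matches just before a final newline. The helper said and the sample lines other than the Hashanite Legion one are made up for illustration.

```python
import re

# Greedy (.+) runs to the end of the subject, then backs off to the last
# quote, which must sit right at the end because of \Z.
GREEDY = re.compile(r'\((.+?)\): (\w+) says?, "(.+)"\Z')


def said(text):
    """Return the quoted message, or None when the line does not match."""
    m = GREEDY.search(text)
    return m.group(3) if m else None


print(said('(The Hashanite Legion): Person says, "Person tells you, "Go away now!""'))
print(said('(Hashan): Nick says, "Nick"s dog"'))         # unbalanced quotes still match
print(said('(Hashan): Nick says, "hello" and waves'))    # None: no quote at the very end
```

Nested and unbalanced quotes both end up inside the third wildcard, and a line that does not finish with a quote is rejected outright.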

- Nick Gammon

www.gammon.com.au, www.mushclient.com
