RegEx multi line syntax

obfusc88
Pro Scripter
Posts: 85
Joined: Wed Mar 14, 2007 6:22 pm

RegEx multi line syntax

Post by obfusc88 » Fri May 11, 2012 7:44 am

I am trying to remove every line that begins with WriteLn("Line from a file with thousands of lines, but I cannot get it to work. I am sure the problem is with the (?m) modifier. I have tried (?-m), have changed $ to \n, and have both included and removed the ^. Can anyone see the problem?

Here is my script:

Code:

ReadFile>%TEMP_DIR%\TraceFile.ses,vTempFile
Let>vFind=(?m)^WriteLn\("Line.*$
Let>vSource=%vTempFile%
RegEx>%vFind%,%vSource%,0,vMatchArray,vNumMatches,1,,vTempFile2

The TraceFile.ses looks like this (lines truncated here):

Code:

WriteLn("Line 319: vTemp = @SpecCommand(5, 1
vTemp = @SpecCommand(5, 1, "") 
WriteLn("Line 320: vTemp = @SpecCommand(4, 1
vTemp = @SpecCommand(4, 1, "MetaPhoneSearch=
WriteLn("Line 321: vTemp = @SpecCommand(2, 1
vTemp = @SpecCommand(2, 1, "") 
WriteLn("Line 322: }                      ##
}
WriteLn("Line 323: }                      ##
}
WriteLn("Line 324: }                      ##
}

// This group includes program lines 310 thr


** PROGRAMMING SECTION: [txtCodingName] [On 

WriteLn("Line 325: ThrowFocus(txtInputName) 
ThrowFocus(txtInputName)

// This group includes program lines 325 thr


** PROGRAMMING SECTION: [cmdSoundexExplanati

var n as Int
WriteLn("Line 326: n = @AsynchShell('http://
n = @AsynchShell("http://blog.eogn.com/eastm

The result, vTempFile2, should look like this:

Code:

vTemp = @SpecCommand(5, 1, "") 
vTemp = @SpecCommand(4, 1, "MetaPhoneSearch=
vTemp = @SpecCommand(2, 1, "") 
}
}
}

// This group includes program lines 310 thr


** PROGRAMMING SECTION: [txtCodingName] [On 

ThrowFocus(txtInputName)

// This group includes program lines 325 thr


** PROGRAMMING SECTION: [cmdSoundexExplanati

var n as Int
n = @AsynchShell("http://blog.eogn.com/eastm

I guess I could read the source file one line at a time and write each line out to another file if it does not begin with "Write.....". But I would prefer to do this with a single RegEx, which I suspect will be much faster.

Marcus Tettmar
Site Admin
Posts: 7378
Joined: Thu Sep 19, 2002 3:00 pm
Location: Dorset, UK

Post by Marcus Tettmar » Fri May 11, 2012 11:06 am

This is my solution:

Code:

ReadFile>%USERDOCUMENTS_DIR%\test1.txt,data
Let>pattern=(?<=^|\n)WriteLn\(\"Line.*?(?=\n|$)
RegEx>pattern,data,0,matches_1,nm,1,\n,new_data
MessageModal>new_data

Here's my input test file:

Code:

1
2
WriteLn("Line1212
WriteLn("Line333
WriteLn("Line22
WriteLn("Line12
WriteLn("Line22
3
WriteLn("Line
4

I end up with:

Code:

1
2
3
4

You would just need to delete the file and then write new_data back to it.
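
For readers who want to try the line-removal idea outside Macro Scheduler, here is a minimal Python sketch. It is not the Macro Scheduler code above and it uses a different technique: instead of lookarounds, it anchors with a multiline `^` and lets the match consume the line terminator, so deleting the match leaves no empty line behind. The function name `strip_writeln_lines` and the sample text are made up for illustration.

```python
import re

# Match a whole line that starts with WriteLn("Line, including its
# terminator (LF, CRLF, or end of text), so removing the match does
# not leave a blank line behind.
WRITELN_LINE = re.compile(r'^WriteLn\("Line[^\r\n]*(?:\r?\n|\Z)', re.MULTILINE)


def strip_writeln_lines(text):
    """Return text with every line beginning WriteLn("Line removed."""
    return WRITELN_LINE.sub("", text)


# Hypothetical sample: note the genuine blank line after "3", which must survive.
sample = '1\n2\nWriteLn("Line1212\nWriteLn("Line333\n3\n\nWriteLn("Line\n4\n'
cleaned = strip_writeln_lines(sample)
print(repr(cleaned))
```

Blank lines that were already in the input are left alone, because the pattern only matches lines that start with the WriteLn("Line prefix.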
Marcus Tettmar
http://mjtnet.com/blog/ | http://twitter.com/marcustettmar

Post by obfusc88 » Fri May 11, 2012 5:45 pm

Thanks Marcus.

I get the same results as you with your test file, but with my real file the text is removed from the line while the \n is left behind. Actually, my results showed two blank lines. So I removed the \n in your ReplaceText parameter and used nothing, like my original, and now I only have one blank line, but I am still unable to remove that blank line completely.

So, the difference must be in the source data. My sample file was truncated, but the following is typical of the format of all the WriteLn lines:

Code:

WriteLn("Line 319: vTemp = @SpecCommand(5, 1, '')                       ## //LastName //On Form Entry")

I cannot see anything unusual about those extra characters.
The text ## //LastName //On Form Entry") (the wording changes from line to line) is at the end of every line, preceded by a group of spaces.

In your RegEx, I also note that you treated " as a special metacharacter, using a backslash. Why did you treat " as special? Note the " as the second last char on these lines.

Re the truncated lines, I thought the .* at the end would handle all the trailing chars, but there must be something special that I can't see.

My original file also includes some blank lines which should remain there. Your test data does not include blank lines.

Here is another sample of real code using your test format...

Code:

1
WriteLn("Line 315: txtInputName=@Mid(vTemp,vPos+10,@Len(LastName))                      ## //LastName //On Form Entry")
2
WriteLn("Line 316: txtCodingName = fnPrepareName(txtInputName)                      ## //LastName //On Form Entry")
3

WriteLn("Line 317: vScode = fnMakeSCode(txtCodingName)                      ## //LastName //On Form Entry")
4
WriteLn("Line 318: vMCode = fnMakeMCode(txtCodingName)                      ## //LastName //On Form Entry")

5 
WriteLn("Line 319: vTemp = @SpecCommand(5, 1, '')                       ## //LastName //On Form Entry")
WriteLn("Line 320: vTemp2 = @SpecCommand(5, 3, '')                       ## //LastName //On Form Entry")
6
-------------
This should result in:
1
2
3

4

5
6
------------------------
But I keep getting
1
2
3
4
5

6
------------------------
I am now reviewing your lookahead/lookbehind (lookaround) constructs. I am not familiar with that part of RegEx yet, so I am researching it now.

With this new info, can you now duplicate the problem and provide a new fix? I see how you included the \n from the previous line, I think that was the real clue to capturing this line. (Am using MS 12.1.10) Thanks again...

Post by Marcus Tettmar » Mon May 14, 2012 10:45 am

If I copy and paste your sample file into a new text file here and then run my code I get the correct results that you expect. So I think there must be something else in your file which is not present when you just paste the text into the forum. Can you therefore email your text file to support or post a link to it?

Fixed - RegEx multi line syntax

Post by obfusc88 » Tue May 15, 2012 6:33 pm

No need to submit the file. Your solution worked perfectly. The final problem was me, leaving out an "insignificant" line of code (dummy, ME).

When I tried your code I did not include the MessageModal line. I was just looking at the results in the WatchList. I did not see spaces after the numbers 3 and 4, so I thought it was not working. When I actually printed out the content with the MessageModal command I saw totally different, and correct, results.
Lesson learned: Do not rely on the WatchList view of the results. Non-printing characters will make a big difference.
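
One way to make non-printing characters visible when checking a result is to print an escaped representation of each line. The following is a small Python sketch of that idea, not a Macro Scheduler feature; the helper name `show_hidden` and the sample string are hypothetical.

```python
def show_hidden(text):
    """Return one repr() string per line, with line terminators kept visible."""
    return [repr(line) for line in text.splitlines(keepends=True)]


# Hypothetical result: a trailing space after "3" and CRLF line endings,
# neither of which is visible in a plain listing.
result = "3 \r\n\r\n4\r\n"
for shown in show_hidden(result):
    print(shown)
```

Each printed line shows trailing spaces, `\r`, and `\n` explicitly, so an apparently identical line can be told apart from one with extra whitespace.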

Thanks again for the solution.

Final question, still not answered: why did you escape the double quote sign?

Post by Marcus Tettmar » Wed May 16, 2012 8:46 am

Quote:

Final question, still not answered: why did you escape the double quote sign?

I guess I thought I needed to. Seems I was wrong.

Post by obfusc88 » Thu May 17, 2012 3:03 am

OK, thanks. You may have made a minor error there, but you provided another teaching moment. Apparently, if you escape a character incorrectly, it doesn't affect the result, so... if in doubt, escape it anyway ;)
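
A caveat worth checking before relying on that rule: in Perl-style regex engines, a backslash in front of punctuation such as " is harmless, but a backslash in front of a letter or digit usually creates a different token (for example \d, \b, \n). The Python sketch below illustrates the difference using Python's `re` module as a stand-in; Macro Scheduler's own engine is not exercised here, and the variable names are made up.

```python
import re

line = 'WriteLn("Line 319: vTemp = @SpecCommand(5, 1'

# A double quote is not a regex metacharacter, so \" and " match alike.
quote_escaped = re.match(r'WriteLn\(\"Line', line) is not None
quote_plain = re.match(r'WriteLn\("Line', line) is not None

# A backslash before a letter is different: \d is the digit class, not "d".
letter_plain = re.findall(r'd', 'add 42')
letter_escaped = re.findall(r'\d', 'add 42')

print(quote_escaped, quote_plain)    # True True
print(letter_plain, letter_escaped)  # ['d', 'd'] ['4', '2']
```

So "escape it anyway" is a safe habit for punctuation, but not for letters and digits.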
