Extracting with ChromeDriver and RegEx



Post by hagchr » Tue Dec 29, 2020 11:40 am

I raised the following with the help desk some time ago but did not get a reply, so I am posting it here as it may be helpful for some. I am not sure whether it is an issue or normal/expected behaviour.

When extracting text using ChromeDriver you typically get the text with Unix (LF) line endings instead of Windows (CRLF) line endings. This means you have to be careful with RegEx, as the end of a line is treated differently.
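
A quick way to check which line endings a given extraction actually contains is to count both forms. This is only a rough sketch: it assumes strResult already holds text returned by ChromeGetElementData (as in the script further down), and mCRLF, nCRLF, mLF and nLF are illustrative variable names, not anything Macro Scheduler defines.

```macroscheduler
// Count CRLF pairs and all LF characters in the extracted text
RegEx>\r\n,strResult,0,mCRLF,nCRLF,0
RegEx>\n,strResult,0,mLF,nLF,0
// If the LF count is higher than the CRLF count, some lines end in a bare LF
MDL>CRLF pairs: %nCRLF% - LF characters: %nLF%
```

If the first count is 0 and the second is not, the text uses LF only.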

You would expect the simple pattern (?-s)\A.+ to match just the first line, since (?-s) means that . (period) should not match an end of line. However, in Macro Scheduler that pattern matches the whole text, i.e. it does not recognize the LF line endings.

To get around it, I added an extra replacement of \n with \r\n, after which the same RegEx gave the expected result. See the example below:

Code:

// Replace with the path to your local chromedriver.exe
Let>CHROMEDRIVER_EXE=C:\Users\Christer\Desktop\ChromeFile\chromedriver.exe

ChromeStart>session_id

Let>URL1=https://www.mjtnet.com
ChromeNavigate>session_id,url,URL1
Wait>2.0

Let>tmp=//div[@class='row']
ChromeFindElements>session_id,xpath,tmp,strElementID
Wait>0.1
ChromeGetElementData>session_id,strElementID_1,text,strResult
Wait>0.1

// Run regex to extract first line
RegEx>(?-s)\A.+,strResult,0,m1,nm1,0
MDL>%m1_1%

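// Replace \n with \r\n (the lookbehind skips any \n that already follows \r,
// so text that is already CRLF is not turned into \r\r\n)
RegEx>(?<!\r)\n,strResult,0,m,nm,1,\r\n,strResult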

// Re-run same regex from before
RegEx>(?-s)\A.+,strResult,0,m1,nm1,0
MDL>%m1_1%

// Close the browser session when done
ChromeQuit>session_id
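
A possible alternative, offered as an untested sketch rather than a confirmed fix: instead of converting the text first, the pattern can exclude both line-ending characters with a negated character class. It then no longer depends on how . is treated at a line break, and should behave the same whether strResult contains LF or CRLF. The names m2 and nm2 are just illustrative.

```macroscheduler
// Match from the start of the text up to the first CR or LF
RegEx>\A[^\r\n]+,strResult,0,m2,nm2,0
MDL>%m2_1%
```

The trade-off is that every pattern has to be written this way, whereas the \n to \r\n replacement is done once and lets the usual . and (?-s) behaviour work afterwards.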
