My favourite RegEx

Marcus Tettmar
Site Admin
Posts: 7380
Joined: Thu Sep 19, 2002 3:00 pm
Location: Dorset, UK

Post by Marcus Tettmar » Fri Oct 19, 2012 5:04 pm

It suddenly dawned on me that there's one regular expression I use almost every day:

(?<=TOKEN1).*?(?=TOKEN2)

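This pulls out the text that sits between two other pieces of text, TOKEN1 and TOKEN2, without including the tokens themselves. Because .*? is lazy, the match stops at the first TOKEN2 that follows TOKEN1. Very useful for parsing data from websites.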
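As a rough illustration of the pattern above, here is a minimal sketch in Python rather than Macro Scheduler script. The helper name `between` and the sample HTML are made up for the example, and it assumes the tokens are literal strings (Python's lookbehind needs a fixed-width pattern, which a literal always is).

```python
import re

def between(text, token1, token2):
    """Return each shortest run of text found between token1 and token2.

    `between` is a hypothetical helper, not part of any library.
    The tokens are escaped so they are treated as literal text.
    """
    pattern = "(?<=" + re.escape(token1) + ").*?(?=" + re.escape(token2) + ")"
    # DOTALL lets the value span line breaks, which is common in web pages.
    return re.findall(pattern, text, flags=re.DOTALL)

# Made-up sample data standing in for a scraped web page.
html = '<td class="price">19.99</td><td class="price">4.50</td>'
prices = between(html, '<td class="price">', "</td>")
print(prices)  # ['19.99', '4.50']
```

Escaping the tokens matters when they contain characters such as `.` or `[`, which would otherwise be read as regex syntax.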

I should blog this with a website example.

What's your favourite regex?
Marcus Tettmar
http://mjtnet.com/blog/ | http://twitter.com/marcustettmar

Dorian (MJT support)
Automation Wizard
Posts: 1348
Joined: Sun Nov 03, 2002 3:19 am

Post by Dorian (MJT support) » Fri Nov 02, 2012 12:44 am

I'd never used regex until I saw this post, and it prompted me to do a little research.

It's very powerful. I used it to strip out all the URLs from a text file. Here it is, in case it helps anyone.

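// Sample text containing two URLs
Let>text=rfuhroiurnfifnroi http://www.fish.com fuh3ifuh34ifurf http://www.chips.com

// The [Hyperlink] pattern seems to look for http://, not www., so add http:// to bare www addresses.
// Strip any existing http:// first so that it does not end up doubled.
StringReplace>text,http://www,www,text
StringReplace>text,www,http://www,text

// Find URLs using the [Hyperlink] EasyPattern
RegEx>[Hyperlink],text,1,matches,num,0

// Write each match to a file, skipping the loop if nothing was found
If>num>0
Let>k=0
Repeat>k
Let>k=k+1
WriteLn>%USERDOCUMENTS_DIR%\url output.txt,result,matches_%k%
Until>k,num
Endif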

jpalic
Newbie
Posts: 17
Joined: Fri Aug 01, 2008 6:32 pm

Post by jpalic » Wed Nov 07, 2012 7:06 pm

Marcus,

I use something similar to yours although I never quite understood the meaning of the zero-width positive lookbehind (?<=regex) or the zero-width positive lookahead (?=regex) modifiers.

What issues do you avoid by using those modifiers in this regex?

Jim

PS - There is a good regex reference here:

http://www.regular-expressions.info/refadv.html

Post by Marcus Tettmar » Wed Nov 07, 2012 7:15 pm

These check that the tokens are present on either side but exclude them from the match itself, so that we get only the value between them.
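To make the difference concrete, here is a small Python sketch (the sample text and variable names are invented for illustration). It runs the same search with the tokens as ordinary pattern text, with lookarounds, and with a capture group, which is a common alternative when the regex engine or command lets you read groups back.

```python
import re

# Made-up sample text; "Name: " plays the part of TOKEN1 and ";" of TOKEN2.
text = "Name: Alice; Name: Bob;"

# Tokens written as ordinary pattern text: they become part of each match.
with_tokens = re.findall(r"Name: .*?;", text)

# Tokens inside lookbehind/lookahead: they must be present but are not consumed.
values_only = re.findall(r"(?<=Name: ).*?(?=;)", text)

# Alternative: keep the tokens in the match but capture only the value.
captured = re.findall(r"Name: (.*?);", text)

print(with_tokens)  # ['Name: Alice;', 'Name: Bob;']
print(values_only)  # ['Alice', 'Bob']
print(captured)     # ['Alice', 'Bob']
```

The lookaround form is handy when only the whole match is available to you, since no extra step is needed to trim the tokens off afterwards.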