October 29, 2012

My Most Used RegEx

Filed under: Automation,Scripting — Marcus Tettmar @ 10:33 am

It occurred to me the other day while working on a script for a customer that I use this regular expression frequently:

(?<=TOKEN1).*?(?=TOKEN2)

It is very useful when parsing information out of web pages, or when finding elements in web pages.

What it does is pull out all the text between TOKEN1 and TOKEN2. The tokens can be any literal text: other words, HTML tags, or whatever else marks the start and end of what you want.
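For anyone who wants to try the pattern outside Macro Scheduler, here is a minimal sketch in Python 3. The helper name `between` and the sample string are made up for illustration. Python's `re` module only allows fixed-width lookbehind, which is fine when TOKEN1 is literal text.

```python
import re

def between(text, token1, token2):
    """Return each shortest run of text found between token1 and token2."""
    # re.escape treats the tokens as literal text, so characters such as . or - are safe
    pattern = "(?<=" + re.escape(token1) + ").*?(?=" + re.escape(token2) + ")"
    return re.findall(pattern, text)

# Hypothetical sample data, not taken from a real page
sample = "<td><i>A1234 - widget</i></td><td><i>B5678 - gadget</i></td>"
print(between(sample, "<i>", " -"))
```

Because `.*?` is lazy, each match stops at the first TOKEN2 after TOKEN1 rather than running on to the last one in the text.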

As an example, recently I wrote a script which loops through all rows in an HTML table, and pulls out an order number, then looks this order number up in an Excel sheet. The order number appeared in a table cell along with other information. It was the first item inside an <i> (italics) tag and was followed by a space and then a hyphen. So I used this to pull it out of the row:

RegEx>(?<=<i>).*?(?= -),this_row,0,matches,nm,0

See how it looks for everything between the ‘<i>’ and ‘ -’ (space then hyphen).

The next thing my code needed to do was find the ID of the single input field in the same row. This input was used to enter the order quantity, obtained from the Excel sheet. The ID is not something we know up front but it’s the only input field in the row. So I did this:

RegEx>(?<=id=").*?(?="),theInput,0,matches,nm,0

In other words, pull the text between id=" and the next ", which gives us the input’s ID value. We can then use that later to identify and fill the input field.
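Both extractions can be tried together in a short Python 3 sketch. The row markup below is hypothetical (the real page is not shown here), and the variable names are only for illustration.

```python
import re

# Hypothetical table row: an order number in italics, plus a single input field
this_row = '<tr><td><i>A1234 - widget</i></td><td><input type="text" id="qty_17"></td></tr>'

# Same two patterns as in the script lines above
order_match = re.search(r"(?<=<i>).*?(?= -)", this_row)
id_match = re.search(r'(?<=id=").*?(?=")', this_row)

# re.search returns None when nothing matches, so guard before reading the match
order_number = order_match.group(0) if order_match else None
input_id = id_match.group(0) if id_match else None
print(order_number, input_id)
```

Note that the second pattern matches any attribute whose name ends in id, so it relies on the row containing only the one input field, as it did in this case.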

Regular Expressions are daunting at first. But eventually you find a small number of patterns help in many situations. This is one that I often find useful.

What’s your oft-used regular expression?

October 3, 2017

Regular Expressions (RegEx) In Window Commands

Filed under: Scripting — Marcus Tettmar @ 8:04 am

Did you know you can use Regular Expressions inside any of Macro Scheduler’s window functions (e.g. SetFocus, WaitWindowOpen, WaitWindowClosed) so that you can match a window title using RegEx?

As an example, someone recently asked us how they would match a window from an old piece of software where the window title was always the letter “b” followed by any 8 digits, e.g. b23425461. Macro Scheduler uses the PCRE syntax, so to match a window title like this we could use this expression:

^b\d{8}

If you’re new to RegEx this means: match from the beginning of the string (^), then match a “b” character followed by any digit (\d) 8 times. {8} means match the preceding token 8 times. So:

Let>WIN_USEREGEX=1
SetFocus>^b\d{8}
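If you want to check an expression like this before putting it in a script, a quick test in Python 3 is one option. This is only a sketch: the titles are made up, and it exercises the pattern in Python's regex engine rather than in Macro Scheduler itself.

```python
import re

title_pattern = re.compile(r"^b\d{8}")

# Made-up window titles to try the pattern against
titles = ["b23425461", "b2342546", "xb23425461", "b23425461 - Editor"]

# search() with a leading ^ only matches at the start of the title
matching = [title for title in titles if title_pattern.search(title)]
print(matching)
```

The pattern is not anchored at the end, so a title that starts with b and 8 digits and then carries on also matches. Add $ to the end of the expression if the title must be exactly those 9 characters.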

I’ve posted a little about Regular Expressions before. This post links to some useful resources if you are new to RegEx.

Regular Expressions for Dummies
My Most Used RegEx
Remove Tags From HTML with RegEx

August 5, 2016

Capture Screen Text using OCR

Filed under: Automation,General,Scripting — Marcus Tettmar @ 3:37 pm

Here’s a way to get screen text from any application – even from an image – using OCR and a free open source tool called Tesseract.

First, you need to download and install Tesseract. You can get it here.

Tesseract is a command line utility. The most basic syntax is:

tesseract.exe input_image_file output_text_file

So you could call it from a Macro Scheduler script something like this:

//Capture screen to bmp file - you could instead capture only a window or use FindObject to get coordinates of a specific object
GetScreenRes>X2,Y2
ScreenCapture>0,0,X2,Y2,%SCRIPT_DIR%\screen.bmp

//run tesseract on the screen grab and output to temporary file
Let>RP_WAIT=1
Let>RP_WINDOWMODE=0
Run>"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe" "%SCRIPT_DIR%\screen.bmp" "%SCRIPT_DIR%\tmp"

//read temporary file into memory and delete it
ReadFile>%SCRIPT_DIR%\tmp.txt,theText
DeleteFile>%SCRIPT_DIR%\tmp.txt

//Display the text in a message box 
MessageModal>theText

This example simply captures the entire screen. You probably wouldn’t normally want to do this. Instead you could capture a specific window:

//Capture just the Notepad Window
SetFocus>Untitled - Notepad
GetWindowPos>Untitled - Notepad,X1,Y1
GetWindowSize>Untitled - Notepad,w,h
ScreenCapture>X1,Y1,{%X1%+%w%},{%Y1%+%h%},%SCRIPT_DIR%\screen.bmp

Or even a specific object:

//capture just the editor portion of notepad ... 
SetFocus>Untitled - Notepad
GetWindowHandle>Untitled - Notepad,hWndParent
FindObject>hWndParent,Edit,,1,hWnd,X1,Y1,X2,Y2,result
ScreenCapture>X1,Y1,X2,Y2,%SCRIPT_DIR%\screen.bmp

Either way you then have a screen bitmap you can pass into Tesseract.
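The Tesseract step itself is not specific to Macro Scheduler. As a rough sketch, the same run, read and delete sequence could look like this in Python 3. The function names are hypothetical, and the paths you pass in are up to you.

```python
import subprocess
from pathlib import Path

def tesseract_command(exe, image_path, output_base):
    """Build the argument list for: tesseract input_image_file output_text_file."""
    return [str(exe), str(image_path), str(output_base)]

def ocr_image(exe, image_path, output_base):
    """Run Tesseract, read the text file it writes, then delete that file."""
    # check=True raises CalledProcessError if Tesseract exits with an error
    subprocess.run(tesseract_command(exe, image_path, output_base), check=True)
    # As in the script above, Tesseract adds .txt to the output name itself
    text_file = Path(str(output_base) + ".txt")
    text = text_file.read_text(encoding="utf-8")
    text_file.unlink()
    return text
```

You would call it with something like ocr_image(r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe", "screen.bmp", "tmp"), assuming Tesseract is installed at that path and screen.bmp already exists.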

Once you’ve retrieved the text you would probably want to parse it, using e.g. RegEx. Here’s an article on a regular expression useful for parsing out data.

November 7, 2014

Macro Scheduler 14.2 – Python, JSON, XML and Auto Documenting Macro Recorder

Filed under: Announcements,Macro Recorder — Marcus Tettmar @ 1:34 pm

I am pleased to announce that we have today released Macro Scheduler 14.2

Amongst other things this new version includes the following great new features:

  • Self Documenting Macro Recorder with snapshots of Windows/Objects being Activated/Clicked on
  • The ability to run Python code within your macros!
  • A native JSON Parser
  • A native XML Parser using XPath

Here’s an overview of these main new features, but for a more complete list of improvements view the history list here.

Self Documenting Macro Recorder

Ever recorded a macro and then tried to edit it but couldn’t figure out which bit did what? Well, now when you record a macro the macro recorder will take snapshots of windows that are activated and objects that you click on. These will be inserted into the script as image comments, right above the line of code they refer to. So, now, it’s easier to see exactly what your macro is doing and which bits you want to edit/copy.

Run Python Code Inside your Macros

In 14.2 there’s a new function called PyExec. It allows you to run real Python code and get back the values of any Python variables you specify. You’ll need to install the Python 2.7 DLL to your Macro Scheduler folder (or compiled .EXE folder) – there’s a link in the help file to a zip file with all the files you need.

You can run any Python code and even use third party Python imports. This is pretty wild IMO – the possibilities are endless. Here’s a simple example:

/*
First ensure Python27.dll and imports are in your Macro Scheduler program folder.
Download and unzip this file:
https://www.mjtnet.com/software/python27.zip
*/

Let>url=http://ip.jsontest.com/

/*
python_code:

import urllib2
import json

# grab data from http://ip.jsontest.com/ - see www.jsontest.com
response = urllib2.urlopen('%url%')

# load the json
dict = json.loads(response.read())

# get the ip member
myip = dict["ip"]

# make a nice string representation of the dict
sdict = json.dumps(dict)

# Anything we print to IO is returned in the PYExec output var
print "All Done"
*/

//Load the Python code to a variable
LabelToVar>python_code,pcode

//Run the code and request the values of the sdict and myip variables ...
PYExec>pcode,output,sdict,myip

//Display the IP address
MessageModal>Your public IP is: %myip%

Parsing JSON

In the last few years JSON has become a very popular way to transmit data objects. Most web services now use it so it cannot be ignored. While you can parse JSON using string handling/regex, having an easy-to-use native parser is essential. Enter: JSONParse

In the above Python example we used Python’s own JSON handling to extract an IP address from some JSON retrieved from a web service. Here’s how we can do the exact same thing using native MacroScript code:

HTTPRequest>http://ip.jsontest.com/,,GET,,JSON
JSONParse>JSON,ip,myIP
MessageModal>Your public IP is: %myIP%

And here’s an example which gets data out of a more complicated structure:

//Requires Macro Scheduler 14.2

/*
MyJSON:
{ "uid" : "1234", 
  "clients" : ["client1","client2","client3"],
  "people" : [{"Name":"Marcus","Age":"21"},{"Name":"Dorian","Age":"18"}],
  "color" : "red",
  "size" : 14 }
*/

LabelToVar>MyJSON,sJSON

JSONParse>sJSON,uid,result
JSONParse>sJSON,clients,result
JSONParse>sJSON,clients[1],result
JSONParse>sJSON,people[1].Name,result
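For comparison, here is roughly how the same four lookups read in plain Python 3 with the standard json module. This is a sketch for orientation only: Python list indexes start at 0, so [1] picks the second item here, and it makes no claim about what JSONParse returns for the same path.

```python
import json

my_json = """
{ "uid" : "1234",
  "clients" : ["client1","client2","client3"],
  "people" : [{"Name":"Marcus","Age":"21"},{"Name":"Dorian","Age":"18"}],
  "color" : "red",
  "size" : 14 }
"""

data = json.loads(my_json)

uid = data["uid"]                                # a simple string member
clients = data["clients"]                        # the whole array, as a Python list
second_client = data["clients"][1]               # zero-based, so the second entry
second_person_name = data["people"][1]["Name"]   # a member of an object inside an array
print(uid, clients, second_client, second_person_name)
```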

Parsing XML

Of course XML is also still very popular. Up until now we’ve had to parse XML using substring handling/regex or use Microsoft’s XML object via VBScript – which is a little over-complicated IMO. We now have a simple-to-use native XML parser function – XMLParse. Here’s an example:

//Requires Macro Scheduler 14.2

LabelToVar>XML,sXML
XMLParse>sXML,/bookstore/book,val,numBooks
Let>k=0
Repeat>k
  Let>k=k+1
  XMLParse>sXML,/bookstore/book[%k%]/title/text(),val,len
  MessageModal>val
Until>k=numBooks

/*
XML:
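<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book>
    <title>Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book>
    <title>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book>
    <title>XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
  </book>
  <book>
    <title>Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>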
*/
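The same count-then-loop idea can be sketched in Python 3 with the standard xml.etree.ElementTree module. The cut-down XML below is an assumption: it only uses the bookstore, book and title elements that the XPath expressions in the script refer to.

```python
import xml.etree.ElementTree as ET

# Cut-down sample with just the elements the XPath expressions above rely on
xml_text = """<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book><title>Everyday Italian</title></book>
  <book><title>Harry Potter</title></book>
</bookstore>"""

# Parse from bytes so the encoding declaration is honoured
root = ET.fromstring(xml_text.encode("utf-8"))

# root is the <bookstore> element, so this finds the same nodes as /bookstore/book
books = root.findall("book")

# findtext returns the text of the first matching child, like title/text()
titles = [book.findtext("title") for book in books]
print(len(books), titles)
```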

How to Download/Upgrade

If you have a valid maintenance plan you can download the update from your account here. If your maintenance has lapsed you can also purchase upgrades and renew maintenance from within your account.

Trial versions can be downloaded here.