March 23, 2018

Screen Scraping with Macro Scheduler [Update]

Filed under: Automation — Marcus Tettmar @ 9:55 pm

What is Screen Scraping?

Screen Scraping is a term used to describe the process of a computer program or macro extracting data from the display output of another application. Rather than parsing data from the database or data files belonging to an application, Screen Scraping pulls the data from the screen itself, extracting data that was intended to be displayed to the end-user as opposed to data designed for output to another application or database.  

Screen Scraping is necessary when there is a need to access the information displayed by the application but there is no method provided to access it behind the scenes. The database or data files may not be accessible, or may be undocumented or proprietary and therefore cannot be parsed easily; the costs associated with interacting with the database may be too high; or the license agreement or warranty prohibits it. In the case of legacy systems that are no longer supported there may be no knowledge of the data structures, or the technology used is no longer compatible with current technology. In these cases we have to resort to extracting the data from the screen – from the windows of the application.

The term Screen Scraping probably originates from the era of computer terminals when you could connect the terminal output of a computer to an input port on another and therefore record the screen data.  

Screen Scraping Methods

There are a number of ways we can retrieve information from the screen using Macro Scheduler, depending on the type of application the data is in.

Screen Scraping Web Applications

Applications like Macro Scheduler’s WebRecorder can access the data and objects inside an Internet Explorer window and can therefore be used to extract the data. Technically speaking I would not call this screen scraping since WebRecorder is using an API provided by Internet Explorer, but the process of extracting information from web sites is commonly referred to as Screen Scraping. With WebRecorder we can use the ExtractTag wizard to create code that extracts the text from a particular element in the page. While WebRecorder is the easiest way to do it, it is also possible to automate IE and extract data from web pages by using VBScript. The following forum posts may help:

Automate Internet Explorer with OLE/ActiveX
Automate web forms with IE
HTTP GET and POST using VBscript

Screen Scraping Microsoft Office Applications

Microsoft Office Applications, like Internet Explorer, have a COM interface that allows scripts to manipulate them and access the data held within them.  Again, not really scraping data from the screen itself, as you are getting it directly from a programming interface. There are a number of examples in the forums and blog archives and also some sample scripts that come with Macro Scheduler which demonstrate how to automate Office applications and retrieve data from them.  

Working with Excel

Screen Scraping Regular Windows Applications

Most other applications don’t offer a scripting interface like MS Office or Internet Explorer.  This is where we really need to work directly with the screen.   There are a number of ways we can do this kind of Screen Scraping with Macro Scheduler.

Screen Scraping via Optical Character Recognition

Macro Scheduler 14.4 includes three functions which make it easy to OCR all or part of the screen:

  • OCRScreen
  • OCRWindow
  • OCRArea

The first of these is the simplest. It simply scans the entire screen and returns all the text it can recognise. Of course this is also the slowest as it has to perform OCR against the entire screen. OCRWindow takes a window title and scans only the area of the screen where that window appears. This is nice and simple and a good compromise if the window isn’t too large. Finally, OCRArea can be given a rectangular screen region (X1,Y1,X2,Y2). You could use FindObject to find the coordinates of a specific UI object and pass those coordinates to OCRArea if you want to narrow things right down.
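
As a rough sketch, the three calls might look like this in a script. The parameter order shown for each OCR function is assumed from the descriptions above rather than copied from the help file, so check the command reference for your version. The Notepad window title and the variable names are only placeholders:

//Assumed syntax: OCRScreen>result_variable
OCRScreen>allText

//Assumed syntax: OCRWindow>window_title,result_variable
OCRWindow>Untitled - Notepad,winText

//Narrow things down to one object: FindObject returns its rectangle,
//which is then passed to OCRArea
//Assumed syntax: OCRArea>X1,Y1,X2,Y2,result_variable
GetWindowHandle>Untitled - Notepad,hWndParent
FindObject>hWndParent,Edit,,1,hWnd,X1,Y1,X2,Y2,result
OCRArea>X1,Y1,X2,Y2,editText

//Display what was recognised in the edit area
MessageModal>editText

The smaller the region, the less work the OCR engine has to do, so the FindObject and OCRArea combination is likely to be the quickest of the three.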

The Text Capture Functions

Macro Scheduler includes some Text Capture functions which can be used to extract text from a given window, rectangular screen area or screen point.  These functions use low level system hooks which monitor applications calling the various “TextOut” functions that Windows uses to output text to the screen.  By doing so they are able to capture this text.  The Text Capture functions return the text to  a variable which you can then use as needed.  

However, a few applications don’t use the Windows built-in functions to create and output text. Don’t worry – most do, but a few use their own techniques. Text on the screen is ultimately just a pattern of small dots, so if the application programmer decided to build their own routine to assemble text from dots, rather than calling the Windows functions which already do that, the Text Capture functions have nothing to hook and you’re not going to be able to capture the text.

The text capture functions and their limitations are explained in the first article linked below. The second link is an example application, created with Macro Scheduler, which you can use to determine whether or not the text you want to capture can be captured using the text capture functions.

http://www.mjtnet.com/blog/2007/12/12/capturing-screen-text/
http://www.mjtnet.com/blog/2008/01/03/screen-scrape-text-capture-example/

Using the Clipboard for Screen Scraping

If the text you want to capture is selectable then you can use the clipboard to retrieve it. A Macro Scheduler macro can send the keystrokes necessary to highlight and copy the text to the clipboard and then use the GetClipboard function to retrieve that text to a variable. This is far less elegant than using the Text Capture functions but might be necessary if the application concerned is not utilising any of the Windows “TextOut” functions to create the text.

SetFocus>Notepad*
//Select ALL
Press CTRL
Send>a
Release CTRL
//Copy to clipboard
Press CTRL
Send>c
Release CTRL

//Get and display the data
WaitClipboard
GetClipboard>theData
MessageModal>theData

March 15, 2018

Saving You Time and Money With Robotic Process Automation

Filed under: Automation,General,Macro Recorder — Marcus Tettmar @ 3:24 pm

Before Macro Scheduler existed I was a junior member of an IT department. As part of my job I built small tools to automate specific tasks. At the time I was using VB. Each time I had to automate a task I had to reinvent the wheel a little.

The whole point of Macro Scheduler was to simplify the task of building automation routines, or software robots. To avoid having to reinvent the wheel.

There’s no reason why you can’t automate your Excel-to-SAP/WEB/ERP/ACME-Desktop-App process by writing code in C# or C++. But it’s going to take you longer than using Macro Scheduler. Macro Scheduler functions like “SendText” and “UISetValue” encapsulate some pretty low level and quite convoluted code. The code wizards and macro recorders which help you use them are even more complicated.

One of the main purposes of Macro Scheduler is therefore to enable people to automate things more quickly and more easily than could be done with traditional programming tools. It makes automation possible for non-programmers, and it also simplifies and speeds up automation for developers.

Over the last 21 years – yes 21, Macro Scheduler was first launched in 1997! – we’ve helped people with a lot of automation tasks. We offer consultancy and have been into people’s offices and also helped over the phone and remote desktop. For most routines, creating the macro takes us a few hours. Some take a day or two. Rarely do we need to spend more than three days on one process, though there are some projects which involve a series of automation routines that may therefore take longer or be done over a few sessions.

To do the job from scratch with C#, C++ or VB might take weeks. Many people who approach us seem to be imagining that to automate their task may take days or weeks. They are often very surprised when we tell them it’s a few hours not a few days.

We are all about saving time and money. That’s what our tools do and it’s why we built them. Our tools mean you don’t have to pay developers lots of money to spend weeks or months building custom solutions.

The only downside to this from our point of view is that we routinely disappoint large consultancy businesses and potential partners who are used to selling IT contracts worth hundreds of thousands. They approach us thinking that there’s a huge opportunity and that we’ll pay them a large cut of a big consultancy project.

But rarely does a job require so much time, and that’s the reason we’re here. Sometimes I sense these people want us to “flesh” projects out. They think we’re “too good” at what we do. We should slow down, make things more complicated and thus charge for more time.

But that isn’t us. That’s not why we’re here or why we created Macro Scheduler. The whole ethos of Macro Scheduler and MJT Net Ltd is to save time, to find more efficient, less expensive ways of doing things that were once thought impossible or too expensive to do, and to enable people to automate without specialised knowledge.

That said we’re happy to work with companies on an on-going basis. I have found that most businesses approach us with one specific routine in mind and then when they see what can be done they realise how it can be applied across the organisation in other departments, for other teams. Saving a team one hour a week may not seem like much, but do that for 500 other teams and it adds up to a huge efficiency saving for the entire business.

If you would like to talk to us to find out how we could help your business, whether on an ongoing basis or just for a one-off job, please drop me a line here.

March 2, 2017

Demo: Automating Data Entry from Excel to Web

Filed under: Automation,Tutorials — Marcus Tettmar @ 8:24 am

Here’s a little video Dorian has just put together to demonstrate the automation of data transfer from Excel into a web form:

September 15, 2016

Finding Window Titles You Cannot See

Filed under: Automation,Scripting — Marcus Tettmar @ 11:33 am

Someone emailed today saying they were having problems trying to automate Internet Explorer 11 because it didn’t seem to have a window title.

Actually IE11 does have a window title. Each tab has a different window title. But you don’t see the title in the main title bar of the application.

By default applications show the window title in the title bar. Hence its name. But some apps manipulate the appearance of their title bar so that it doesn’t look like a regular Windows title bar. Indeed some apps have all borders removed so that you can’t SEE the title bar. But in all cases, the window will still have a title (unless it’s an empty string!).

So if you can’t see the window title, how do you find out what it is? Well, with Macro Scheduler there are several ways to find it. One is with the View System Windows Tool, which shows a list of all the windows currently available on the system, showing their captions and class names. Another is to use the Code Builders.

Here’s a video demonstrating these two methods. It also shows how I use a substring window match:

August 17, 2016

OnEvent – Dealing with Indeterminate Dialogs – [Repost]

Filed under: Automation,Scripting — Marcus Tettmar @ 3:16 pm

This is a repost. The original article is here.

Most of the time when we are automating a process we are able to predict the sequence of events. We are working with a deterministic process and a linear flow of actions. But there are occasions when things can happen during a process that we cannot predict precisely. E.g.:

  • We might know that a dialog or window may appear sometime during the process, but we cannot predict exactly when that will happen.
  • We may have a situation where after entering some data into a text field a dialog may, or may not appear.
  • There might be some other software running on the PC which randomly pops up an error box. And we need a way to clear that when it happens.

There are a number of ways we can deal with such situations.

Window Event Schedules

If you have a situation where a known window can randomly appear – say a known error box – which always has the same window title, the simplest approach is to use the Window Event schedule in the Advanced Scheduling properties. Simply create a macro which closes the box – perhaps all it has to do is press enter – and specify the window title under Advanced Options in the macro properties. Then whenever Macro Scheduler sees this window it will run the macro and clear it.

Synchronous Coding

In the case where a window may, or may not appear after entering some data into a field, say a data validation dialog, we could just deal with this after sending the text, in regular fashion – something like:

Send>the_data
Wait>0.5
IfWindowOpen>Verification Alert
  Press Enter
Endif

So we simply send the data then IF the verification window appears, close it. But what if you have hundreds of data fields to enter? Dealing with each one would involve a lot of extra code.
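
One way to keep the synchronous approach manageable, offered here only as a sketch, is to move the send-and-check into a subroutine and call it once per field. The subroutine name SendAndCheck and the field values are hypothetical, and pressing Tab assumes that Tab moves to the next field in your application:

//Send one field value, then clear the verification dialog if it appears
SRT>SendAndCheck
  Send>the_data
  Wait>0.5
  IfWindowOpen>Verification Alert
    Press Enter
  Endif
  //Assumption: Tab moves to the next field
  Press Tab
End>SendAndCheck

//Call the subroutine once for each field
Let>the_data=first value
GoSub>SendAndCheck
Let>the_data=second value
GoSub>SendAndCheck

This removes the repeated code, but the script still only checks for the dialog at the points where it calls the subroutine.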

OnEvent Window Event Handlers

Another way is to use the OnEvent function to create an event handler in your main script. There are three types of window events that can be monitored with OnEvent:

  • WINDOW_OPEN – monitors a specific known window title, or window title substring
  • WINDOW_NOTOPEN – fires the event handler when the specified window closes
  • WINDOW_NEWACTIVE – fires the event handler when there’s a new foreground window

OnEvent is used to create an “event handler” which is just a subroutine which will be executed whenever the event occurs. So, for example, using OnEvent you can tell the script to run a subroutine whenever a specified window appears, whenever that may be, while the rest of the script is executing.

So let’s say we are working with an application which could, at any time, pop up a warning box titled “Connection Error”, and this can be cleared just by pressing enter to hit the default OK button:

OnEvent>WINDOW_OPEN,Connection Error,2,CloseWarning

..
.. rest of script here
..

SRT>CloseWarning
  Press Enter
End>CloseWarning

Of course there are a whole load of other things you can do. We may have a window whose title is always the same but the content differs and we need to react according to the content. In this case our event handler subroutine would have extra code in it to determine which type of dialog it is. We might do this using the text capture functions to read the text from the dialog, or using Screen Image Recognition to check for the presence of an object.

Maintaining Focus

Here’s an idea for an event handler which ensures the target application is always focused. If another application should steal focus at any point during the running of the script, it just grabs focus back again. It’s always good advice to use SetFocus before sending text. But if you have thousands of Send commands and want to slim down your script and make it more readable you could use this approach. Anyway, it’s just an example:

.. your code here to start and focus the app you want to automate, e.g.:
Run>Notepad.exe
WaitWindowOpen>Untitled - Notepad

//assuming the target window is now focused, get its handle and process name
Let>WIN_USEHANDLE=1
GetActiveWindow>MyWindowHandle,x,y
GetWindowProcess>MyWindowHandle,pid,MyProcessName
Let>WIN_USEHANDLE=0

//now set up the event that is fired when a new window appears
OnEvent>WINDOW_NEWACTIVE,0,0,HandleNewWindow

..
..
.. rest of script here
..
..

//When a new window that does not belong to our process appears,
// set focus back to our window
SRT>HandleNewWindow
  Let>WIN_USEHANDLE=1
  GetActiveWindow>hwnd,x,y
  GetWindowProcess>hwnd,pid,winProcName
  If>winProcName<>MyProcessName
     SetFocus>MyWindowHandle
  Endif
  Let>WIN_USEHANDLE=0
End>HandleNewWindow

Note how this code gets the window handle and process name of your target window. Then whenever a new window appears the HandleNewWindow subroutine is fired which gets the process name of the active window. If the process name of the new active window is not the process name of your target window (i.e. the new window belongs to some other application) it sets focus back to your original window.

I hope this gives you a useful introduction to OnEvent event handlers and how they can be used to run code at any point during the script in response to events. OnEvent can also be used to detect files, dialog events, dialog changes and keyboard and mouse actions. For further information please see OnEvent in the help file.

August 5, 2016

Capture Screen Text using OCR

Filed under: Automation,General,Scripting — Marcus Tettmar @ 3:37 pm

Here’s a way to get screen text from any application – even from an image – using OCR and a free open source tool called Tesseract.

First, you need to download and install Tesseract. You can get it here.

Tesseract is a command line utility. The most basic syntax is:

tesseract.exe input_image_file output_text_file

So you could call it from a Macro Scheduler script something like this:

//Capture screen to bmp file - you could instead capture only a window or use FindObject to get coordinates of a specific object
GetScreenRes>X2,Y2
ScreenCapture>0,0,X2,Y2,%SCRIPT_DIR%\screen.bmp

//run tesseract on the screen grab and output to temporary file
Let>RP_WAIT=1
Let>RP_WINDOWMODE=0
Run>"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe" "%SCRIPT_DIR%\screen.bmp" "%SCRIPT_DIR%\tmp"

//read temporary file into memory and delete it
ReadFile>%SCRIPT_DIR%\tmp.txt,theText
DeleteFile>%SCRIPT_DIR%\tmp.txt

//Display the text in a message box 
MessageModal>theText

This example simply captures the entire screen. You probably wouldn’t normally want to do this. Instead you could capture a specific window:

//Capture just the Notepad Window
SetFocus>Untitled - Notepad
GetWindowPos>Untitled - Notepad,X1,Y1
GetWindowSize>Untitled - Notepad,w,h
ScreenCapture>X1,Y1,{%X1%+%w%},{%Y1%+%h%},%SCRIPT_DIR%\screen.bmp

Or even a specific object:

//capture just the editor portion of notepad ... 
SetFocus>Untitled - Notepad
GetWindowHandle>Untitled - Notepad,hWndParent
FindObject>hWndParent,Edit,,1,hWnd,X1,Y1,X2,Y2,result
ScreenCapture>X1,Y1,X2,Y2,%SCRIPT_DIR%\screen.bmp

Either way you then have a screen bitmap you can pass into Tesseract.

Once you’ve retrieved the text you would probably want to parse it, using e.g. RegEx. Here’s an article on a RegEx expression useful for parsing out data.
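
As a minimal sketch of that parsing step, suppose the recognised text contains a number you want to pull out. The sample text below is made up for illustration, and the RegEx parameter order (pattern, text, EasyPatterns flag, matches array, match count, replace flag) is stated here as an assumption to check against the help file:

//theText would normally come from ReadFile as above; hard-coded here for illustration
Let>theText=Invoice No: 12345 Total: 99.50

//Find every run of digits; matches go into matches_1, matches_2, ...
//and the number of matches into numMatches
RegEx>\d+,theText,0,matches,numMatches,0

If>numMatches>0
  MessageModal>First number found: %matches_1%
Endif

OCR output is rarely perfect, so it is worth keeping patterns tolerant of stray spaces and misread characters.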

August 4, 2016

How to Run an Access Macro from Macro Scheduler

Filed under: Automation,General,Scripting — Marcus Tettmar @ 4:57 pm

Recently someone asked in the forums how to “Automatically Detect MS Office Install Location” so that they could run an Access macro.

Well, there are ways to get the path of an installed Office application, but it isn’t necessary in order to run an Access macro. This is a rehash of my forum answer:

You can run an Access macro via the command line using the /x switch. The ExecuteFile command lets you pass parameters. So you could just do this:

ExecuteFile>%USERDOCUMENTS_DIR%\MyDb.accdb,/x Macro1

This will open the DB and run macro “Macro1”. Note my DB is in my documents folder here so I’m just using USERDOCUMENTS_DIR but this could be any path.

Here’s a list of other command line switches.

For more control you could use VBScript:

VBSTART
  Sub RunMacro(accessfile,macroname)
    dim accessApp
    set accessApp = createObject("Access.Application")
    accessApp.OpenCurrentDataBase(accessfile)
    'comment next line out if you don't want access to be visible
    accessApp.visible = true
    accessApp.DoCmd.RunMacro macroname
    'you can run a subroutine or function in module code instead if you want:
    'accessApp.run "routinename"
    accessApp.Quit
    set accessApp = nothing
  End Sub
VBEND

VBRun>RunMacro,%USERDOCUMENTS_DIR%\MyDb.accdb,Macro1

This gives you more control – you could make it invisible, and as you can see you could run VBA code instead if you want – or access any of the other methods. Anything you can do inside Access you can do here – by converting VBA to VBScript:

http://help.mjtnet.com/article/19-converting-office-vba-to-vbscript

But if you do really want to get the path, how about querying the mime-type in the registry:

RegistryReadKey>HKEY_CLASSES_ROOT,ms-access\shell\open\command,,accPath
ExtractFilePath>accPath,accPath

Enjoy!

July 12, 2016

Why it’s Good to Automate [Repost]

Filed under: Automation,General — Marcus Tettmar @ 3:51 pm

This is a repost from 2006.

One of the best ways to learn to use a software product fully is to try to automate it. Testers and automators have to learn the software’s interface really well, possibly better than the people who wrote it. Ok, the developers know the algorithms better than anyone else, but it is the person automating it who knows the ins and outs, pitfalls and quirks of the interface.

We all know how badly designed some Windows programs are. And in these days of fancy hi-res graphics and snazzy toolbar buttons it’s easy for the designers to forget about shortcut keys and keyboard navigation. The most productive way to use a PC is to forget the mouse and learn the keyboard shortcuts. You can get things done much more quickly. Yet even the most experienced Windows users don’t know half the keyboard shortcuts that exist in Windows (tips for keyboard navigation in Windows could be a post for another day).

Knowing these shortcuts makes automation so much easier and more reliable. Automating an application by sending mouse events and mouse clicks is unreliable and depends on the screen resolution never changing. Although you can use relative mouse coordinates, sooner or later something is going to change and the button you want to click is not in the place it was when the script was created.

The automation/test engineer is the one who figures out the keyboard shortcuts and finds the simplest, most reliable way of navigating an application. People who automate applications regularly have a good understanding of the different ways to move around Windows and Windows applications. Automated Software Testing can help find issues in the interface just from the process of building the automated test, even before the test script has been run. Building an automation routine for an application will help you find those missing or duplicated shortcut keys and other objects that can’t be driven by the keyboard.

Automators spend so much time fiddling with the software’s interface that they will often become more knowledgeable than the “power-users”. Testers also have the great advantage of being allowed to try unusual scenarios that developers never think about or are too busy to try. They are allowed to break things!

So it goes both ways. Find out the Windows keyboard shortcuts and the hot-keys for the application you’re scripting and you can create a better script. Build an application with good keyboard support and your application can be automated more easily. If it can be automated easily it will be easy to use!

November 24, 2015

Finding HTML Attributes For Automating Web Sites

Filed under: Automation,Scripting,Web/Tech — Marcus Tettmar @ 1:00 pm

If you’re new to automating IE/websites with WebRecorder or the native Macro Scheduler IE functions you may be wondering how to determine which elements and attributes to use.

In this video I demonstrate how to use IE’s F12 key to invoke Developer Tools and use that to quickly find the elements we’re interested in and the attributes we need to use:

(You might want to click on the video toolbar to select a larger resolution size, view full screen or view on YouTube so that you can see the code).

February 6, 2015

Awesome Image Recognition

Filed under: Announcements,Automation — Marcus Tettmar @ 4:36 pm

We received this email today and I just had to share it …

Macro Scheduler image recognition just saved me 6 hours of mind numbing work and my program took me about 15 minutes to write, test, and start using. Such an AWESOME solution you provide.

Richard A. San Jose, CA USA

For more info on how to use Image Recognition see:
http://help.mjtnet.com/article/39-how-to-use-macro-schedulers-image-recognition-functions
