August 5, 2016

Capture Screen Text using OCR

Filed under: Automation,General,Scripting — Marcus Tettmar @ 3:37 pm

Here’s a way to get screen text from any application – even from an image – using OCR and a free open source tool called Tesseract.

First, you need to download and install Tesseract. You can get it here.

Tesseract is a command line utility. The most basic syntax is:

tesseract.exe input_image_file output_text_file

So you could call it from a Macro Scheduler script something like this:

//Capture screen to bmp file - you could instead capture only a window or use FindObject to get coordinates of a specific object
GetScreenRes>X2,Y2
ScreenCapture>0,0,X2,Y2,%SCRIPT_DIR%\screen.bmp

//run tesseract on the screen grab and output to temporary file
Let>RP_WAIT=1
Let>RP_WINDOWMODE=0
Run>"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe" "%SCRIPT_DIR%\screen.bmp" "%SCRIPT_DIR%\tmp"

//read temporary file into memory and delete it
ReadFile>%SCRIPT_DIR%\tmp.txt,theText
DeleteFile>%SCRIPT_DIR%\tmp.txt

//Display the text in a message box 
MessageModal>theText

This example simply captures the entire screen. You probably wouldn’t normally want to do this. Instead you could capture a specific window:

//Capture just the Notepad Window
SetFocus>Untitled - Notepad
GetWindowPos>Untitled - Notepad,X1,Y1
GetWindowSize>Untitled - Notepad,w,h
ScreenCapture>X1,Y1,{%X1%+%w%},{%Y1%+%h%},%SCRIPT_DIR%\screen.bmp

Or even a specific object:

//capture just the editor portion of notepad ... 
SetFocus>Untitled - Notepad
GetWindowHandle>Untitled - Notepad,hWndParent
FindObject>hWndParent,Edit,,1,hWnd,X1,Y1,X2,Y2,result
ScreenCapture>X1,Y1,X2,Y2,%SCRIPT_DIR%\screen.bmp

Either way you then have a screen bitmap you can pass into Tesseract.

Once you’ve retrieved the text you would probably want to parse it, using e.g. RegEx. Here’s an article on a RegEx expression useful for parsing out data.