{"id":210,"date":"2007-10-08T07:56:01","date_gmt":"2007-10-08T07:56:01","guid":{"rendered":"http:\/\/www.mjtnet.com\/blog\/2007\/10\/08\/screen-ocr-to-retrieve-otherwise-undetectable-text\/"},"modified":"2007-10-08T07:56:01","modified_gmt":"2007-10-08T07:56:01","slug":"screen-ocr-to-retrieve-otherwise-undetectable-text","status":"publish","type":"post","link":"https:\/\/www.mjtnet.com\/blog\/2007\/10\/08\/screen-ocr-to-retrieve-otherwise-undetectable-text\/","title":{"rendered":"Screen OCR to Retrieve Otherwise Undetectable Text"},"content":{"rendered":"<p>Some while ago I wrote <a href=\"http:\/\/www.mjtnet.com\/blog\/2006\/06\/06\/screen-ocr-recognising-graphical-text\/\">this post<\/a> about how to script a tool called Textract to perform OCR against the screen.   Well <a href=\"http:\/\/www.mjtnet.com\/usergroup\/viewtopic.php?t=4145\">gpulawski has just posted this tip<\/a> in the forums pointing out that Microsoft Office 2003\/2007 comes with something called MODI (Microsoft Office Document Imaging) which can OCR a bitmap or TIFF file and is scriptable.  Very cool.<\/p>\n<p>We can call MODI transparently from a Macro Scheduler script to OCR a window and extract its text.  I&#8217;ve <a href=\"http:\/\/www.mjtnet.com\/usergroup\/viewtopic.php?t=4147\">posted an example here<\/a>.<\/p>\n<p>This is really useful if you have a window that doesn&#8217;t expose its text as text objects and you can&#8217;t get at it any other way (e.g. via the clipboard or controls).  You may want to check whether a certain string exists in a window, or wait for a particular word or phrase to appear to determine when a process has completed.  In those cases, OCR may be the solution.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Some while ago I wrote this post about how to script a tool called Textract to perform OCR against the screen. 
Well gpulawski has just posted this tip in the forums pointing out that Microsoft Office 2003\/2007 comes with something called MODI (Microsoft Office Document Imaging) which can OCR a bitmap or TIFF file and [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[4,6],"tags":[],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/www.mjtnet.com\/blog\/wp-json\/wp\/v2\/posts\/210"}],"collection":[{"href":"https:\/\/www.mjtnet.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.mjtnet.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.mjtnet.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.mjtnet.com\/blog\/wp-json\/wp\/v2\/comments?post=210"}],"version-history":[{"count":0,"href":"https:\/\/www.mjtnet.com\/blog\/wp-json\/wp\/v2\/posts\/210\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.mjtnet.com\/blog\/wp-json\/wp\/v2\/media?parent=210"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.mjtnet.com\/blog\/wp-json\/wp\/v2\/categories?post=210"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.mjtnet.com\/blog\/wp-json\/wp\/v2\/tags?post=210"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}