text recognition

macroman
Pro Scripter
Posts: 91
Joined: Mon Jun 02, 2014 5:32 am

text recognition

Post by macroman » Sun Jun 15, 2014 11:27 pm

I tried text recognition with Firefox and it doesn't work, period. Image recognition has a better success rate. How can I just find a certain piece of printed text on a site and then run my If/Then there? It's just not possible in the current version.

CyberCitizen
Automation Wizard
Posts: 721
Joined: Sun Jun 20, 2004 7:06 am
Location: Adelaide, South Australia

Re: text recognition

Post by CyberCitizen » Mon Jun 16, 2014 3:58 am

Marcus Tettmar wrote: The GetText functions work with many Windows apps: they will work with any Windows app that uses the Windows TextOut set of functions to render text. But not all apps do, so it won't work with all apps. E.g. it doesn't currently read text from Chrome, which uses proprietary text rendering methods. I don't know about Firefox, but it could be the same. If you use the wizard and point it at e.g. the desktop, Notepad and Explorer, you'll see that it returns text. So it's a case of suck it and see. Not all apps produce text as far as Windows is concerned. This topic has been dealt with on my blog a number of times:

http://www.mjtnet.com/blog/2009/01/23/s ... scheduler/
FIREFIGHTER

mightycpa
Automation Wizard
Posts: 343
Joined: Mon Jan 12, 2004 4:07 pm
Location: Vienna, VA

Re: text recognition

Post by mightycpa » Sat Oct 18, 2014 12:59 am

I had the same problem, but in IE, because of a particular application running some kind of Java or something. I found two solutions for Firefox:

Tesseract OCR. In your script you grab an image of the text, OCR it to a file, and read the file. If the text is too small, press Ctrl + to zoom in until it gets big enough. The only proviso is that the text should be black on a white background for the best character recognition.

In Firefox, click Options > Content > Colors, select black text on a white background, and uncheck the box that allows web pages to use their own colors.

Then you're in business.
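For anyone who wants to try the OCR route outside of a macro first, here is a minimal sketch in Python. It assumes the `tesseract` command-line program is installed and on the PATH, and that a screenshot of the text has already been saved; the function names and file names are made up for illustration and are not part of Macro Scheduler or Tesseract.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base):
    # Tesseract's CLI takes an input image and an output base name;
    # it writes the recognized text to "<output_base>.txt".
    return ["tesseract", str(image_path), str(output_base)]


def ocr_image_to_text(image_path, output_base):
    # Run the OCR step, then read back the text file Tesseract produced.
    subprocess.run(build_tesseract_command(image_path, output_base), check=True)
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")


def contains_phrase(ocr_text, phrase):
    # OCR output often has stray line breaks and inconsistent case,
    # so compare on collapsed whitespace and lowercase.
    def normalize(s):
        return " ".join(s.split()).lower()

    return normalize(phrase) in normalize(ocr_text)


if __name__ == "__main__":
    # Hypothetical file names: "shot.png" is the saved screenshot.
    text = ocr_image_to_text("shot.png", "shot_ocr")
    if contains_phrase(text, "Order complete"):
        print("Found it")
```

The same three steps (grab image, OCR to file, read file) are what the script would do; only the last check is where your own If/Then goes.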
"A facility for quotation covers the absence of original thought." - Lord Peter Wimsey

mightycpa
Automation Wizard
Posts: 343
Joined: Mon Jan 12, 2004 4:07 pm
Location: Vienna, VA

Re: text recognition

Post by mightycpa » Sat Oct 18, 2014 11:58 pm

I have two other suggestions for you.

Ctrl-A then Ctrl-C, which will copy all visible text in the window to the clipboard. You can parse through that and do what you want.

Or

Ctrl-U, which will open the page source in a separate window. That way, if you need to search both text and tags, it's all in there.
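Once you have either the copied text or the page source, the "parse through that" part could look roughly like this. It is only a sketch in Python using the standard library; the class and function names are hypothetical, and it assumes you have already got the text or source into a string (from the clipboard or a saved file).

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes from page source, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def page_text(source):
    # Reduce page source (the Ctrl-U route) to plain text.
    parser = VisibleTextExtractor()
    parser.feed(source)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def find_phrase(text, phrase):
    # Works on copied text (the Ctrl-A, Ctrl-C route) or on page_text() output.
    return phrase.lower() in " ".join(text.split()).lower()


if __name__ == "__main__":
    source = "<html><style>p{}</style><p>Order <b>complete</b></p></html>"
    print(find_phrase(page_text(source), "order complete"))
```

If you need to match on tags as well as text, search the raw source string directly instead of running it through `page_text` first.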
"A facility for quotation covers the absence of original thought." - Lord Peter Wimsey
