Freebie OCR for screenshots and TIFF/BMP Files

 
gpulawski
Posted: Mon Oct 08, 2007 12:31 am

I looked into a couple of commercial SDKs for some limited OCR of screenshot bitmaps, for instance to verify information in a window or to check image files for certain text content. They were quite expensive and required royalty fees. Then I recalled that Microsoft Office has built-in OCR. It is in Office Tools from version 2003 onward, as Microsoft Office Document Imaging: START > ALL PROGRAMS > MICROSOFT OFFICE > MICROSOFT OFFICE TOOLS. The component is called MODI and lives in MDIVWCTL.DLL. It is scriptable!

MODI understands TIFF, multi-page TIFF, and BMP. You can search MSDN under MODI for objects and methods.

Here's some example code to open a TIF or BMP file, OCR the first page, and then put the text result into a single string in a message box.
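As a rough illustration of the same steps (open the file, OCR the first page, collect the text into one string), here is a minimal Python sketch. It is not the poster's script. It assumes the pywin32 package and an installed MODI, and it assumes the COM names MODI.Document, Create, Images.Item, OCR, Layout.Text and Close behave as described in the MODI documentation; the function name ocr_first_page and its dispatch parameter are made up for this example.

```python
def ocr_first_page(path, lang_id=9, dispatch=None):
    """Return the OCR text of the first page of a TIFF/BMP file via MODI.

    lang_id 9 is the English language ID mentioned later in this thread.
    dispatch is a hypothetical hook so a stand-in object can be supplied;
    by default the real COM factory from pywin32 is used.
    """
    if dispatch is None:
        # Imported lazily so the function can be defined without pywin32.
        from win32com.client import Dispatch as dispatch
    doc = dispatch("MODI.Document")
    doc.Create(path)  # load the image file into a MODI document
    try:
        page = doc.Images.Item(0)  # first page only
        page.OCR(lang_id)          # run recognition on that page
        return page.Layout.Text    # recognized text as a single string
    finally:
        doc.Close(False)  # close without saving the OCR results
```

Calling print(ocr_first_page(r"C:\temp\scan.tif")) would then show the text, where the path is only a placeholder.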



MODI also has its own search methods, document viewer, etc. I think Office 2007 and Vista do not install it by default like 2003 does. (I don't own Vista.)

mtettmar
Posted: Mon Oct 08, 2007 7:37 am

Hi,

Thanks for this, this is awesome. MODI (Microsoft Office Document Imaging) is installed with my Office 2007 installation under Vista.

Here's my version of your code that uses OCR to get the text of the active window. Uncomment the comment block to OCR the entire screen:


_________________
Regards,
Marcus Tettmar
http://mjtnet.com/blog/ | http://twitter.com/marcustettmar
Please do not email/PM me for private support - post to the forum so that everyone benefits. For private support please send email via our web site.

gpulawski
Posted: Tue Oct 09, 2007 2:04 am

Yup, I can think of a lot of possibilities. It doesn't compare in accuracy to OmniPage or ABBYY, but it's good enough for a lot of interesting automation/Macro Scheduler ideas. (And just about everybody has it installed already.)

GJP

pzelenka
Posted: Wed Oct 24, 2007 3:46 pm

This is very cool.

I used this in a script to process Windows exception dialog boxes whose text was otherwise impossible to get. Once the exception text has been read, the script can branch accordingly.
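Branching on OCR'd dialog text might look something like the Python sketch below. It is only an illustration: the phrases, the action names and the function names are hypothetical, and the text is normalized first because OCR output tends to contain stray line breaks and inconsistent case.

```python
import re

def normalize(text):
    """Lower-case the text and collapse whitespace so OCR line breaks don't matter."""
    return re.sub(r"\s+", " ", text).strip().lower()

# Hypothetical phrases and action names; replace them with the dialogs you actually see.
RULES = [
    ("access violation", "restart_app"),
    ("file not found", "skip_file"),
]

def choose_action(ocr_text, rules=RULES, default="log_and_stop"):
    """Return the action for the first rule whose phrase appears in the OCR text."""
    cleaned = normalize(ocr_text)
    for phrase, action in rules:
        if phrase in cleaned:
            return action
    return default
```

A substring match is used rather than an exact comparison, since the recognized text is rarely character-perfect.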

Thank you very much!!

mtettmar
Posted: Wed Oct 24, 2007 4:38 pm

Just in case you weren't aware: with most Windows message boxes, if you hit CTRL+C the content of the dialog is copied to the clipboard.

pzelenka
Posted: Wed Oct 24, 2007 5:42 pm

I gave CTRL+C a try, and also tried the GetWindowText and GetControlText approaches, all without success.

When an exception occurs, you are presented with warning dialog boxes whose descriptions differ, usually depending on the module, routine or dictionary. The most pressing concern is handling those warning dialog boxes properly.

For these dialog boxes the window title is available, but the descriptive content was out of reach until I used the screen capture and OCR approach.

Thanks

Semper
Posted: Wed Feb 27, 2008 11:11 am

Hi,

Any idea why I can't run the MODI script without getting the error "OCR:bad language" at the line miDoc.Images(0).OCR?

Thanks for the feedback.

Semper
Posted: Sun Mar 02, 2008 11:56 am

I figured it out. The language ID parameter was missing, and it is needed because I'm on a multilingual machine.

miDoc.images(0).OCR(9) did the job.
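One way to guard against this in a script is to try the default call first and fall back to an explicit language ID. The Python sketch below is an assumption-laden illustration, not part of the original script: ocr_with_fallback is a made-up helper, page stands for a MODI image object such as miDoc.Images(0), and 9 is the ID that worked above.

```python
def ocr_with_fallback(page, lang_ids=(None, 9)):
    """Try OCR with each language ID in turn and return the one that worked.

    None means "call OCR() with no argument" (the system default);
    9 is the explicit ID reported to work in this thread.
    """
    last_error = None
    for lang_id in lang_ids:
        try:
            if lang_id is None:
                page.OCR()
            else:
                page.OCR(lang_id)
            return lang_id
        except Exception as exc:  # COM errors raised by pywin32 derive from Exception
            last_error = exc
    raise RuntimeError("OCR failed for every language ID tried") from last_error
```

The return value tells the caller which ID succeeded, so it can be logged or reused for later pages.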

rullbandspelare
Posted: Sat Mar 15, 2008 10:40 pm

I don't have Office 2007 and was looking for OCR. I found this one, which is free to use:
http://www.topocr.com/
It would be interesting if someone could compare the quality of its results with the previously posted example for Office 2007.

For example:



I was also looking for a barcode reader, and found this:
http://www.metois.com/Eymbarcode/index.htm
This free demo shows a splash screen every now and then but works very well.
Has anyone found anything better?

Thanks!

All times are GMT