Quickly searching large text files?

Technical support and scripting issues

Moderators: Dorian (MJT support), JRL

horoscopes2002

Post by horoscopes2002 » Sat Nov 16, 2002 6:20 am

Hi, I just upgraded to the (so far) rather impressive 7.1 Pro, and I need to find a *fast* way of searching large text files for particular strings.

I am using a variation of a macro I found in the archive, but it takes almost 10 minutes to search a 1.2MB text file. That's not much of a solution when some of the logs are 60MB+.

What is the best way of doing this? Is there any kind of grep command?

Below is the script I am using. It searches each line for a 9 character string, and if it's successful it writes the line to another file then goes on to the next line. If it doesn't find the string, it goes on to the next line anyway, and will do so until it reaches the end of file.....

let>filename=C:\My Documents\logfiles\2002-11-15.log
message>SEARCHING FOR RECEIPTS.....

Let>l=1
label>q
Label>nextline
Let>l=l+1
ReadLn>filename,l,Lotto
If>Lotto=##EOF##,End
Length>Lotto,llen
// Started at 44 because the text string I am looking for is never less than 44 characters in
Let>K=44
Label>Start
Let>K=K+1
If>K>llen,nextline
MidStr>Lotto,K,9,num
If>num=cbreceipt,dash
Goto>Start
Label>dash
Message>Receipt found in line %l%
WriteLn>c:\my documents\output.txt,result,Line %l% : %Lotto%
Goto>nextline
Label>End
message>finished

Dorian (MJT support)
Automation Wizard
Posts: 1348
Joined: Sun Nov 03, 2002 3:19 am

Post by Dorian (MJT support) » Sat Nov 16, 2002 6:56 am

Hmm. The board seemed to forget I am me. Just posting here so I can check the email notify box this time around.

aquatech
Site Admin
Posts: 8
Joined: Fri Sep 20, 2002 12:23 am
Location: Australia

Post by aquatech » Sat Nov 16, 2002 11:22 am

G'day Horoscopes,

My suggestion, if you have the Pro version, would be to use some VBScript code, hook the FileSystemObject library, and use that to do your searching.

I think there is an example in the archives/scripts that does something like that.

HTH.

Post by Dorian (MJT support) » Sat Nov 16, 2002 6:54 pm

Thanks Tim. I'll have to look through the archives and see if I can find it. I am totally lost with VB. Even though I bought the developer kit 2 yrs ago, I haven't had time to learn it. :(

Ernest

Post by Ernest » Mon Nov 18, 2002 9:22 am

Hi,
You're right. The perfect tool is "grep". Use grep for Windows from "Cygwin" (check http://www.google.com for it).

Run Program>%COMSPEC% /c grep [parameters] "StringOrPatternToFind" YourPath\YourInputFile(s).* > YourOutputFile(s).txt

Afterwards check the output/result within MSched.

Ciao :)
Ernest
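
As a rough illustration of the grep approach suggested above, here is a minimal sketch that could be run from a POSIX shell such as the bash that ships with Cygwin. The sample lines and file names are made up for the example; only the search string cbreceipt comes from the original macro. The -n option gives the line numbers that the macro tracked by hand, and -F searches for plain text rather than a pattern.

```shell
#!/bin/sh
# Sketch only: the sample data and file names are hypothetical.
workdir=$(mktemp -d)
log="$workdir/sample.log"
out="$workdir/output.txt"

# Build a tiny stand-in for the real log file.
printf '%s\n' \
  'first line, nothing of interest' \
  'second line with cbreceipt=12345 in it' \
  'third line, nothing of interest' \
  'fourth line, another cbreceipt entry' > "$log"

# -F treats the search text as a fixed string rather than a pattern;
# -n prefixes each matching line with its line number.
grep -F -n 'cbreceipt' "$log" > "$out"

# Show the matching lines that were written to the output file.
cat "$out"
```

Each output line has the form number:text, which is close to the "Line %l% : %Lotto%" format the macro writes. The same grep options should work inside a Run Program line, provided grep is on the PATH.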



Post by Dorian (MJT support) » Mon Nov 18, 2002 5:48 pm

Cheers Ernest,

I found a couple of Windows grep tools as per your suggestion. Now to test them...

Bob Hansen
Automation Wizard
Posts: 2475
Joined: Tue Sep 24, 2002 3:47 am
Location: Salem, New Hampshire, US

Post by Bob Hansen » Thu Nov 21, 2002 3:51 am

Don't forget the old standard DOS commands! FIND still works to search for a string. Remember that FIND is case sensitive by default (the /I switch makes it ignore case). The /c switch gives a count of the lines that contain the string.

This routine redirects a single line output to a text file. The result ends up on the second line of that text file.

By combining command /c with Run Program, many of these old commands can be run without using a batch file.


Here is a short macro that prompts for a string and lets you know if it is found.
==============================
//Find a string in a file
Let>Source=c:\temp\yourfile.txt

//Prompt the user for string
Input>Seek,Enter the case sensitive string to search for:

// FIND /c returns a line with the number of lines that had the string. The output is redirected to the second line in a text file. Remember to include the string inside quotes.
Run Program>command.com /c find /c "%Seek%" %Source% > c:\temp\found.txt

//Parse the second line to find the position of the count number
ReadLn>c:\temp\found.txt,2,found
Position>t:,%found%,1,StartPos
Let>start=%StartPos%+3

//If the number is anything other than 0, then the string was found
MidStr>%found%,%start%,1,Value
If>%Value%>0,GotIt
Message>%Seek% Not Found.%CRLF%%CRLF%Remember, search is CASE Sensitive.
Goto>End

Label>GotIt
Message>Found %Seek% !

Label>End
===================================

I know you had a solution, but there is usually more than one way to solve a problem. This is just another approach. Good luck.

Post by Dorian (MJT support) » Thu Nov 21, 2002 6:35 am

Hi Bob,

" Run Program>command.com " , that's pretty clever! In fact, I think it's even better than the solution I came up with. I found I could use a macro to write a batch file which then searched my text file and wrote the output to another file. The batch file took only a second or so to search a 176MB text file.

The thread is here :
http://www.mjtnet.com/forum/viewtopic.php?p=224#224

I think using your solution though, I may be able to achieve the same thing with fewer stages. I'll play with them both and see what happens.

I'm really starting to get somewhere with this now, making a little more progress every day, fitting it in between my "normal" jobs. :D

Post by Bob Hansen » Thu Nov 21, 2002 6:59 am

I used to call batch files from Macro Scheduler but then I learned the advantage of the log files.

So I have gradually changed all my batch files to macros to get the logging tools. Having a time stamp for every line in the batch file execution has been invaluable in troubleshooting.

And Run Program>command really was a big discovery to help make that easier.

And thanks to Ernest and your results, I will now have to look into the "grep" that I am not familiar with.

There's always another way to do it!
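
For anyone coming to grep from FIND, a small sketch of the rough equivalents may help. It assumes a POSIX shell (for example Cygwin's bash), and the sample file and its contents are invented for the illustration. grep -c prints only the number of matching lines, much like FIND /c, but without the file name header, so there is nothing to parse. Adding -i makes the search ignore case.

```shell
#!/bin/sh
# Sketch only: the sample file and its contents are hypothetical.
workdir=$(mktemp -d)
log="$workdir/sample.log"
printf '%s\n' 'Receipt OK' 'no match here' 'receipt pending' > "$log"

# -c prints only the number of matching lines (compare FIND /c).
count=$(grep -c -F 'receipt' "$log")

# -i additionally ignores case, so "Receipt" is counted as well.
count_nocase=$(grep -c -i -F 'receipt' "$log")

echo "case-sensitive: $count"
echo "ignoring case:  $count_nocase"

# The count can be tested directly, with no string parsing needed.
if [ "$count" -gt 0 ]; then
  echo "Found"
else
  echo "Not found"
fi
```

Redirecting the count to a file, as in the FIND macro above, would leave a file whose first line holds just the number.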
