spotting IP addresses and dates in textfiles?
Moderators: Dorian (MJT support), JRL
- Dorian (MJT support)
- Automation Wizard
- Posts: 1354
- Joined: Sun Nov 03, 2002 3:19 am
- Contact:
Does anyone have any pointers for an easy way of finding an IP address or a date in a textfile? The date could be in almost any format.
- Dorian (MJT support)
Hi Ernest, that's exactly what I'm doing. I'm writing a macro which will search a weblog for a very specific string, then search backward based on those results.
I must admit I'm having a hard time getting my head round how to easily spot an IP address, or a date, and then of course all the different logfile formats complicate matters somewhat.
My experience with programming stops at Commodore 64 BASIC (so Macro Scheduler is right up my street!) but it means I've forgotten most of the "rules".
I suppose what makes an IP address unique is that it will always be 4 sets of numbers separated by 3 dots. So maybe I'll write something based on that (?).
Haven't quite figured out spotting a date yet, as they seem to be in numerous different formats.
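The "4 sets of numbers separated by 3 dots" idea can be sketched as a pattern match. This is only an illustration, written in Python rather than Macro Scheduler to show the logic. The name `find_ips` is made up for the example, and the check that each number is no larger than 255 is an added assumption, not something stated above.

```python
import re

# Four groups of 1-3 digits joined by three dots, not glued to a longer
# run of digits and dots on either side.
IP_PATTERN = re.compile(
    r"(?<![\d.])(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})(?!\d|\.\d)"
)

def find_ips(text):
    """Return every dotted quad in text whose four parts are all 0-255."""
    found = []
    for match in IP_PATTERN.finditer(text):
        # Assumption: values above 255 cannot belong to an IPv4 address.
        if all(int(part) <= 255 for part in match.groups()):
            found.append(match.group(0))
    return found

print(find_ips("... redirected from 170.143.124.65 blablabla ..."))
# ['170.143.124.65']
```

Something like a version number ("1.2.3") is skipped because it has only two dots, which is the same test described above.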
Have a look at the MSched 7.1 command reference.
(I'm forced to use MSched 6.0 so there could be a more sophisticated solution available from scratch with 7.1 ...)
Check for the context in which the IP appears, e.g.
... redirected from 170.143.124.65 blablabla ...
Sample:
Ernest
Code:
Let>WordBeforeIP=from
Let>WordAfterIP=blablabla
Let>k=0
Label>start
Add>k,1
ReadLn>c:\output\web.log,k,line
If>line=##EOF##,finish
//Get the position of the word in front of the IP
Position>%WordBeforeIP%,%line%,1,x
If>x=0,start
//Get the position of the word which follows the IP
Position>%WordAfterIP%,%line%,1,y
//Skip this line if the word after the IP is missing
If>y=0,start
//Skip past the word plus the space ("from " is 5 characters)
Add>x,5
//Step back over the space in front of the following word
Sub>y,1
//Get the IP length
Sub>y,%x%
//Get the IP
MidStr>line,%x%,%y%,IP
Label>finish
...
- Dorian (MJT support)
Hi Ernest, that's a good idea. It seems both my "sample" logfiles follow the visitor IP address with " - ", like this:
24.248.75.204 - - [05/Sep/2001:17:45:33 -0700]
1999-10-21 00:00:44 129.15.164.66 - W3SVC355 WEB8 208.141.56.223 GET /
... so that would probably work pretty well. I suppose I'd have to "hard wire" for every different logfile type I can find.
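For what it's worth, here is a rough sketch of what "hard wiring" those two sample layouts might look like. It is in Python rather than Macro Scheduler, just to show the idea. The names `DATE_STYLES`, `VISITOR_IP` and `parse_line` are invented for the example, it only knows the two date formats shown above, and it ignores the "-0700" timezone offset.

```python
import re
from datetime import datetime

# One (pattern, strptime format) pair per sample layout. Any other log
# format would need its own entry added here.
DATE_STYLES = [
    (re.compile(r"\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2}"), "%d/%b/%Y:%H:%M:%S"),
    (re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"), "%Y-%m-%d %H:%M:%S"),
]

# A dotted quad immediately followed by " - ", as in both samples.
VISITOR_IP = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3}) - ")

def parse_line(line):
    """Return (visitor_ip, timestamp); either is None if it was not found."""
    ip_match = VISITOR_IP.search(line)
    ip = ip_match.group(1) if ip_match else None
    stamp = None
    for pattern, fmt in DATE_STYLES:
        date_match = pattern.search(line)
        if date_match:
            stamp = datetime.strptime(date_match.group(0), fmt)
            break
    return ip, stamp

print(parse_line("24.248.75.204 - - [05/Sep/2001:17:45:33 -0700]"))
print(parse_line("1999-10-21 00:00:44 129.15.164.66 - W3SVC355 WEB8 208.141.56.223 GET /"))
```

In the second sample the " - " rule picks 129.15.164.66 and skips 208.141.56.223, which matches the visitor address described above.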
- Dorian (MJT support)
Hi Ernest,
I was just thinking, I can probably make a "one macro fits all" solution by searching for the first ".", then checking to make sure the next two dots are each no further than 4 characters away from the one before.
That way, it should find an IP address no matter where it is in the line. I can then find the spaces either side and grab whatever is in between.
Although this is likely to be slower than looking for text either side, I don't think it'll be a problem because it'll only be searching the output from the large logfiles rather than the large logfiles themselves. It'll be searching maybe a few hundred lines at most.
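The dot-scanning idea could look roughly like this. Again it is a Python sketch rather than Macro Scheduler, only to show the steps. `find_ip_by_dots` is a made-up name, and the final "all four parts are digits" check is an extra assumption added so that things like "..." or "a.b.c.d" are not mistaken for an address.

```python
def find_ip_by_dots(line):
    """Look for three dots at most 4 characters apart, then return the
    space-delimited token around them. Returns None if nothing qualifies."""
    first = line.find(".")
    while first != -1:
        second = line.find(".", first + 1)
        third = line.find(".", second + 1) if second != -1 else -1
        close_enough = (
            second != -1 and third != -1
            and second - first <= 4 and third - second <= 4
        )
        if close_enough:
            # Widen to the nearest space (or the line edge) on either side.
            start = line.rfind(" ", 0, first) + 1
            end = line.find(" ", third)
            if end == -1:
                end = len(line)
            token = line[start:end]
            parts = token.split(".")
            # Assumption: a real address has exactly four numeric parts.
            if len(parts) == 4 and all(p.isdigit() and len(p) <= 3 for p in parts):
                return token
        # Otherwise move on to the next dot and try again.
        first = line.find(".", first + 1)
    return None

print(find_ip_by_dots("... redirected from 170.143.124.65 blablabla ..."))
# 170.143.124.65
```

It returns only the first address on a line, which is the visitor IP in the second sample logfile format as well.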
Then I need to work out how to de-dupe the output. I'd say that part will probably be "easy but long and fiddly".
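The de-dupe step may turn out shorter than expected. A minimal Python sketch (not Macro Scheduler), assuming the extracted addresses are already collected in a list. The name `dedupe_keep_order` is hypothetical.

```python
def dedupe_keep_order(items):
    """Drop repeats while keeping the first occurrence of each item in place."""
    seen = set()
    unique = []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique

ips = ["24.248.75.204", "129.15.164.66", "24.248.75.204"]
print(dedupe_keep_order(ips))
# ['24.248.75.204', '129.15.164.66']
```

Remembering what has already been seen means each address is only compared once, so a few hundred lines of output should not be a problem.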