Automating Data Retrieval from Uncontrollable Web Site

JRL
Automation Wizard
Posts: 3226
Joined: Mon Jan 10, 2005 6:22 pm
Location: Iowa

Automating Data Retrieval from Uncontrollable Web Site

Post by JRL » Wed Aug 03, 2016 1:04 pm

I don't do much web automation, primarily because the sites I would automate belong to customers who are fond of perpetually tweaking their web pages. I am once again tasked with automating a particular site that I have automated several times in the past. Not once has a script for this site worked properly for more than a month. I currently have a script that worked in mid-June and now fails on the first page. I'm not really here to whine; I just wanted to provide a little background.

How do you handle automating web pages that might change frequently?

Edit1: Fixed misspelling in subject line.

Marcus Tettmar
Site Admin
Posts: 7003
Joined: Thu Sep 19, 2002 3:00 pm
Location: Dorset, UK

Re: Automating Data Retrieval from Uncontrollable Web Site

Post by Marcus Tettmar » Wed Aug 03, 2016 2:20 pm

About all you can do is make the script fail gracefully and alert you (e.g. by email) that there's a problem. Since you cannot control if, when, or how the page may change, you can only adapt and modify the script when it does. So the important thing is to make sure the script stops and alerts you as soon as something unexpected happens, for example when it fails to find an element.
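
To make the "stop and alert" idea concrete, here is a minimal sketch of the pattern. It is written in Python purely for illustration and is not Macro Scheduler code; the names `StepFailed`, `RunResult`, `run_steps`, and `build_alert` are hypothetical, and each step is assumed to raise `StepFailed` when the page no longer looks the way the script expects.

```python
from dataclasses import dataclass
from email.message import EmailMessage
from typing import Callable, Iterable, Optional, Tuple


class StepFailed(Exception):
    """Raised by a step when the page no longer looks as expected."""


@dataclass
class RunResult:
    ok: bool
    failed_step: Optional[str] = None
    detail: Optional[str] = None


def build_alert(result: RunResult, sender: str, recipient: str) -> EmailMessage:
    """Build (but do not send) an email describing where the run stopped."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Web automation stopped at step: {result.failed_step}"
    msg.set_content(
        "The script stopped early so it would not act on a changed page.\n"
        f"Step: {result.failed_step}\n"
        f"Detail: {result.detail}\n"
    )
    return msg


def run_steps(
    steps: Iterable[Tuple[str, Callable[[], None]]],
    alert: Callable[[RunResult], None],
) -> RunResult:
    """Run steps in order; on the first failure, alert once and stop."""
    for name, step in steps:
        try:
            step()
        except StepFailed as exc:
            result = RunResult(ok=False, failed_step=name, detail=str(exc))
            alert(result)
            return result  # later steps are never attempted
    return RunResult(ok=True)
```

The `alert` callback could, for instance, pass the message from `build_alert` to `smtplib.SMTP(...).send_message(...)`. The point of the design is that the first step that cannot find what it expects ends the run, rather than letting later steps type into the wrong fields.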
Marcus Tettmar
http://mjtnet.com/blog/ | http://twitter.com/marcustettmar

Re: Automating Data Retrieval from Uncontrollable Web Site

Post by JRL » Wed Aug 03, 2016 3:02 pm

Marcus,
Thanks for the advice. Not what I wanted to hear but it still might be the best approach.

CyberCitizen
Automation Wizard
Posts: 718
Joined: Sun Jun 20, 2004 7:06 am
Location: Adelaide, South Australia

Re: Automating Data Retrieval from Uncontrollable Web Site

Post by CyberCitizen » Fri Oct 28, 2016 2:09 am

Tell them to stop changing their pages, and charge them each time you have to modify your code.
FIREFIGHTER

Re: Automating Data Retrieval from Uncontrollable Web Site

Post by JRL » Wed Mar 01, 2017 5:22 pm

Cy,
That's funny. We're a $30 million company reliant on their business to survive. They are a $30 billion company. We don't tell them anything.

Had a pretty good run. The script worked for almost 6 months before they made a change to their website, AND most of the script still works. The biggest issue is the login. I have resolved it and I'm here to share the fix. Apparently they are detecting timing and denying access if the user name and password are entered too quickly. I've replaced the lines:

Code:

IEFormFill>%IE[0]%,{""},{"LoginForm"},{"userID"},{"username"},0,ie_res
IEFormFill>%IE[0]%,{""},{"LoginForm"},{"password"},pass,0,ie_res
IEFormFill>%IE[0]%,{""},{"LoginForm"},{"password"},{"userpassword"},0,ie_res

with the lines:

Code:

// Slow the keystrokes down (SK_DELAY is in milliseconds per character)
Let>SK_DELAY=200
Send>username
Press Tab
// Pause briefly before typing the password
Wait>0.5
Send>userpassword
// Turn the keystroke delay back off
Let>SK_DELAY=0

Not only did the IEFormFill> lines fail to log in, the site also rescinded the password. After a single attempt at the auto-login, the password would no longer work even when logging in manually, and a new password had to be requested. It only took 3 new passwords to figure this out. I'm assuming this is a system they've implemented in lieu of a CAPTCHA system... which will of course be next.

Re: Automating Data Retrieval from Uncontrollable Web Site

Post by CyberCitizen » Wed Mar 01, 2017 10:04 pm

Yeah, at least you're around it now. I have seen some weird sites that don't like the username and password being submitted via form fills; it's as if the site doesn't register that the fields have been set. I have had to revert to a similar method before.

Hopefully they don't go down the Google CAPTCHA route, as that is hard to automate.
FIREFIGHTER
