Trying to write a better Scraping Macro...

rjw524
Pro Scripter
Posts: 104
Joined: Wed May 09, 2012 9:45 pm
Location: Michigan

Trying to write a better Scraping Macro...

Post by rjw524 » Tue Sep 28, 2021 5:28 pm

Hello,

I'm a beginner-level user and I wrote a script to extract certain job listing data from Indeed.

Basically, it's a copy-and-paste macro that copies the full page of data from Indeed and pastes it into Excel.

Then, in Excel (where I'm a much stronger user), I have a workbook that uses a number of string functions to tease out the data I need.

Basically, all I need is:
-Job Title,
-Hiring Company,
-Location,
-Age of the Posting

For a number of reasons, this is fairly inefficient, especially when the site's layout changes.

I figured Macro Scheduler HAS to have the functionality for a simple extraction of this type, but I haven't been able to find what I need in the forums. (I'm sure it's there. More than likely the shortcoming is on my end, and I just don't know enough about what to look for.)

But if anyone can help me get started or point me to some examples to help get started, I'd really appreciate it.

Here's a link to a typical search I would run on Indeed, for anyone willing to show me an example:

https://www.indeed.com/jobs?as_and&as_p ... faf71e7e8b


Thanks a lot, any help or direction would be appreciated...

RJ

Grovkillen
Automation Wizard
Posts: 1128
Joined: Fri Aug 10, 2012 2:38 pm
Location: Bräcke, Sweden

Re: Trying to write a better Scraping Macro...

Post by Grovkillen » Tue Sep 28, 2021 7:41 pm

For scraping I tend to go for this sequence: "open developer console" - "paste some JavaScript code" - "the script fetches all the elements" - "parse that data" - "save it as a JSON file locally", and from there I use the MS JSON parser. This tends to be fairly unaffected by design changes, since sites rarely change the backbone and only change the layout.
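
A minimal sketch of what that console step could look like, not a tested Indeed scraper. The function names `collectJobs` and `jobsAsJson` are made up for illustration, and the class names (`job_seen_beacon`, `jobTitle`, `companyName`, `companyLocation`, `date`) are assumptions taken from the markup quoted further down in this thread; Indeed may well have changed them since.

```javascript
// Sketch: build one record per job card, then serialize the list to JSON.
// `root` is anything with querySelectorAll; in the browser console pass `document`.
// All selectors are assumptions based on the markup quoted in this thread.
function collectJobs(root) {
  // Missing elements become empty strings instead of throwing.
  const text = (node) => (node ? node.textContent.trim() : "");
  return Array.from(root.querySelectorAll(".job_seen_beacon")).map((card) => ({
    title: text(card.querySelector(".jobTitle span[title]")),
    company: text(card.querySelector(".companyName")),
    location: text(card.querySelector(".companyLocation")),
    age: text(card.querySelector("span.date")),
  }));
}

// Pretty-printed JSON, ready to be saved to a local file.
function jobsAsJson(root) {
  return JSON.stringify(collectJobs(root), null, 2);
}
```

In Chrome's developer console, `copy(jobsAsJson(document))` should put the JSON on the clipboard so it can be pasted into a local file for the JSON parser to read.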
Let>ME=%Script%

Running: 15.0.27
version history


Re: Trying to write a better Scraping Macro...

Post by Grovkillen » Tue Sep 28, 2021 8:12 pm

Just an example: the red sections in this source seem to have the info you want.

<td><div class="heading6 tapItem-gutter result-footer"><div class="job-snippet"><ul style="list-style-type:circle;margin-top: 0px;margin-bottom: 0px;padding-left:20px;">
<li style="margin-bottom:0px;">5-7 years of relevant experience required (accounting <b>or</b> finance degree).</li>
<li>Drive continuous improvements in the company’s <b>accounting</b> processes.</li>
</ul></div><span class="date">30+ days ago</span><span class="result-link-bar-separator">·</span><button type="button" class="sl resultLink more_links_button" aria-expanded="false">More...</button></div><div class="tab-container"><div class="more-links-container result-tab" role="presentation"><div class="more_links"><button type="button" class="close-button" title="Close"></button><ul><li><span class="mat">View all <a href="/q-Cloudcannabis-l-Troy,-MI-jobs.html">Cloudcannabis jobs in Troy, MI</a> - <a href="/l-Troy,-MI-jobs.html">Troy jobs</a></span></li><li><span class="mat">Salary Search: <a href="/career/accountant/salaries/48083--MI?campaignid=serp-more&amp;fromjk=33d50f0645ff2c78&amp;from=serp-more">Bookkeeping/Accountant salaries in Troy, MI</a></span></li></ul></div></div></div></td>

Re: Trying to write a better Scraping Macro...

Post by Grovkillen » Tue Sep 28, 2021 8:28 pm

And here's another example:

<div class="slider_container"><div class="slider_list"><div class="slider_item"><div class="job_seen_beacon"><table class="jobCard_mainContent" cellpadding="0" cellspacing="0" role="presentation"><tbody><tr><td class="resultContent"><div class="heading4 color-text-primary singleLineTitle tapItem-gutter"><h2 class="jobTitle jobTitle-color-purple"><span title="General Accountant">General Accountant</span></h2></div><div class="heading6 company_location tapItem-gutter"><pre><span class="companyName"><a data-tn-element="companyName" class="turnstileLink companyOverviewLink" target="_blank" href="/cmp/Marriott-International,-Inc." rel="noopener">Marriott International, Inc</a></span><span class="ratingsDisplay withRatingLink"><a data-tn-variant="cmplinktst2" class="ratingLink" target="_blank" href="/cmp/Marriott-International,-Inc./reviews" title="Marriott International, Inc reviews" aria-label="Company rating 4.1 out of 5 stars" rel="noopener"><span class="ratingNumber" aria-label="4.1 of stars rating" role="img"><span aria-hidden="true">4.1</span><svg width="12" height="12" class="starIcon" aria-hidden="true" viewBox="0 0 16 16" fill="none" xmlns="http://www.w3.org/2000/svg"><path d="M8 12.8709L12.4542 15.5593C12.7807 15.7563 13.1835 15.4636 13.0968 15.0922L11.9148 10.0254L15.8505 6.61581C16.1388 6.36608 15.9847 5.89257 15.6047 5.86033L10.423 5.42072L8.39696 0.640342C8.24839 0.289808 7.7516 0.289808 7.60303 0.640341L5.57696 5.42072L0.395297 5.86033C0.015274 5.89257 -0.13882 6.36608 0.149443 6.61581L4.0852 10.0254L2.90318 15.0922C2.81653 15.4636 3.21932 15.7563 3.54584 15.5593L8 12.8709Z" fill="#767676"></path></svg></span></a></span><div class="companyLocation">Detroit, MI<!-- --> <span class="companyLocation--extras">(<!-- -->Downtown area<!-- -->)</span></div></pre></div><div class="heading6 error-text tapItem-gutter"></div></td></tr></tbody></table><table class="jobCardShelfContainer" role="presentation"><tbody><tr class="jobCardShelf"></tr><tr 
class="underShelfFooter"><td><div class="heading6 tapItem-gutter result-footer"><div class="job-snippet"><ul style="list-style-type:circle;margin-top: 0px;margin-bottom: 0px;padding-left:20px;">
<li>Prepare, maintain, audit, and distribute statistical, financial, <b>accounting</b>, auditing, <b>or</b> payroll reports and tables.</li>
</ul></div><span class="date">13 days ago</span><span class="result-link-bar-separator">·</span><button type="button" class="sl resultLink more_links_button" aria-expanded="false">More...</button></div><div class="tab-container"><div class="more-links-container result-tab" role="presentation"><div class="more_links"><button type="button" class="close-button" title="Close"></button><ul><li><span class="mat">View all <a href="/jobs?q=Marriott+International,+Inc&amp;l=Detroit,+MI&amp;nc=jasx">Marriott International, Inc jobs in Detroit, MI</a> - <a href="/l-Detroit,-MI-jobs.html">Detroit jobs</a></span></li><li><span class="mat">Salary Search: <a href="/career/general-accountant/salaries/Detroit--MI?campaignid=serp-more&amp;fromjk=5dd16941378d7127&amp;from=serp-more">General Accountant salaries in Detroit, MI</a></span></li><li><span class="mat">See popular <a href="/cmp/Marriott-International,-Inc./faq">questions &amp; answers about Marriott International, Inc</a></span></li></ul></div></div></div></td></tr></tbody></table><div aria-live="polite"></div></div></div><div class="slider_sub_item"></div></div></div>
Last edited by Grovkillen on Tue Sep 28, 2021 8:34 pm, edited 1 time in total.
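
If the page source is already available as plain text (saved to a file, for instance), the four fields can also be pulled out of a card like the one above with a few patterns. This is only a sketch under that assumption: `stripTags`, `firstMatch` and `extractCard` are hypothetical helper names, the patterns only fit the class names in the quoted markup, and regexes on HTML are brittle compared with querying the DOM. HTML entities such as `&amp;` are left undecoded.

```javascript
// Sketch: extract title, company, location and age from one card's HTML source.
// Patterns match the class names in the markup quoted above (an assumption
// about the current site, which may have changed).
function stripTags(fragment) {
  return fragment
    .replace(/<!--[\s\S]*?-->/g, "") // drop HTML comments such as <!-- -->
    .replace(/<[^>]*>/g, "")         // drop tags, keep their text
    .replace(/\s+/g, " ")            // collapse runs of whitespace
    .trim();
}

// Text of the first capture group with tags removed, or "" when nothing matches.
function firstMatch(html, pattern) {
  const m = html.match(pattern);
  return m ? stripTags(m[1]) : "";
}

function extractCard(html) {
  return {
    title: firstMatch(html, /<h2 class="jobTitle[^"]*">([\s\S]*?)<\/h2>/),
    company: firstMatch(html, /<span class="companyName">([\s\S]*?)<\/span>/),
    location: firstMatch(html, /<div class="companyLocation">([\s\S]*?)<\/div>/),
    age: firstMatch(html, /<span class="date">([\s\S]*?)<\/span>/),
  };
}
```

Run against the Marriott card, this should pick up the "General Accountant" title and the "13 days ago" span; any field whose pattern finds nothing comes back as an empty string rather than an error.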

Re: Trying to write a better Scraping Macro...

Post by Grovkillen » Tue Sep 28, 2021 8:30 pm

As you can see, there's a span with the class "date", and those a hrefs point to a "jobs" and a "career" URL respectively. Or perhaps look for "jobs in" vs "salaries in" vs "days ago" in the inner text?

That's where I'd start.
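
The inner-text idea could be sketched roughly like this. `classifyText` is a made-up name, and the three patterns only cover the wordings that appear in the two examples in this thread ("N days ago", "... jobs in ...", "... salaries in ..."); other wordings the site might use for posting age are not handled.

```javascript
// Sketch: decide what a piece of inner text tells us, based on its wording.
// Returns null when the text matches none of the three known patterns.
function classifyText(text) {
  const t = text.trim();
  let m;
  // "13 days ago" or "30+ days ago"
  if ((m = t.match(/^(\d+)(\+)?\s+days?\s+ago$/i))) {
    return { kind: "age", days: Number(m[1]), orMore: m[2] === "+" };
  }
  // "<company> jobs in <location>"
  if ((m = t.match(/^(.+)\s+jobs in\s+(.+)$/))) {
    return { kind: "company", company: m[1], location: m[2] };
  }
  // "<title> salaries in <location>"
  if ((m = t.match(/^(.+)\s+salaries in\s+(.+)$/))) {
    return { kind: "title", title: m[1], location: m[2] };
  }
  return null;
}
```

Note that the "salaries in" link gives a salary category, which in the first example ("Bookkeeping/Accountant") is not necessarily the exact job title, so treat that one as a fallback.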

Re: Trying to write a better Scraping Macro...

Post by rjw524 » Tue Oct 05, 2021 2:37 pm

Hey Grovkillen,

Thank you SO much for this! I really appreciate the detailed example you laid out.

I'm going to give this a shot.

RJ
