How to Use Web Scraper Chrome Extension to Extract Data

April 4, 2016
Category: Blog, Web Scraping

This post is about DIY web scraping tools. If you are looking for a fully customizable web scraping solution, you can add your project on CrawlBoard.

Web scraping is becoming a vital ingredient in business and marketing planning regardless of the industry. There are several ways to crawl the web for useful data depending on your requirements and budget. Did you know that your favourite web browser could also act as a great web scraping tool?

You can install the Web Scraper extension from the Chrome Web Store to turn your browser into an easy-to-use data scraping tool. The best part is that you can stay in the comfort zone of your browser while the scraping happens. It doesn’t demand much technical skill, which makes it a good option when you need to do some quick data scraping. Let’s get started with the tutorial on how to use the Web Scraper Chrome extension to extract data.

About the Web Scraper Chrome Extension

Web Scraper is a data extraction extension for the Chrome browser, made exclusively for web data scraping. You set up a plan (a sitemap) that describes how to navigate a website and which data to extract. The scraper then traverses the website according to that plan and extracts the relevant data, which you can export to CSV. The tool can scrape multiple pages, which makes it even more powerful, and it can even extract data from dynamic pages that use JavaScript and Ajax.

What You Need

  • Google Chrome browser
  • A working internet connection

A. Installation and setup

  • Open the Web Scraper extension’s page in the Chrome Web Store
  • Click on “Add” to download and install the extension

Once this is done, you are ready to start scraping any website using your Chrome browser. You just need to learn how to perform the scraping, which we are about to explain.

B. The Method

After installation, open the Google Chrome developer tools by pressing F12 (alternatively, right-click on the page and select ‘Inspect’). In the developer tools, you will find a new tab named ‘Web Scraper’ as shown in the screenshot below.

Now let’s see how to use this on a live web page. We will use a site called www.awesomegifs.com for this tutorial. This site contains GIF images, and we will crawl the URLs of these images using our web scraper.

Step 1: Creating a Sitemap

  • Go to https://www.awesomegifs.com/
  • Open developer tools by right-clicking anywhere on the screen and then selecting ‘Inspect’
  • Click on the ‘Web Scraper’ tab in developer tools
  • Click on ‘create new sitemap’ and then select ‘create sitemap’
  • Give the sitemap a name and enter the URL of the site in the start URL field.
  • Click on ‘Create Sitemap’
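
For readers who like to see the plan as data: the extension can export and import a sitemap as JSON. The sketch below builds a structure of roughly that shape in Python. The field names (`_id`, `startUrl`, `selectors`) reflect our understanding of the export format and should be treated as assumptions, and the sitemap name `awesomegifs` is only an illustration.

```python
import json


def make_sitemap(name, start_url):
    """Return a dict describing a sitemap that has no selectors yet."""
    return {
        "_id": name,              # sitemap name (assumed field name)
        "startUrl": [start_url],  # where the crawl begins (assumed field name)
        "selectors": [],          # selectors are added in Step 2
    }


sitemap = make_sitemap("awesomegifs", "https://www.awesomegifs.com/")
print(json.dumps(sitemap, indent=2))
```

Nothing here talks to the extension; it only shows that a sitemap is a small, readable description of where to start and what to extract.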

To crawl multiple pages from a website, we need to understand the pagination structure of that site. You can easily do that by clicking the ‘Next’ button a few times from the homepage. Doing this on Awesomegifs.com revealed that the pages are structured as https://awesomegifs.com/page/1/, https://awesomegifs.com/page/2/, and so on. To switch to a different page, you only have to change the number at the end of this URL. Now, we need the scraper to do this automatically.
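
The "change the number at the end" idea can be sketched in a few lines of Python. The helper name `next_page_url` is made up for this illustration, and the code assumes the `/page/<number>/` layout described above.

```python
import re


def next_page_url(url):
    """Return the URL of the following page for a URL ending in /page/<n>/."""
    match = re.search(r"/page/(\d+)/?$", url)
    if match is None:
        raise ValueError(f"no page number found in {url!r}")
    number = int(match.group(1)) + 1
    # Keep everything before "/page/<n>/" and append the incremented number.
    return f"{url[:match.start()]}/page/{number}/"


print(next_page_url("https://awesomegifs.com/page/1/"))
# https://awesomegifs.com/page/2/
```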

To do this, create a new sitemap with the start URL as https://awesomegifs.com/page/[001-125]. The scraper will now open the URL repeatedly while incrementing the final value each time. This means the scraper will open pages starting from 1 to 125 and crawl the elements that we require from each page.
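
To see which URLs a range like this stands for, here is a minimal Python sketch that expands a single `[start-end]` range. It is our own illustration, not the extension’s code. It assumes that leading zeros in the start value ask for zero padding to that width, so `[001-125]` yields `001`, `002`, and so on. If a site does not accept padded numbers, an unpadded range such as `[1-125]` may be worth trying.

```python
import re

RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_range(pattern):
    """Expand one [start-end] range in a URL pattern into a list of URLs."""
    match = RANGE.search(pattern)
    if match is None:
        return [pattern]
    start_text, end_text = match.groups()
    # Assumption: a leading zero in the start value requests zero padding.
    width = len(start_text) if start_text.startswith("0") else 0
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    return [
        prefix + str(n).zfill(width) + suffix
        for n in range(int(start_text), int(end_text) + 1)
    ]


urls = expand_range("https://awesomegifs.com/page/[001-125]")
print(len(urls), urls[0], urls[-1])
# 125 https://awesomegifs.com/page/001 https://awesomegifs.com/page/125
```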

Step 2: Scraping Elements

Every time the scraper opens a page from the site, we need to extract some elements. In this case, it’s the GIF image URLs. First, you have to find the CSS selector matching the images. You can find it by looking at the source of the web page (CTRL+U), but an easier way is to use the selector tool to click and select any element on the screen:

  • Click on the sitemap that you just created, then click on ‘Add new selector’
  • In the selector id field, give the selector a name
  • In the type field, select the type of data that you want to extract
  • Click on the ‘Select’ button and click on any element on the web page that you want to extract
  • When you are done selecting, click on ‘Done selecting’
  • Check the ‘multiple’ checkbox if the element can appear multiple times on the page and you want each instance of it to be scraped

It’s as easy as clicking on an icon with the mouse.
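
To make the idea of an image selector with ‘multiple’ checked more concrete, here is a rough Python sketch that collects the `src` of every `<img>` tag in a piece of HTML using only the standard library. The HTML fragment and the example.com URLs are hypothetical, and the real markup on the site may differ. The extension itself works with CSS selectors inside the browser, so this is only an analogy.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag that is fed in."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


# Hypothetical page fragment with two matching elements.
SAMPLE_HTML = """
<article><img src="https://example.com/a.gif" alt="first"></article>
<article><img src="https://example.com/b.gif" alt="second"></article>
"""

collector = ImageCollector()
collector.feed(SAMPLE_HTML)
print(collector.sources)
# ['https://example.com/a.gif', 'https://example.com/b.gif']
```

Without the equivalent of ‘multiple’, you would keep only the first match instead of the whole list.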

Now you can save the selector if everything looks good. To start the scraping process, just click on the sitemap tab and select ‘Scrape’. A new window will pop up, visit each page in the loop and crawl the required data. If you want to stop the scraping process midway, just close this window and you will keep the data that was extracted up to that point.
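
Conceptually, the scraping window runs a loop like the one sketched below: visit each page, apply the selector, keep the results, and hand back whatever has been collected if the run is stopped early. This is not the extension’s actual code. The `fetch` and `extract` callables and the `FAKE_PAGES` data are hypothetical stand-ins for a real download and a real selector.

```python
def scrape(urls, fetch, extract, should_stop=lambda: False):
    """Visit each URL in order and collect one row per extracted value.

    If should_stop() returns True, the loop ends and the rows collected
    so far are returned.
    """
    rows = []
    for url in urls:
        if should_stop():
            break
        page = fetch(url)
        for value in extract(page):
            rows.append({"page_url": url, "image_url": value})
    return rows


# Stand-in "pages": each URL maps straight to the values a selector would find.
FAKE_PAGES = {
    "https://example.com/page/1/": ["a.gif", "b.gif"],
    "https://example.com/page/2/": ["c.gif"],
}

rows = scrape(FAKE_PAGES, fetch=FAKE_PAGES.get, extract=list)

# Simulate closing the window after the first page has been visited.
visited = []


def counting_fetch(url):
    visited.append(url)
    return FAKE_PAGES[url]


partial = scrape(FAKE_PAGES, counting_fetch, list,
                 should_stop=lambda: len(visited) >= 1)
print(len(rows), len(partial))
# 3 2
```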

Once you stop scraping, go to the sitemap tab to browse the extracted data or export it to a CSV file. The only downside of a tool like this is that you have to run the scrape manually every time, since it doesn’t have many automation features built in.
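
If you later want to produce or post-process such a file yourself, the CSV step is straightforward with Python’s standard library. The sketch below turns a list of scraped rows into CSV text. The column names `page_url` and `image_url` are hypothetical, and the columns in the extension’s own export may be named differently.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialise a list of dict rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample_rows = [
    {"page_url": "https://example.com/page/1/", "image_url": "a.gif"},
    {"page_url": "https://example.com/page/1/", "image_url": "b.gif"},
]
csv_text = rows_to_csv(sample_rows, ["page_url", "image_url"])
print(csv_text)
```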

If you want to crawl data on a large scale, it is better to go with a data scraping service than with a free data extraction tool like the Web Scraper Chrome extension. In the second part of this series, we will show you how to make a MySQL database using the extracted data. Stay tuned for that!

29 thoughts on “How to Use Web Scraper Chrome Extension to Extract Data”
  • David

    You do realize that your screen shots are impossible to read right?

    • Jacob Koshy

      Hi David, thank you for bringing this to our attention! We have fixed the screenshots and they’re legible now.

  • Nick

    Awesome, thank you! Cleared some things up for me. Now I can crawl a wholesaler account!

  • John

    Hi, how do I crawl data from Google Maps using the Web Scraper Chrome extension? Things like address, phone number, website URL, etc.
    I’m really struggling with this. I need your help.

  • Artem

    Great feature for pagination! Is it possible to set up a pagination step, though? Like if the pagination URL changes by 10 items rather than by 1? Something like “url[001-200;10]”?

  • Mukul Raman

    Hi,

    Do you have any video tutorial on this?

  • John

    Please help. I’m trying to crawl yellow pages data. I found a list of 64 pages of stores. I added a selector for business name, address and phone number. I right clicked each field for inspect/copy/copy selector for the name, address, and phone number. I scraped the URL changing only the end to read pages/[001-064]. I clicked crawl and to my surprise the only data scraped was for the page 001. I clicked the multiple tab in each selector field (for name, address and phone). Why did I only get data for the first page? Should the crawl tool know that I wanted the same data for each company (30 per page) for all 64 pages? Thanks in advance.

    • Jacob Koshy

      Hi John, DIY web scraping tools such as this are usually meant to handle simple websites that use traditional navigation systems and coding practices. It appears that the site you are trying to crawl is a bit too complex for this DIY tool. Unfortunately, since these tools are not customizable, you won’t be able to do anything about this. It’s recommended to go with a dedicated web scraping service like ours if you want to overcome the limitations of scraper tools and get uninterrupted data.

    • Darren

      Hi John, try this fix (it worked for me)
      Go to “edit Metadata” and add the url of each page of the search results to the starting url list. It’s a bit messy, but it did work.

  • akanksha rashmi

    Hi Jacob, I need to crawl a site which requires logging in. I then need to navigate to another link on the same page and crawl that page. Can you please help?

    • Jacob Koshy

      Hi Akanksha, that’s surely possible, but not with a DIY tool like the one we’ve discussed above. You can reach out via sales@promptcloud.com and our team will assist you with the requirement.

  • Adriza Deo

    Hi Jacob,

    I want to crawl the data of LinkedIn profile members, for example multiple fields like name, title, location, company name, profile URL, etc. I also want the pagination to happen automatically, as I want the data from page 1 to 24. But the challenge is that instead of selecting multiple fields, the tool is considering only one field, the one selected last, and it’s not moving to the other pages either. Under the “Start URL” option, I am pasting the link and then adding [001-024] so that it moves to the other pages while scraping. But the pagination is not taking place. Can you help me with it? Thank you.

  • Akhil AR

    Hi!

    I need to crawl job descriptions from linkedin using web scraper plugin. I can crawl data from profiles but I am not able to crawl the job postings. Can you suggest something?

    • Jacob Koshy

      Hi Akhil, you can try our newly launched job feeds solution for getting the job postings directly from company websites. You can check it out here: https://www.jobspikr.com

  • Krishna

    Hi. How to crawl data on infinite-scrolling pages using this tool?

    • Jacob Koshy

      Hi Krishna, infinite scrolling uses JavaScript and it is unlikely that you’ll find a web scraping tool that can handle pages like that. It’s recommended to go with a managed web scraping service in such cases.

    • yih

      Element Scroll Down selector

  • vee

    Can we crawl data from popups or dialog boxes which are not yet loaded using this?
    Something like in ParseHub, where we can create actions to dynamically open the popup and get data from it.

  • Vinni

    How to crawl phone numbers from Craigslist?

  • scholar

    I want to crawl shopping websites like Amazon and Flipkart. Will scraping with the Web Scraper extension cause any legal problems?

    • Preetish Panda

      Please ensure that you’re following the guidelines set by the robots.txt file and adhering to the terms of service while using the data.

  • Sophie

    Hi, I’m trying to crawl text off 5 links on a webpage. It is working (this was super helpful 🙂 ) however only for the last link. How do I get it to work?!
    Many thanks 🙂

  • Sophie

    Never mind I’ve got it to work thank you so much xx

  • Anas Ansari

    Hi, can anybody help me scrape multiple full-size images from an e-commerce website? I am unable to scrape all the images along with the data. I am only getting thumbnails or just one full-size image. Please help.

    • Preetish Panda

      Hi Anas,

      You can reach out to our team (sales@promptcloud.com) with details of your requirement, i.e., sites to be crawled, frequency and data fields. We’d be happy to do a feasibility study and extend a free quote.

  • Daniel Segun

    It worked. Nice series… I was able to crawl my own website.

© Promptcloud 2009-2020 / All rights reserved.