The client wanted a product scraper to extract data from more than a hundred fashion sites, including brands such as GAP, Macy's and Nordstrom. The required data points were the product details, along with every available variant of each product, such as its different colors and sizes.
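To make the "product plus variants" requirement concrete, here is a minimal sketch of how one scraped record might be modeled. The class names, field names, and sample values are illustrative assumptions, not the actual schema used in this project.

```python
from dataclasses import dataclass, field
from itertools import product as cartesian


@dataclass
class Variant:
    # One purchasable combination of a product (hypothetical fields).
    color: str
    size: str
    price: float


@dataclass
class ProductRecord:
    # Product-level data shared by all of its variants (hypothetical fields).
    url: str
    title: str
    brand: str
    variants: list[Variant] = field(default_factory=list)


def expand_variants(colors, sizes, price):
    """Build one Variant per color/size combination."""
    return [Variant(color=c, size=s, price=price) for c, s in cartesian(colors, sizes)]


# Placeholder values for illustration only.
record = ProductRecord(
    url="https://example.com/p/123",
    title="Slim Fit Jeans",
    brand="ExampleBrand",
    variants=expand_variants(["Indigo", "Black"], ["30", "32", "34"], 59.95),
)
print(len(record.variants))  # 2 colors x 3 sizes = 6 variants
```

Real sites rarely offer every color in every size, so a production crawler would more likely read the variant list from each product page than generate combinations this way.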
The client provided us with the list of source websites to be crawled and the data points required. Extraction was scheduled to run daily.
Our team set up crawlers to fetch the required data fields from the source sites. This use case falls under our site crawl offering, since the source websites varied in format and design.
The client needed the extracted data delivered in CSV format and uploaded to their S3 storage. The initial setup was completed in a few days, and the crawlers started delivering data immediately.
About 200,000 records were delivered to the client during the first crawl.
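As a rough illustration of the CSV delivery described above, the following sketch flattens products into one CSV row per variant using only the Python standard library. The column names and sample data are assumptions made for this example; the real delivery schema is not specified here, and the upload step is omitted.

```python
import csv
import io

# Hypothetical column layout: one row per product variant.
FIELDS = ["url", "title", "color", "size", "price"]


def to_csv(products):
    """Serialize products to CSV text, repeating product fields on each variant row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for item in products:
        for variant in item["variants"]:
            writer.writerow({"url": item["url"], "title": item["title"], **variant})
    return buffer.getvalue()


# Placeholder data for illustration only.
sample = [
    {
        "url": "https://example.com/p/123",
        "title": "Slim Fit Jeans",
        "variants": [
            {"color": "Indigo", "size": "30", "price": 59.95},
            {"color": "Black", "size": "32", "price": 59.95},
        ],
    }
]

csv_text = to_csv(sample)
print(csv_text)
```

Writing one row per variant keeps the file flat and easy to load into a spreadsheet or database, at the cost of repeating the product-level fields on every row.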