The client, from the research and analytics industry, was looking for a continuous feed of clean, high-quality eCommerce data to power their research and analytics.
They needed easy access to complete product listing data from specific categories, along with all product specifications and pricing. The client previously had an internal data team that gathered data manually from various web sources, but the results were limited and the effort was high.
Beyond the intensive manual effort, structuring the eCommerce data while importing it into their database was a challenge. They needed clean data in their own format so that it could be uploaded easily into their internal database, where it would feed the comparison engine and other monitoring activities.
The client provided us with the list of sources to be crawled, the data points required, and the frequency of data extraction, which was set to daily.
Our team set up crawlers to fetch the required eCommerce data fields from the specified source sites. This use case falls under our site crawling offering, since the source websites differed in structure and design.
The client needed the extracted data delivered in CSV format and uploaded to their S3 storage. The initial setup was completed within a few days, and the crawlers started delivering the required data immediately.
About 200K records were delivered to the client during the first crawl.
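As an illustration of the delivery step described above, the following is a minimal Python sketch of how crawled records might be mapped onto a fixed column layout, written out as CSV, and pushed to S3. It is not the pipeline actually used for this client. The column names, the `normalize`, `to_csv`, and `upload_to_s3` helpers, and the bucket and key arguments are all hypothetical, since the client's real schema and storage layout are not described here. The upload helper assumes the third-party `boto3` package and valid AWS credentials.

```python
import csv
import io

# Hypothetical target layout; the client's real schema is not given in this case study.
COLUMNS = ["source", "category", "product_name", "price", "currency", "specifications"]


def normalize(record):
    """Map one raw crawled record (a dict) onto the fixed column layout."""
    specs = record.get("specifications") or {}
    price = record.get("price")
    return {
        "source": (record.get("source") or "").strip(),
        "category": (record.get("category") or "").strip(),
        "product_name": (record.get("product_name") or "").strip(),
        # Prices are rendered with two decimals; missing prices stay empty.
        "price": f"{float(price):.2f}" if price not in (None, "") else "",
        "currency": (record.get("currency") or "").strip().upper(),
        # Flatten the specification dict into one cell, sorted for stable output.
        "specifications": "; ".join(f"{k}={v}" for k, v in sorted(specs.items())),
    }


def to_csv(records):
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(normalize(record))
    return buffer.getvalue()


def upload_to_s3(csv_text, bucket, key):
    """Upload CSV text to S3 (assumes boto3 is installed and credentials are configured)."""
    import boto3  # third-party; imported lazily so the CSV helpers work without it

    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=csv_text.encode("utf-8"))
```

In a daily run, a scheduler could call `to_csv` on that day's crawl output and pass the result to `upload_to_s3` with a date-stamped key, so that each delivery lands as a separate file in the client's bucket.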