The client wanted to extract credit card offers and other promotional information from bank websites to fuel their comparison engine. Since the target sites used dynamic and complex page structures, the crawling project demanded an extensive infrastructure with high-end resources. The client lacked the technical know-how to handle this and wanted a fully managed service that could take end-to-end ownership of the process. The data was to be extracted on a weekly basis and delivered in a clean, ready-to-use format.
The client shared the specifics of their requirements: the target sites, the crawling frequency, the preferred data delivery format and the data fields they wanted extracted from the sites. This use case falls under our site-specific crawl offering, since the websites on the list differed in structure and design. The client wanted the extracted data delivered to their Dropbox account in JSON format.
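As an illustration of the deliverable, a single extracted record might look like the following. The field names, values, bank name and URL here are all hypothetical; the actual schema was defined by the client's requirements.

```python
import json

# Hypothetical example of one extracted credit card offer record.
# Field names and values are illustrative only; the real schema was
# agreed with the client during requirements gathering.
record = {
    "bank": "Example Bank",
    "offer_title": "2% cashback on travel purchases",
    "annual_fee": 95,
    "apr_range": "17.99%-25.99%",
    "source_url": "https://www.examplebank.com/credit-cards/travel",
}

# Each weekly crawl produced JSON files containing batches of such records.
payload = json.dumps([record], indent=2)
```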
We set up the crawlers for the target sites in just 3 days, and the initial set of data files was delivered to the client. About 10,000 records were delivered during the first crawl.
1. The client got easy access to fresh offer data from popular bank sites.
2. All the complicated processes associated with crawling were taken care of by our team.
3. Setup took only 3 days, and the data flow remained consistent thereafter.
4. We set up automated and manual layers of monitoring to be notified of target site changes.
5. Our extensive tech stack could handle the high-scale demands effortlessly.
6. The client could power their comparison engine with the delivered data without any further processing.
7. Since our system sends out notifications when new data is extracted, the client could import new files into their system only when new data was available.
8. The client achieved cost savings of about 60% by not having to set up an in-house crawling team.
9. With our low turnaround time, the client had surplus time to make calculated, data-driven moves.
10. After the initial setup phase, the whole process ran automatically, with no disruptions in service.
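The automated monitoring and change-notification behavior described above can be sketched roughly as a content-fingerprint comparison: each crawl hashes the extracted data and compares it against the previous crawl's hash, raising a notification only when something has changed. This is a minimal illustration under assumed names (`fingerprint`, `detect_change`), not the production implementation.

```python
import hashlib
import json

def fingerprint(records):
    """Stable hash of a batch of extracted records."""
    canonical = json.dumps(records, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def detect_change(previous_hash, records):
    """Return (changed, new_hash); a notification would be sent only on change."""
    new_hash = fingerprint(records)
    return new_hash != previous_hash, new_hash

# Week 1 establishes the baseline; an identical week 2 crawl triggers
# no notification, while changed offers do.
week1 = [{"bank": "Example Bank", "offer": "2% cashback"}]
changed, baseline = detect_change(None, week1)   # first crawl always differs
unchanged, _ = detect_change(baseline, week1)    # identical data: no change
```

Keying the comparison on the extracted fields rather than the raw HTML avoids false alarms from cosmetic page changes that do not affect the offer data.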