Service type: Site-specific crawl and extraction
Challenge: The client wanted to power their pricing intelligence system with price data extracted from competitor websites. Since the target websites used complex, dynamic page structures and carried hundreds of thousands of hotel listings, the scraping project demanded extensive infrastructure and high-end resources. The client didn’t have the technical know-how to handle this and wanted a fully managed service that could take end-to-end ownership of the process. Another key requirement was that the data had to be extracted at a frequency as high as twice a day, which is again resource-intensive.
The Solution: The client shared their detailed requirements, including the target sites, the crawling frequency, their preferred data delivery format and the data points they wanted to extract from these sites. This use case falls under our site-specific crawl offering, since the websites in the list differed in structure and design. The client needed the extracted data in JSON format and was ready to use the PromptCloud API to access the extracted data at their end.
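To make the delivery flow concrete, here is a minimal sketch of how a client-side script might parse one delivered JSON file into usable price records. The field names and sample values are illustrative assumptions, not PromptCloud's actual schema or API.

```python
import json

# Hypothetical sample of a delivered JSON file; the record fields
# (hotel_id, site, price, crawl_ts, ...) are illustrative assumptions.
sample_payload = """
[
  {"hotel_id": "H1001", "site": "example-ota.com",
   "room_type": "Deluxe", "price": 189.0, "currency": "USD",
   "crawl_ts": "2024-05-01T06:00:00Z"}
]
"""

def load_records(payload: str) -> list:
    """Parse a delivered JSON file into a list of price records,
    keeping only records that carry a price and a crawl timestamp."""
    records = json.loads(payload)
    return [r for r in records if "price" in r and "crawl_ts" in r]

records = load_records(sample_payload)
print(len(records), records[0]["price"])
```

In practice the payload would be fetched through the delivery API rather than embedded inline; the parsing and sanity-check step stays the same.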
As per their instructions, the different target sites had to be crawled at different frequencies: twice a day, daily and fortnightly. Our team completed the crawler setup for the 3 target sites in just 5 days, and the initial set of data files was delivered to the client. About 2.5 million records were delivered during the first crawl.
- The client got easy, straightforward access to about 1 million price points daily from their industry.
- The complicated aspects of web crawling and extraction were taken care of by our team.
- The setup took only 5 days, and the data flow remained consistent thereafter.
- We set up automated and manual layers of monitoring to be notified of target site changes.
- Our extensive tech stack handled this high-scale data extraction effortlessly.
- The client gained a significant competitive edge from the pricing intelligence derived from the data.
- Since our system sent out notifications whenever new data was extracted, the client had the flexibility to import new files into their system only when new data was available.
- The client achieved cost savings of about 37% by not having to set up an in-house crawling team.
- With our low turnaround time, the client had surplus time to make calculated, data-driven moves.
- After the initial setup phase, the whole process was automated and ran without any service disruption.
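The notification-driven import mentioned above can be sketched as a simple client-side filter: the system records the timestamp of its last import and pulls only files announced after it. The notification structure and file names here are illustrative assumptions, not PromptCloud's actual notification format.

```python
from datetime import datetime, timezone

# Hypothetical notification payloads announcing newly extracted files.
notifications = [
    {"file": "prices_2024-05-01a.json", "ts": "2024-05-01T06:00:00+00:00"},
    {"file": "prices_2024-05-01b.json", "ts": "2024-05-01T18:00:00+00:00"},
]

def files_to_import(notifications: list, last_imported: datetime) -> list:
    """Return only the files newer than the last import, so the
    client's system pulls data exclusively when fresh data exists."""
    fresh = [n for n in notifications
             if datetime.fromisoformat(n["ts"]) > last_imported]
    return [n["file"] for n in sorted(fresh, key=lambda n: n["ts"])]

# Last import happened at noon UTC, so only the evening file qualifies.
last = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
print(files_to_import(notifications, last))
```

This keeps the client's pipeline idle between deliveries instead of polling for files that have not changed.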