Company: A popular healthcare research firm from the Netherlands.
Context: The client wanted to extract medicine details from the catalogs of leading pharmaceutical portals.
The client needed data for medicines in every category available on these portals, extracted at regular intervals, to support analyses over a period of three years. This would produce a significant volume of records, so the infrastructure had to be capable of handling big data.
The data points that they needed were:
Our team built and deployed a custom crawler to extract the required data regularly and at scale. We also set up monitoring for the target websites so that the crawler could be updated promptly whenever a site's structure changed. The client was provided with API details and documentation to query and fetch the crawled data.
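
As an illustration of the site-structure monitoring described above, the following is a minimal sketch of one possible approach, not necessarily the one the team used. It reduces a page to its tag-and-class skeleton, hashes that skeleton, and compares the hash against a stored baseline. The names `structure_fingerprint` and `structure_changed` are hypothetical, and the sample HTML is invented for the example.

```python
import hashlib
from html.parser import HTMLParser


class _SkeletonParser(HTMLParser):
    """Collects tag names and class attributes, ignoring text content."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = dict(attrs).get("class") or ""
        self.parts.append(f"{tag}.{'.'.join(sorted(classes.split()))}")


def structure_fingerprint(html: str) -> str:
    """Return a SHA-256 hash of the page's tag/class skeleton (hypothetical helper)."""
    parser = _SkeletonParser()
    parser.feed(html)
    parser.close()
    return hashlib.sha256("|".join(parser.parts).encode("utf-8")).hexdigest()


def structure_changed(baseline_html: str, current_html: str) -> bool:
    """True when the current page's skeleton differs from the baseline's."""
    return structure_fingerprint(baseline_html) != structure_fingerprint(current_html)


if __name__ == "__main__":
    # Invented sample pages: same layout with different text, then a changed layout.
    baseline = '<div class="product"><span class="name">Medicine A</span></div>'
    same_layout = '<div class="product"><span class="name">Medicine B</span></div>'
    new_layout = '<div class="item"><p class="title">Medicine A</p></div>'
    print(structure_changed(baseline, same_layout))
    print(structure_changed(baseline, new_layout))
```

Because only tags and class names are hashed, routine content updates such as new medicines or price changes do not trigger an alert, while a renamed container or reshuffled layout does. A production monitor would likely need to tolerate minor markup noise, for example by fingerprinting only the regions the crawler actually reads.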