Use case from Site-specific crawl and extraction
The Client: An online portal connecting B2B marketplaces
The Challenge: The client wanted to bring marketers in the B2B arena together at a single point where they could interact and make buying decisions. The critical part of such a solution was aggregating information from various B2B industries that visitors could use when forming partnerships. One such solution focused on collecting all the real-estate listings in a country.
The Solution: The sources that hosted relevant information were identified first. A data acquisition pipeline was then set up to collect all property listings, along with their types, descriptions, images, values, and other details, in a structured format. These sources were crawled incrementally, and each increment was uploaded to the API, from where the client could download it and put it to use.
- Despite the huge data volume, the pipeline ran smoothly without any customer involvement
- Additional fields could be added at any point in time
- Precise extraction captured even subtle property details
- More than 10 million records were uploaded, all noise-free
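The incremental crawl described in the solution can be illustrated with a minimal Python sketch. It is not the pipeline that was actually built: the Listing fields, the fingerprint approach, and the incremental_batch helper are all assumptions made for illustration. The idea is to keep a fingerprint per listing URL and forward only the listings that are new or have changed since the previous crawl.

```python
import hashlib
import json
from dataclasses import dataclass, asdict, field


@dataclass
class Listing:
    # Hypothetical record layout, loosely based on the details named in the
    # case study (type, description, images, value).
    source_url: str
    property_type: str
    description: str
    value: float
    images: list = field(default_factory=list)

    def fingerprint(self) -> str:
        # Hash a canonical JSON form so any field change alters the digest.
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def incremental_batch(crawled, seen):
    """Return listings that are new or changed, updating `seen` in place.

    `seen` maps a listing URL to the fingerprint recorded on the last crawl.
    """
    fresh = []
    for listing in crawled:
        digest = listing.fingerprint()
        if seen.get(listing.source_url) != digest:
            seen[listing.source_url] = digest
            fresh.append(listing)
    return fresh
```

In this sketch, only the list returned by incremental_batch would be uploaded after each crawl, so unchanged listings are not sent again.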
One of our SlideShare decks discusses generic classifieds use cases.