
Travel Reviews Aggregation

Use case from Site-specific crawl and extraction

 

The Client: Social Travel Engine

The Challenge: The client was looking to build one of the web’s largest review databases by aggregating scattered reviews of hotels and destinations from many sources. They had tried a few crawling solutions, but issues had started creeping in as the data scaled, given that they needed new data regularly. The number of sources on the web was also increasing exponentially, and so was the data. Additionally, they wanted reviews from all countries in all languages, along with author profiles, images, etc. from the pages.

 

The Solution: All historical data from each source (~100) was extracted in parallel with incremental data as new reviews were published. Data was deduplicated before delivery so that only new data was uploaded. Machine learning techniques were employed for adaptive crawling, so that more active pages were crawled more often than others. The site list was modified dynamically based on client requirements. Over 20 million structured records were delivered over a period of 2 months.
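To make the deduplication step concrete, here is a minimal sketch in Python of how only new reviews might be passed on for delivery. It is an illustration under stated assumptions, not the pipeline used in this engagement: the field names (`source`, `author`, `text`), the function names and the in-memory `seen` set are all hypothetical, and a production system would keep the fingerprints in persistent storage.

```python
import hashlib


def review_fingerprint(review: dict) -> str:
    """Return a stable hash over the fields assumed to identify a review.

    The field names are hypothetical; whitespace and letter case are
    normalised so trivially different copies map to the same fingerprint.
    """
    key = "\x1f".join(
        str(review.get(field, "")).strip().lower()
        for field in ("source", "author", "text")
    )
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def new_reviews_only(batch: list, seen: set) -> list:
    """Keep only reviews whose fingerprint has not been delivered before.

    `seen` is updated in place, so later batches skip anything already
    returned by an earlier call.
    """
    fresh = []
    for review in batch:
        fingerprint = review_fingerprint(review)
        if fingerprint not in seen:
            seen.add(fingerprint)
            fresh.append(review)
    return fresh
```

Calling `new_reviews_only` once per crawl run with the same `seen` set means a review picked up in both the historical and the incremental crawl is delivered only once.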

 

    Benefits

  • Scalable platform took care of high data volumes without affecting data quality
  • Development and maintenance costs dropped to zero
  • The client was shielded from technical specifics
  • Having only relevant data helped the client gain credibility in the market and boosted its growth figures

 

One of our SlideShare decks discusses generic travel use cases.
