
What enterprises do with Big Data- Part 2 | PromptCloud
 


It’s amusing how Big Data is knocking on doors these days. It took us a while to settle down after the overwhelming response to the previous post, so here’s the next batch of requests.

Notes:

a) Notes that applied to the previous batch apply here too.

b) Only public data gets crawled in the process and robots.txt is strictly adhered to.
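
As a rough illustration of what adhering to robots.txt can look like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `example-crawler` user agent, and the example.com URLs are all made up for this sketch and are inlined so it runs offline; this is not a description of our actual crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch needs no network.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser: RobotFileParser, user_agent: str, urls: list) -> list:
    """Keep only the URLs that the robots.txt rules allow for this user agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


parser = build_parser(SAMPLE_ROBOTS_TXT)
candidates = [
    "https://example.com/products/1",     # made-up public page
    "https://example.com/private/admin",  # made-up disallowed page
]
crawlable = allowed_urls(parser, "example-crawler", candidates)
print(crawlable)
```

In a real crawl the parser would be loaded from each site's own robots.txt (for example with `set_url` and `read`) before any page on that site is requested.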

  1. We’d like to know what’s going on in the home appliances industry within these countries with respect to these brands. Please get us all the data you come across on forums, blogs, news sites or other review sources.
  2. We are building a product-centric search engine. Keep collecting product catalogs from this huge list of sites that we give you. But what’s more important is sticking to this deeply involved schema we suggest. Oh wow! You can discover sites as well? Bring it on.
  3. You understand how important it is to get all possible data when building an online classifieds engine. We already crawl a number of sites for this. Here’s the list of sites that we can’t crawl; please fetch all data from them.

  4. Currently we have a big team of data entry guys who manually search for relevant real-estate listings in our target cities. Since we’d like to get this automated, please acquire data from all of these real-estate websites.
  5. We are in the process of building an Expedia for oceans and seas. Since the websites in our list are not just plain HTML, please help us collect all shipping lines and sea routes from these websites on a daily basis.
  6. I’m building a travel search engine. Yes, there are many already out there, but we bring an added layer of social and some machine learning on the ground data. To facilitate this idea, we are interested in gathering hotel addresses and reviews, destination reviews, traveler photos and author profiles across the major travel websites.
  7. We are aiming to become THE single database for all reviews, whether for products or travel. On top of that, we would also like to add a categorization layer, so please get us all relevant data from various sites with these markups.
  8. I would ultimately like all postings, by user and originating URL, that correlate with the selected shopping sites I provide, like Amazon, Victoria’s Secret, Etsy, etc., so I would like to collect as many feeds as possible from this popular content-sharing website.
  9. We are into all things tech. We have a variety of projects that range from getting all the tech questions ever asked on tech forums to getting the list of all file extensions from around the web. Not many record-level details are required here but all’s in the name for us. Speed is key, so a quick turnaround is appreciated.
  10. We are looking to crawl a bunch of HTML pages and track server response codes.
  11. We would like to track any news on these car models in this particular country. Please provide a data dump of the same every 2 days.
  12. Crawl about 100 e-commerce websites to bring together product images. Using this, we’d like to analyze what kinds of products are appealing to the audience these days.
  13. Get me product feeds from the Indian e-commerce market with all product-level details and specifications. I need this to build an analytics engine.
  14. I have a simple one-time request. Please compile a database of all business listings in my locality.
  15. We measure the marketing performance of enterprises and optimize their marketing budgets. For one such project, we would like to track a few mobile handsets on a weekly basis. We would like to see the different levels of pricing/financing options available as well as how the ratings vary.
  16. We have built a lot of crawlers ourselves and are just not happy with the way we have done it. We are looking for a solution that can eventually crawl more than 5000 retail sites. We will use this data to deliver innovative software solutions.
  17. We perform a lot of competitive intelligence internally for our clients and are looking for a more automated solution where we can feed in keywords and query the data depending on client requirements. So please also index the data that you deliver.
  18. We have built a profiling system for lawyers based on several parameters. To enable this, we research the web extensively. Our researchers simply Google each lawyer’s name along with about 100 relevant keywords to fetch the results that can go into our system. We need a solution like yours to automate the process and summarize the retrieved data in an easily editable format.
  19. We provide some thought-provoking content to our readers on politics, sports, and other everyday things. Editorial being our strength, we would like a solution that helps gather all this content based on some keywords that we provide to you.
  20. We are interested in building a semantic job search engine, for which we would like to acquire all job listings from these sources and catalog them.
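
Request 10 above (crawl pages and track server response codes) is simple enough to sketch. The following is only an illustration under stated assumptions: the function names `fetch_status`, `track_status_codes` and `summarize` are made up here, the default fetcher issues a HEAD request through the standard `urllib.request`, and connection-level failures are left to propagate rather than being handled.

```python
from collections import Counter
from typing import Callable, Dict, Iterable
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a URL using a HEAD request.

    Assumed default fetcher for this sketch. Connection-level errors
    (DNS failure, timeout) are not caught and will propagate.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError; the code is still useful.
        return err.code


def track_status_codes(
    urls: Iterable[str],
    fetcher: Callable[[str], int] = fetch_status,
) -> Dict[str, int]:
    """Map each URL to the status code reported by the fetcher."""
    return {url: fetcher(url) for url in urls}


def summarize(statuses: Dict[str, int]) -> Counter:
    """Count how many URLs returned each status code."""
    return Counter(statuses.values())
```

The fetcher is passed in as a parameter so that the tracking logic can be exercised with a stub, without touching the network; calling `track_status_codes(urls)` with no second argument would use the real HEAD requests.
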
Yes, there will be more batches. Stay tuned…
