Data is, by far, the single most indispensable asset a business stands on these days, even ahead of brick and mortar.
We have already discussed various use cases that validate the above statement in our older posts in the What enterprises do with Big Data series. But here’s a specific use case that deserves undivided attention in light of the intense research it facilitates.
Some time ago, an enterprise that leverages social media for its research was discussing with us its problem of getting hold of large-scale data. The objective was to keep up with social media feeds in near real time and by location. The data was to come from across the spectrum: news, blogs, reviews, Twitter, etc. A perfect solution was not supposed to compromise on-
The following considerations were taken into account in setting up the mass-scale, low-latency pipeline for this set of requirements.
Geo-specific low-latency crawls for a dynamic list of sources and keywords