A popular electronics manufacturer from Japan needed to scrape Twitter to analyse brand and product affinity on social media.
The client was planning to do sentiment analysis on the top tweets mentioning their product or brand name. To achieve this, we needed to scrape Twitter for tweets mentioning their product and brand names. Each tweet had to be extracted along with the Twitter handle, number of likes, number of retweets, hashtags used and the URL of the tweet.
Our team was provided with the list of keywords to be monitored while crawling Twitter. The crawl had to be repeated every day to fetch new records. For this, our team programmed a crawling setup that could monitor Twitter for the given set of keywords and fetch the posts that contained them. For every detected instance of a keyword, the crawler would extract the required data points. Although Twitter has its own API, we had to use a custom crawler to handle the requirement effectively. The client opted for the data to be delivered in XML format to their Dropbox account. The initial setup took only 2 days, after which the data flow started. The client could perform sentiment analysis on the delivered datasets within a few days.
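As a rough illustration of the per-post processing described above, here is a minimal Python sketch. It is not the crawler our team built: it assumes the posts have already been fetched into plain dictionaries, and the field names, the `run_daily_batch` function and the XML element names are hypothetical choices made for this example. It shows only the keyword matching, the extraction of the required data points and the XML serialisation.

```python
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Hypothetical record layout: one entry per data point the client asked for.
@dataclass
class TweetRecord:
    handle: str
    text: str
    likes: int
    retweets: int
    url: str
    hashtags: list = field(default_factory=list)

HASHTAG_RE = re.compile(r"#(\w+)")

def matches_keywords(text, keywords):
    """Return True if any monitored keyword occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)

def to_record(raw):
    """Build a TweetRecord from an already-fetched post (assumed dict layout)."""
    return TweetRecord(
        handle=raw["handle"],
        text=raw["text"],
        likes=int(raw["likes"]),
        retweets=int(raw["retweets"]),
        url=raw["url"],
        hashtags=HASHTAG_RE.findall(raw["text"]),
    )

def build_xml(records):
    """Serialise the records to an XML string (element names are illustrative)."""
    root = ET.Element("tweets")
    for record in records:
        node = ET.SubElement(root, "tweet")
        ET.SubElement(node, "handle").text = record.handle
        ET.SubElement(node, "text").text = record.text
        ET.SubElement(node, "likes").text = str(record.likes)
        ET.SubElement(node, "retweets").text = str(record.retweets)
        ET.SubElement(node, "url").text = record.url
        tags = ET.SubElement(node, "hashtags")
        for tag in record.hashtags:
            ET.SubElement(tags, "hashtag").text = tag
    return ET.tostring(root, encoding="unicode")

def run_daily_batch(raw_posts, keywords):
    """Keep only the posts that mention a keyword and return them as XML."""
    records = [to_record(p) for p in raw_posts if matches_keywords(p["text"], keywords)]
    return build_xml(records)
```

In a setup like this, `run_daily_batch` would be called once per day on the newly fetched posts, and the resulting XML string written to a file for delivery. Fetching the posts and uploading to Dropbox are deliberately left out of the sketch.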