Scraping data from webpages for sentiment analysis

New tools make it possible for businesses to understand how their customers are reacting to them: do customers like the layout, do they find the offers exciting, did the service satisfy them? The increased volume of data is valuable not just for gauging success but also for drawing insights for the future.

As a Data-as-a-Service provider, we realise the significance of this data and help you unlock valuable insights by collecting it. We scrape sites and extract structured data at scale that can be used to arrive at insights. Scraping data from webpages for sentiment analysis is an important service we provide.

Extracting user reviews about products

As a web scraping service, we make it easy to get data from the web. Ours is a customized service where all you do is give us the list of sites you want data from, indicate the fields you need and tell us how often you want the data. Using our customized crawlers and advanced computing stacks, we launch scrapes and retrieve the data in the format you desire (usually XML, JSON or CSV). You can query this data via our REST API or have it delivered to your FTP / AWS location.

See sample social media data. 

What is the use of this? Why is sentiment analysis important?


While data scraping is quite challenging in itself, we also reflect on how opinion mining can help our enterprise clients. Opinion mining, better known as sentiment analysis, deals with automatically scanning text and establishing its nature or purpose. Fundamentally, it is important to determine whether text scraped and extracted from a website is useful, or even whether it relates to the subject mentioned in the title.

The function of sentiment analysis can be to analyse entries (user reviews, product feedback, service feedback forms, etc.) and indicate the feelings expressed (happiness, dissatisfaction, etc.). On a simple scale, this can be achieved by establishing a scoring system from 1 to 10, with 10 being most positive (or some similar measure), where each word is associated with an emotion. The score of each word, and of the whole text, is then calculated to see what opinion or sentiment is indicated.
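As a rough illustration of the word-scoring idea above, here is a minimal Python sketch. The lexicon, its scores, the neutral midpoint and the label thresholds are all made-up assumptions for illustration; this is not a description of any production scoring system.

```python
import re

# Hypothetical word scores on a 1-10 scale (10 = most positive).
# A real lexicon would be far larger; these entries are illustrative only.
LEXICON = {
    "excellent": 10, "love": 9, "good": 8, "fine": 6,
    "slow": 4, "poor": 3, "bad": 2, "terrible": 1,
}
NEUTRAL = 5.5  # assumed midpoint of the 1-10 scale


def score_text(text):
    """Return the mean score of the known words in text, or NEUTRAL if none match."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [LEXICON[w] for w in words if w in LEXICON]
    if not scores:
        return NEUTRAL
    return sum(scores) / len(scores)


def label(score):
    """Map a numeric score to a coarse sentiment label (thresholds are assumptions)."""
    if score > 6:
        return "positive"
    if score < 5:
        return "negative"
    return "neutral"
```

With this toy lexicon, "The service was excellent and the staff were good" averages the scores for "excellent" and "good" and is labelled positive, while a sentence with no known words falls back to the neutral midpoint. Averaging word scores ignores negation ("not good") and context, which is one reason simple scoring is only a starting point.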

Another methodology is subjectivity/objectivity identification. Here, extracted data is tested to see whether it is subjective or objective. However, this may prove difficult, since the results of the estimation are themselves person-specific (or subjective).
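One very simple way to picture subjectivity identification is to measure how many words in a text are opinion cues. The sketch below does this in Python; the cue list and the 0.1 threshold are hypothetical choices made for the example, and real classifiers are usually trained on labelled data rather than hand-written lists.

```python
import re

# Hypothetical cue words that tend to signal opinion rather than fact.
SUBJECTIVE_CUES = {
    "think", "feel", "love", "hate", "best", "worst",
    "amazing", "awful", "disappointing", "recommend",
}


def subjectivity(text):
    """Fraction of words in text that are subjective cues (0.0 to 1.0)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(w in SUBJECTIVE_CUES for w in words) / len(words)


def is_subjective(text, threshold=0.1):
    """Treat text as subjective when its cue fraction reaches the threshold."""
    return subjectivity(text) >= threshold
```

Under these assumptions, "I think this is the best phone" is flagged as subjective, while "The phone weighs 150 grams" is not. Where to set the threshold is itself a judgment call, which echoes the difficulty noted above.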

Perhaps the most refined kind is ‘feature-based sentiment analysis’. Here, individual opinions of users regarding a certain product or service are extracted from text and then evaluated to see if the consumer is satisfied or not. This is where PromptCloud’s mass-scale crawling solution helps. For example, if you wish to crawl hundreds of thousands of blogs, news or forum sites to extract high-level information like article URL, date, title, author and content, mass-scale crawls will provide this data in a structured format as continuous feeds.
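To make the feature-based approach described above more concrete, the following Python sketch attributes opinion words to product features that appear in the same sentence. The feature names and opinion word lists are hypothetical, and sentence-level co-occurrence is a deliberate simplification; it is a sketch of the general idea, not a specific product's method.

```python
import re

# Hypothetical feature names and opinion words, for illustration only.
FEATURES = {"battery", "screen", "camera", "price"}
POSITIVE = {"great", "good", "excellent", "sharp", "fair"}
NEGATIVE = {"bad", "poor", "weak", "blurry", "expensive"}


def feature_sentiment(review):
    """Return {feature: net score} for features mentioned in the review.

    Each sentence contributes +1 per distinct positive word and -1 per
    distinct negative word to every feature named in that sentence.
    """
    result = {}
    for sentence in re.split(r"[.!?]+", review.lower()):
        words = set(re.findall(r"[a-z']+", sentence))
        net = len(words & POSITIVE) - len(words & NEGATIVE)
        for feature in words & FEATURES:
            result[feature] = result.get(feature, 0) + net
    return result
```

For the review "The battery is great. The screen is blurry. Delivery was quick." this returns a positive score for the battery and a negative one for the screen, and ignores the sentence that names no feature. A sentence that mentions two features shares one score between them, so finer attribution would need parsing beyond this sketch.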

We can also filter these crawls against a list of keywords, which facilitates better sentiment analysis by subject, topic and language. Our named-entity recognition service further helps to enrich this information.
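Keyword filtering of crawled records can be sketched in a few lines of Python. The record fields below follow the ones mentioned earlier (URL, date, title, author, content), but the dictionary layout, the sample records and the function name are assumptions made for this example rather than an actual feed format.

```python
# Hypothetical crawled records; field names and values are illustrative only.
SAMPLE_RECORDS = [
    {
        "url": "https://example.com/a",
        "date": "2015-01-10",
        "title": "Acme X1 review",
        "author": "pat",
        "content": "The Acme X1 battery lasts two days.",
    },
    {
        "url": "https://example.com/b",
        "date": "2015-01-11",
        "title": "Weekend recipes",
        "author": "sam",
        "content": "Three quick soups.",
    },
]


def filter_by_keywords(records, keywords):
    """Keep records whose title or content contains any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    kept = []
    for record in records:
        text = (record.get("title", "") + " " + record.get("content", "")).lower()
        if any(k in text for k in lowered):
            kept.append(record)
    return kept
```

Calling filter_by_keywords(SAMPLE_RECORDS, ["Acme X1"]) keeps only the first record. The match here is a plain substring test, so a short keyword can also match inside longer words; word-boundary matching would be a stricter alternative.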

We helped a client with sentiment analysis for a product. The client wanted to capture comments about it from forums and websites, from retailers and distributors to enthusiasts to the average consumer. The client’s use case was to get data so as to understand how favourably users viewed the product, and what consumers said about it on the Internet. Competitive analysis was another scenario to study. While Twitter provided a very clear picture, it wasn’t going to give our client the breadth of insights desired.

Considering that there are hundreds of websites that may include product reviews, and numerous online forums focused on consumer durables and/or related topics, you have a valuable collection of insights. We set up crawls to automatically extract reviews from a select set of highly valued sites with hundreds of URLs.

Our automated web data extraction and monitoring solution targeted these sites and delivered precise results. Moreover, with normalizations in place, we delivered analysis-ready structured data.

The PromptCloud Advantage:

  • The data you want is the data you get
  • Mitigating site complexity to ensure easy and complete access
  • Regular alerts on data feed uploads and an interactive API to query data
  • Efficient and simple process to get crawls running
  • Site maintenance and monitoring to record any changes in structure so as to provide uninterrupted data coverage
  • Cost-effective scalability

Image credits: datafloq

