Offloading the collection and maintenance of news data to Automatic Extraction allowed the client to focus on product development and business strategy. With this data supplied to them, the client has accelerated their product development without worrying about maintaining a data pipeline. This has enabled them to become one of the most innovative technology companies today.
About the Client
Client: The client is a technology company that helps readers engage with publishers. They build tools for individuals and publishers to access and present personalized and trusted news. They provide structured news data for personalized feeds, backed by artificial intelligence and algorithms, based on a user’s preferences and what experts are interested in.
Using multiple APIs that focus on user signals, the client powers newsletters on behalf of publishers that provide impressive personalization between publisher and reader.
Industry: Media Industry
News data extraction
In today’s digital world, news sources are abundant and available at readers’ fingertips. The way people consume news is changing, with users seeking alternatives to social media. There is demand for information that fits their interests and for trusted journalism. With more and more people accessing news through smart devices and hungry for personalized content, categorizing the colossal amount of news data produced daily has become a problem.
To keep their business running, the client needed to deliver quality sources of information that readers can trust and to help their publisher partners better connect with their readers. This required gathering a large volume of news data from thousands of different sources across the web, and doing so quickly. The client had to source millions of news data points daily, which meant extracting the world’s news data accurately and reliably. The client started looking for a data extraction partner who could provide them with authentic data quickly. They were looking for someone with the fastest turnaround time in web scraping. The key challenges facing them were therefore speed and scale. They also had to consider that the list of sources in their directory is constantly growing and evolving.
After considering the costs of hiring an internal team, including training, onboarding, and setup, they decided to outsource the project and began to assess the types of businesses that could provide the data extraction capability they needed. PromptCloud was the clear choice based on our ability to provide data extraction on a grand scale, the accessibility of the data, the quality of the data, and the speed at which the data was delivered.
PromptCloud provided the client with an easily manageable solution that let them identify sources and maintain a consistently high level of data quality when collecting news data. It also allowed the client to fine-tune, at scale, exactly which kinds of data they received. With PromptCloud, they were immediately able to scale their data extraction efforts to match the quantity of news content produced daily.
“PromptCloud has helped us extract over 10 million articles for our technology to process. The data is constant and reliable. Collaboration with PromptCloud has been easy and support was always there throughout our journey,” says the Founder.