Andrew Jefferson, the Chief of Information Technology at a leading data analytics and research firm, had been facing a lot of issues at work recently. His job was to maintain a continuous flow of data into the company’s database for evaluation and analysis, while also maintaining the company’s servers so that nothing obstructed the functioning of the firm. There had been a golden time when the only problem he had to deal with was proper data structuring, back when the internet was still new, few ecommerce sites existed and, best of all, few unwanted ads, products or content disturbed his database-building exercise. However, in the last couple of years, the data had become unmanageable for him and his growing team. Every day they faced a fresh set of problems, such as server failures, structural clashes in the data and, above all, the unwanted noise that came with each feed of crawled data. With the firm’s business growing, the demand for structured, clean data was at an all-time high. Andrew knew it was time to seek help from outside the team to manage the firm’s functions properly.
How Did PromptCloud Help?
PromptCloud helped add a data layer to the firm’s existing set-up that delivered continuous, free-flowing feeds free of “noise”, so that the team could focus solely on interesting approaches to analytics.
A crawler was set up to extract product prices and specifications, only for predefined categories, automatically on a daily basis.
Based on the schema provided by the client, the final data was delivered in XML format via the Data API on a daily basis, without any manual intervention from either side.
Each record within a dataset carried full details: product name, product price, availability status, short and long descriptions, all image URLs, SKU, dimensions, category, brand, source and the source URL from which it was fetched.
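To make the shape of such a record concrete, here is a minimal Python sketch of how one product record with the fields listed above might be modelled and serialized to XML. The client's actual schema is not shown in this case study, so the class name `ProductRecord`, the element names (`products`, `product`, `image_urls`, `image_url`) and the helper functions are assumptions made purely for illustration, not PromptCloud's actual format or API.

```python
from dataclasses import dataclass
from typing import Iterable, List
import xml.etree.ElementTree as ET


@dataclass
class ProductRecord:
    """One crawled product. Field names are illustrative assumptions."""
    name: str
    price: str              # kept as text; currency handling is out of scope
    availability: str
    short_description: str
    long_description: str
    image_urls: List[str]
    sku: str
    dimensions: str
    category: str
    brand: str
    source: str
    source_url: str


# Fields that map to a single text element each (assumed element names).
SIMPLE_FIELDS = (
    "name", "price", "availability", "short_description",
    "long_description", "sku", "dimensions", "category",
    "brand", "source", "source_url",
)


def record_to_xml(record: ProductRecord) -> ET.Element:
    """Build a <product> element from one record."""
    product = ET.Element("product")
    for tag in SIMPLE_FIELDS:
        ET.SubElement(product, tag).text = getattr(record, tag)
    # A record can carry several images, so they get their own container.
    images = ET.SubElement(product, "image_urls")
    for url in record.image_urls:
        ET.SubElement(images, "image_url").text = url
    return product


def dataset_to_xml(records: Iterable[ProductRecord]) -> str:
    """Serialize a dataset of records into one XML document string."""
    root = ET.Element("products")
    for record in records:
        root.append(record_to_xml(record))
    return ET.tostring(root, encoding="unicode")
```

Calling `dataset_to_xml` on a list of `ProductRecord` objects returns a single XML string with one `<product>` element per record. In a real delivery, the element names and nesting would follow whatever schema the client supplied.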