This data usually contains noise and needs to be cleaned. Noise here means the unwanted HTML tags and stray pieces of text that get scraped along with the required data. A cleaning step removes this noise and leaves only the relevant data behind. Once the data is clean, it has to be structured, that is, organized into consistent fields so that it is machine-readable. Structured data lets the analytics system read each value in context, and it also makes the data easy to import into a database.
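As a rough illustration of the cleaning and structuring steps described above, here is a minimal Python sketch that uses only the standard library. The sample records, the field names `name` and `price`, and the helper functions `clean` and `structure` are all hypothetical; a real setup would depend on the pages being scraped and might use a dedicated HTML parsing library instead.

```python
import json
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        return " ".join(self._chunks)


def clean(raw_html):
    """Cleaning step: drop tags and collapse runs of whitespace."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()
    return re.sub(r"\s+", " ", parser.text()).strip()


def structure(raw_record):
    """Structuring step: map cleaned text onto typed, named fields.

    The field names here are assumptions made for this example.
    """
    price_text = clean(raw_record["price"])
    return {
        "name": clean(raw_record["name"]),
        # Keep only digits and the decimal point, then convert to a number.
        "price": float(re.sub(r"[^\d.]", "", price_text)),
    }


# Hypothetical scraped fragments that still contain markup and stray text.
scraped = [
    {"name": "<h2 class='title'> Blue   Widget </h2>", "price": "<span>$19.99</span>"},
    {"name": "<h2>Red Widget<script>track()</script></h2>", "price": "<b>Price:</b> $5"},
]

records = [structure(r) for r in scraped]
print(json.dumps(records, indent=2))
```

Each resulting record is a plain dictionary with consistent keys and types, so it can be serialized to JSON or inserted as a row in a database table without further parsing.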