Q. What is the maximum frequency you can crawl data at?
The maximum frequency depends on your specific requirements. We can extract data at intervals ranging from every few minutes to once a month.
Q. Can we get contextual information from the web page?
This varies from site to site. In general, though, we can provide the preceding (referring) URL from which we discovered the final page URL.
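As an illustration of what "preceding URL" means here, the sketch below (hypothetical code, not our production crawler) walks a toy link graph breadth-first and records, for every page it finds, the page it was discovered from:

```python
from collections import deque

def crawl_with_referrers(start_url, link_graph):
    """Breadth-first discovery that records, for each page found,
    the preceding URL from which it was first reached."""
    discovered = {start_url: None}  # page -> referring URL
    queue = deque([start_url])
    while queue:
        page = queue.popleft()
        for link in link_graph.get(page, []):
            if link not in discovered:
                discovered[link] = page  # remember where we found it
                queue.append(link)
    return discovered

# Toy site map standing in for real fetched pages (hypothetical data).
site = {
    "https://example.com/": ["https://example.com/category"],
    "https://example.com/category": ["https://example.com/product-1"],
}
refs = crawl_with_referrers("https://example.com/", site)
# refs["https://example.com/product-1"] is "https://example.com/category"
```

In a delivered dataset, this referring URL would simply appear as an extra field alongside each extracted record.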
Q. How do you maintain your code in order to deal with website structural changes?
While setting up crawlers, we set up automated checkpoints that monitor for structural changes. If a site changes its structure, we are notified and fix the affected crawlers accordingly.
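To give a sense of how such a checkpoint can work, here is a minimal sketch (an assumption about the general technique, not our actual monitoring system): the check verifies that the page markers a crawler relies on are still present, and anything missing is flagged for attention.

```python
def check_structure(html, required_markers):
    """Return the markers that no longer appear in the page.
    An empty list means the structure check passed."""
    return [m for m in required_markers if m not in html]

# Hypothetical page snapshot and the markers the crawler depends on.
page = '<div class="price">$9.99</div><h1 class="title">Widget</h1>'
markers = ['class="price"', 'class="title"', 'class="rating"']

missing = check_structure(page, markers)
# missing == ['class="rating"'] -> the site dropped a field; raise an alert
```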
Q. How do we access data on our side?
The data can be delivered in XML, JSON or CSV format. The default delivery mechanism is our RESTful API. We can also push the data to one of your file-storage locations (FTP, SFTP, Amazon S3, Dropbox, Gdrive, Box or MS Azure). If you're not very technically inclined, you can simply use the one-click data download option on CrawlBoard.
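For the API route, a typical consumer prepares an authenticated GET request and parses the JSON payload it receives. The endpoint URL and auth scheme below are placeholders, not our documented API; substitute the details from your CrawlBoard account.

```python
import json
import urllib.request

# Hypothetical endpoint and token -- the real API path and auth
# header may differ from what is shown here.
API_URL = "https://api.example.com/v1/data?format=json"

def build_request(url, token):
    """Prepare an authenticated GET request for the data API."""
    req = urllib.request.Request(url)
    req.add_header("Authorization", "Bearer " + token)
    return req

# Once fetched, a JSON delivery is just a list of records:
payload = '[{"url": "https://example.com/p1", "price": "9.99"}]'
records = json.loads(payload)
# records[0]["price"] == "9.99"
```

The same records arrive as rows in the CSV delivery and as repeated elements in the XML delivery; only the serialization differs.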
Q. Do you have an IP rotation service?
Yes. Our platform handles IP rotation by default, along with mechanisms for dealing with other common blocking issues.
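Conceptually, IP rotation just means spreading requests across a pool of outbound addresses so no single IP carries all the traffic. The round-robin sketch below is a simplified illustration of the idea (the platform does this automatically; the class and addresses are hypothetical):

```python
import itertools

class ProxyRotator:
    """Round-robin over a proxy pool -- a simplified sketch of the
    kind of IP rotation a crawling platform performs for you."""

    def __init__(self, proxies):
        self._pool = itertools.cycle(proxies)

    def next_proxy(self):
        """Return the next proxy address; each request uses a new one."""
        return next(self._pool)

rotator = ProxyRotator(["10.0.0.1:8080", "10.0.0.2:8080"])
first = rotator.next_proxy()   # "10.0.0.1:8080"
second = rotator.next_proxy()  # "10.0.0.2:8080"
```

Production systems add more on top (per-site rate limits, retiring blocked IPs, randomized delays), but the rotation primitive is this simple.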
Q. What kind of infrastructure do you offer?
As a client, you'd have access to our portal, CrawlBoard. This is your centralized portal for technical support, billing and keeping tabs on crawler activity and stats. You can also schedule ad-hoc crawls for future dates. Error handling happens via our ticketing system.