Description: ACHE Crawler is an open-source web crawler written in Java. It is designed to efficiently crawl large websites and collect structured data from them.
Type: software
Pricing: Open Source
Description: StormCrawler is an open-source web crawler built on Apache Storm that scales horizontally to crawl large websites efficiently. It is fault-tolerant and integrates with other Storm components, such as machine learning pipelines.
Type: software
Pricing: Open Source