Description: Apache Nutch is an open-source web crawler written in Java. It is used to build web search engines and web archiving systems. Nutch crawls websites and indexes page content and metadata.
Type: software
Pricing: Free
Description: Crawlbase is a web crawler and scraper for extracting data from websites. It has a simple interface for creating crawl jobs and lets you export scraped content to CSV files or databases.
Type: software