Description: Common Crawl is a non-profit organization that crawls the web and makes the resulting crawl data freely available to the public. Researchers, developers, and entrepreneurs can use the data to build analytics and applications.
Type: software
Description: Searx is an open-source, privacy-respecting metasearch engine that can be self-hosted. It lets users query multiple search engines at once without tracking or profiling them.
Type: software
Pricing: Open Source