ArchiveBox

ArchiveBox is an open source self-hosted web archiving solution that lets you archive web pages and collect media assets. It aims to create local, browsable copies of sites from the internet.
archiving web-archiving selfhosted open-source

What is ArchiveBox?

ArchiveBox is an open source self-hosted web archiving solution designed to allow anyone to easily collect and archive content from the internet to create their own personal web archive.

Users submit URLs, which ArchiveBox then fetches, extracts assets from, renders snapshots of, and archives. The archived content can include the original HTML, PNG/JPEG snapshots, assets such as JS/CSS/images, extracted text and hyperlinks, bookmarks, metadata, and more.
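
As a rough illustration of the "extracted text and hyperlinks" step, the following Python sketch pulls link targets and visible text out of an HTML string using only the standard library. It is not ArchiveBox's actual implementation; the names `LinkAndTextExtractor` and `extract_links_and_text` are hypothetical and chosen only for this example.

```python
from html.parser import HTMLParser


class LinkAndTextExtractor(HTMLParser):
    """Hypothetical helper: collects <a href> targets and visible text."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not script or style content.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def extract_links_and_text(html):
    """Return (list of hrefs, visible text joined by single spaces)."""
    parser = LinkAndTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.links, " ".join(parser.text_parts)
```

A real archiver would also need to resolve relative links against the page URL and handle malformed markup, which this sketch leaves out.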

Once archived, all the information for a given site is organized into a folder containing plain TXT/HTML renders of the page content, all extracted assets, a screenshot, metadata such as headers and cookies, and an index file with metadata and bookmarks. This format makes it easy to view your archive offline while retaining much of the original context.
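
To make the one-folder-per-site idea concrete, here is a minimal Python sketch that writes an HTML render, a plain-text render, and an index file into a single folder. The layout is an assumption for illustration only: the file names (`page.html`, `page.txt`, `index.json`), the hash-based folder name, and the function `save_snapshot` are hypothetical and do not describe ArchiveBox's real on-disk format.

```python
import hashlib
import json
from pathlib import Path


def save_snapshot(archive_root, url, html, text, links):
    """Write one snapshot folder and return its path.

    Hypothetical layout: the folder is named after a short hash of the
    URL, so saving the same URL again reuses the same folder.
    """
    snapshot_id = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    folder = Path(archive_root) / snapshot_id
    folder.mkdir(parents=True, exist_ok=True)

    # Plain HTML and TXT renders of the page content.
    (folder / "page.html").write_text(html, encoding="utf-8")
    (folder / "page.txt").write_text(text, encoding="utf-8")

    # Index file tying the metadata and the saved files together.
    index = {
        "url": url,
        "links": links,
        "files": ["page.html", "page.txt"],
    }
    (folder / "index.json").write_text(
        json.dumps(index, indent=2), encoding="utf-8"
    )
    return folder
```

Because everything is stored as ordinary files, such a folder can be opened in a browser or text editor without any special tooling, which is the property the paragraph above describes.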

ArchiveBox focuses on being easy for individuals to self-host. It aims to be a one-click install with easy backups and exports while still offering configurability and a comprehensive feature set. It was designed for long-term preservation, prioritizing readability and standards compliance over storing full interactive sites.

ArchiveBox Features

  1. Web page archiving
  2. Media asset collection
  3. Local browsing of archived sites
  4. Scheduled archiving
  5. Deduplication
  6. Full-text search
  7. Open source

Pricing

  • Open Source

Pros

  • Self-hosted
  • Customizable
  • Offline browsing
  • Long-term preservation
  • Free and open source

Cons

  • Requires technical setup
  • No browser extension
  • Limited to individual use


The Best ArchiveBox Alternatives

Top OS & Utilities and Archiving apps similar to ArchiveBox


Wget

Wget is a command-line utility designed for non-interactive downloading of files from the internet. Recognized for its simplicity, reliability, and versatility, Wget has become a fundamental tool for users and system administrators seeking an efficient way to fetch files, mirror websites, or automate downloading tasks. One of Wget's primary strengths...

HTTrack

HTTrack is an open source offline browser utility, which allows you to download a website from the Internet to a local directory. It recursively retrieves all the necessary files from the server to your computer, including HTML, images, and other media files, in order to browse the website offline without...

Pocket

Pocket is a popular read-it-later application available as a free browser extension and mobile app for iOS and Android devices. It allows users to save articles, videos, podcasts, and other content from the web to access and view at a later time. When you come across something interesting on the web,...

Archive.today

Archive.today is a web archiving service launched in 2012 that allows users to archive webpages and access saved versions even if the original site is inaccessible. It captures screenshots of websites and saves them externally on its servers, preserving their content for future reference. Some key features of Archive.today include: Ability to...

SiteSucker

SiteSucker is a website downloader tool designed specifically for Mac. It provides an easy way for users to save complete websites locally to their computer for offline access and archiving. Some key features of SiteSucker include: automatically crawls links on a site to download all webpages; downloads HTML pages, images, CSS files, JavaScript,...

Internet Archive

The Internet Archive is a non-profit digital library that was founded in 1996 to offer permanent access for researchers, historians, scholars, people with disabilities, and the general public to historical collections in digital format. It hosts over 30 petabytes of data collected from websites, software, books, music, movies, and billions...

Web Downloader (Chrome Extension)

Web Downloader is a useful Chrome extension that enhances the browsing and downloading capabilities of Google Chrome. It adds a simple download button to the Chrome toolbar, allowing users to easily and quickly save files, images, videos, and even full webpages that they come across while browsing. Some key features of...

Wallabag

wallabag is an open source web application that allows you to save web pages and articles to read later. It works similarly to other read-it-later software like Pocket or Instapaper. Some key features of wallabag include: Ability to bookmark web pages with a browser extension or by sending links to your wallabag...

Evernote Web Clipper

The Evernote Web Clipper is a browser extension available for Google Chrome, Mozilla Firefox, Microsoft Edge, and Apple Safari. It provides a quick and easy way to save web content that you want to reference later into your Evernote account. With just a click, you can clip entire web pages or...

Pinboard

Pinboard is a social bookmarking service that launched in 2009. It helps users save, organize, and manage web page bookmarks online. Some key features of Pinboard include: Bookmark saving - Users can save URLs, descriptions, tags, extended notes, and other metadata for web pages they want to bookmark for later. Full-text search...

Oldweb today

Oldweb Today is a web browser that transports users back to the early days of the internet in the 1990s. It recreates the retro look and feel of the web from that era - complete with grainy images, bright background colors, animated gifs, midi soundtrack, and websites displayed as they...

Archive.st

Archive.st is a free online web archiving service that allows users to archive web pages and access cached or historical versions of sites. It works by taking snapshots of websites over time and storing them in its archive. Some key features and uses of Archive.st include: Accessing web content that has gone...

WebBites

WebBites is a leading web automation and testing software that makes it easy to automate repetitive tasks, conduct cross-browser testing, and create complex scripts without coding. It has an intuitive drag-and-drop interface that allows anyone to build automated scripts in minutes. Key features include: Record and replay actions for quick automation of...

TheOldNet

TheOldNet is a free and open-source web application that gives users access to archived and cached versions of websites. It serves as a proxy that retrieves pages from various web archives around the world, such as the Wayback Machine, Archive.Today, Google Cache, and more. By entering a URL into TheOldNet, it...

WebCull

WebCull is a powerful yet easy-to-use web scraping and data extraction software. It enables users to extract data from websites through an intuitive graphical interface, without the need for any coding or scripting. With WebCull, users can easily scrape text, tables, images, documents, media files, and more from web pages. The...

Reminiscence

Reminiscence is a free, open-source flashcard and spaced repetition software application designed to help users memorize information. It utilizes an algorithm that optimizes the review schedule based on a user's memory strength and retention of information over time. Users can create decks of digital flashcards with questions on the front and...

DoMarks

DoMarks is a user-friendly to-do list and task management app available for iOS, Android, Mac, Windows, and the web. It stands out for its intuitive and flexible interface that allows you to create multiple customizable to-do lists to fit all aspects of your life. With DoMarks, you can easily add tasks...

SiteCrawler

SiteCrawler is a robust and versatile website crawling and scraping tool used for content mining, data extraction, website change detection, and SEO auditing. It provides an intuitive point-and-click interface to configure customized crawls through sitemaps, internal links, external links, or advanced options like regex rules. Key features include: Visual workflow...

WebCrate

WebCrate is a user-friendly website builder designed to help small businesses, entrepreneurs, bloggers, and anyone create professional, customized websites. It provides an intuitive drag-and-drop interface that lets you easily build pages using hundreds of professionally-designed templates. Some key features of WebCrate include: Drag-and-drop page builder - No coding skills required to create...

LinkAce

LinkAce is a powerful yet easy-to-use link management software designed to help individuals and teams organize, track, analyze, and share links from across the web. It serves as a central dashboard for managing all your links in one place. With LinkAce, you can create smart folders to automatically categorize links by...

Snapchive

Snapchive is a privacy-focused alternative to Snapchat that was created in 2019. It offers many of the same core features as Snapchat, such as disappearing photo and video messages, but with a stronger emphasis on user privacy and security. Some of the key features that differentiate Snapchive include: End-to-end encryption for all...

Stash.ai

Stash is an AI-powered research assistant browser extension that aims to help users search the web more efficiently. It works by allowing users to highlight text on any website, after which Stash will provide automatically generated summaries, search recommendations for further research, and source credibility ratings. Some key features of Stash...

PageArchiver

PageArchiver is a desktop application used for archiving and preserving full websites locally for offline browsing. It features: recursive crawling to archive entire website structures; custom crawling rules and filters; options to control crawl depth and speed; downloading of HTML pages, images, CSS, JS, and other assets; file management tools for organizing saved data; data export...

Webrecorder

Webrecorder is an open-source web archiving software designed to enable anyone to easily capture web pages and browsing sessions for preservation and future access. It works by acting as a proxy between the user's browser and websites they visit, intercepting and storing all assets including HTML, CSS, JS, images, videos,...

Ghost Archive

Ghost Archive is an open-source self-hosted web archiving solution that gives you full control over creating your personal web archives. It allows you to easily save web pages to storage for long-term preservation and future access. Some key features of Ghost Archive include: Scheduled crawls - Set up recurring crawls of sites...

SitePuller

SitePuller is a powerful web crawler and website downloader software used to copy entire websites for offline browsing, migration, analysis, and archiving purposes. Some key features include: Downloads complete websites, including text, images, CSS, JavaScript, PDFs, media files, etc. Preserves original website structure and links for seamless offline access. Generates a full site...

Fossilo

Fossilo is an open-source, self-hosted knowledge base and collaboration platform for organizing information and ideas into an interconnected network. It allows users to create pages and link them together to represent concepts, notes, projects, people, organizations, etc. This linked structure helps reveal relationships, facilitate discoverability, and enable knowledge sharing. As a...

ItSucks

ItSucks is an open-source software application developed as an alternative to proprietary solutions that are known to frustrate users with usability issues, missing features, bugs, and unreliability. The goal of ItSucks is to deliver an intuitive, flexible, and dependable user experience. As an open-source project, ItSucks benefits from contributions by developers...

Social Feed Manager

Social Feed Manager is a comprehensive social media management platform designed to help businesses and marketers streamline their social media activities. With Social Feed Manager, you can: Connect and manage multiple social media accounts like Facebook, Twitter, LinkedIn, and Instagram from one centralized dashboard. Schedule and publish content to your connected accounts...

WebArchives

WebArchives is an open-source software application designed specifically for archiving websites. It provides an easy way to regularly capture snapshots of websites over time so their content can be preserved, analyzed, and accessed when needed. The main features include: Ability to archive one or multiple websites by URL based on a...

Web Dumper

Web Dumper is a powerful yet easy-to-use web scraping tool for extracting data from websites. With an intuitive drag-and-drop interface, Web Dumper allows anyone to build customized scrapers to scrape content, images, documents, and data from web pages without writing any code. Key features of Web Dumper include: Visual scraper builder -...

Linksoutside

Linksoutside is a cloud-based link management and sharing platform designed for teams and organizations. It provides a central place to store, organize, manage, and track all your links. Key features include: link organizing with tags, lists, and folders; team collaboration tools like shared link lists and permissions; link analytics to see which links are...

Archive-It

Archive-It is a subscription web archiving service from the Internet Archive that helps organizations harvest, build, and preserve collections of digital content. It allows libraries, scholarly institutions, and government agencies to create curated and customized captures of online content that serve their research communities. Archive-It works by crawling and archiving designated...