ArchiveBox
ArchiveBox: Open Source Web Archiving Solution
An open source self-hosted web archiving solution for creating local, browsable copies of websites and collecting media assets.
What is ArchiveBox?
ArchiveBox is an open source self-hosted web archiving solution designed to allow anyone to easily collect and archive content from the internet to create their own personal web archive.
Users submit URLs, and ArchiveBox then fetches each page, extracts its assets, renders snapshots of it, and archives the resulting data. The archived content can include the original HTML, PNG/JPEG snapshots, assets such as JS/CSS/images, extracted text and hyperlinks, bookmarks, metadata, and more.
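To make the "extracted text/hyperlinks" step concrete, here is a minimal sketch using only Python's standard library. It is not ArchiveBox's own code; the class and function names (LinkAndTextExtractor, extract) are hypothetical and chosen purely for illustration.

```python
from html.parser import HTMLParser


class LinkAndTextExtractor(HTMLParser):
    """Collects hyperlinks and visible text from an HTML document.

    Illustrative only: this is an assumed approach, not ArchiveBox's extractor.
    """

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def extract(html):
    """Return the links and visible text found in an HTML string."""
    parser = LinkAndTextExtractor()
    parser.feed(html)
    parser.close()
    return {"links": parser.links, "text": " ".join(parser.text_parts)}


if __name__ == "__main__":
    sample = '<p>Hello <a href="https://example.com/a">world</a></p>'
    print(extract(sample))
```

A real archiver would also resolve relative links against the page URL and handle malformed markup more defensively; this sketch skips both to stay short.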
Once archived, all the information for a given site is organized into a folder that contains plain TXT/HTML renders of the page content, all extracted assets, a screenshot, metadata such as headers and cookies, and an index file with metadata and bookmarks. This format makes it easy to view your archive offline while retaining much of the original context.
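The one-folder-per-page layout described above can be sketched as follows. This is an assumed, simplified layout rather than ArchiveBox's actual on-disk format: the file names (page.html, page.txt, index.json), the timestamp-based folder name, and the write_snapshot helper are all hypothetical.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def write_snapshot(archive_root, url, html, text):
    """Store one archived page in its own folder, alongside an index file.

    Hypothetical layout for illustration; not ArchiveBox's real structure.
    """
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    folder = Path(archive_root) / timestamp
    folder.mkdir(parents=True, exist_ok=True)

    # Plain HTML and text renders of the page content.
    (folder / "page.html").write_text(html, encoding="utf-8")
    (folder / "page.txt").write_text(text, encoding="utf-8")

    # Index file recording what was archived and when.
    index = {
        "url": url,
        "archived_at": timestamp,
        "files": ["page.html", "page.txt"],
    }
    (folder / "index.json").write_text(json.dumps(index, indent=2), encoding="utf-8")
    return folder


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        folder = write_snapshot(root, "https://example.com", "<p>Hello</p>", "Hello")
        print(sorted(p.name for p in folder.iterdir()))
```

Keeping every snapshot in plain files with a small index is what makes this kind of archive browsable offline with nothing more than a file manager or a web browser.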
ArchiveBox focuses on being easy to self-host for individuals and aims to be a one-click install with easy backups and exports while still offering configurability and a comprehensive feature set. It was designed with long-term preservation in mind, prioritizing readability and standards compliance over storing full interactive sites.
ArchiveBox Features
- Web page archiving
- Media asset collection
- Local browsing of archived sites
- Scheduled archiving
- Deduplication
- Full-text search
- Open source
Pricing
- Open Source
The Best ArchiveBox Alternatives
Top OS & Utilities and Archiving apps similar to ArchiveBox
Here are some alternatives to ArchiveBox:
Wget
HTTrack
Archive.today
SiteSucker
Internet Archive
Web Downloader (Chrome Extension)
Wallabag
Evernote Web Clipper
Pinboard
Oldweb today
Archive.st
WebBites
TheOldNet
WebCull
Reminiscence
DoMarks
SiteCrawler
WebCrate
LinkAce
Snapchive
Stash.ai
PageArchiver
Webrecorder
Ghost Archive
SitePuller
Fossilo
ItSucks
Social Feed Manager
WebArchives
Web Dumper
Linksoutside
Archive-It