I’ve found that all the web archiving software I’ve encountered is either manual (you have to archive everything individually in a separate application) or crawler-based (which can put a lot of extra load on smaller web servers, and could even get your IP blocked).
Are there any solutions that simply automatically archive web pages as you load them in your browser? If not, why aren’t there?
I could also see something like that being useful as a self-hosted web indexer: if you ever go “I think I’ve seen this name before”, you could click on it, and your computer would say something like “this name appeared in a news headline you scrolled past two weeks ago”.
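To make the idea concrete, here’s a minimal sketch of what the storage side of such a tool might look like. Everything here is hypothetical (the `save`/`seen` functions and the example URLs are made up for illustration); in a real setup, `save()` would be called by a browser extension or intercepting proxy every time a page finishes loading.

```python
import sqlite3
import time

# Hypothetical sketch: a tiny "archive as you browse" store.
# A browser extension or proxy would call save() on every page load;
# seen() answers "have I seen this name before, and where?"

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE pages (url TEXT, fetched_at REAL, body TEXT)")

def save(url: str, body: str) -> None:
    """Record a page the moment the browser receives it."""
    db.execute("INSERT INTO pages VALUES (?, ?, ?)", (url, time.time(), body))

def seen(term: str) -> list[str]:
    """Return URLs whose saved content mentions the term, newest first."""
    rows = db.execute(
        "SELECT url FROM pages WHERE body LIKE ? ORDER BY fetched_at DESC",
        (f"%{term}%",),
    )
    return [r[0] for r in rows]

# Simulate two page loads, then the "I've seen this name before" lookup.
save("https://news.example/headlines", "Jane Doe wins local election")
save("https://blog.example/post", "Notes on self-hosted archiving")
print(seen("Jane Doe"))
```

A real version would want full-text indexing (e.g. SQLite’s FTS5) and some deduplication rather than a naive `LIKE` scan, but the core loop really is that small: you already have the bytes, so write them down.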


Huh. This seems like one of those “this must exist” situations, but I can’t think of anything that does this, and a brief search suggests there may not be. The closest I could find was the Internet Archive’s Archive-It, though it’s not an exact match. Otherwise, Archive Webpage, a pricey paid-for option (which seems like a terrible idea), appears to be the closest. OSS/self-hosted options like ArchiveBox and Linkwarden don’t really do this (though you can save/send a current tab to them), and apart from that… I don’t really see anything.
Yeah, this is exactly what I was thinking: “surely this must already be a thing”?
But yeah, I can’t think of anything either. I mean, it’s like, you’re already downloading the data. Just write it down somewhere else.