TrickJarrett.com

Digital Archiving

11/7/2024 11:42 am

Even before this election, I had been thinking about digital archiving. Decades ago I had an idea for a tool that would download my web history and maintain a local archive, so I could easily find and recover things I had come across. I am very prone to the "I know I saw something about this recently..." feeling, followed by having to google and dig through my history to figure out where I had seen it.

I never followed through on that project myself, for a few reasons. The primary reason was that the search tools we had were strong enough the majority of the time.

Now I'm thinking about digital archiving again, for the reasons that have been on my mind most recently: the threat of content going away, and wanting an ongoing resource in case I don't have Internet access. I started playing around with ArchiveBox yesterday. ArchiveBox is very close to what I envisioned with Datacomb, short of the automation process. It seems very interesting and robust; I just need to figure out how I would use it, and whether it would build off of my self-hosted Wallabag (a Pocket-like reader app which grabs articles for offline reading).

I also have a homebrew Python app that I call 'Wikindle.' It downloads articles from Wikipedia and converts them into Markdown, though it doesn't download any images. The idea is to eventually get an e-reader which can store the entirety of what it downloads (which isn't the entirety of Wikipedia). As of last night's run, it was roughly 300 megs of text, though there are still a lot of articles I want to filter out.
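
For anyone curious what that kind of download-and-convert step might look like, here is a minimal sketch. It is not the actual Wikindle code. It assumes the plain-text extract returned by Wikipedia's MediaWiki API (prop=extracts with explaintext), where section headings come back as "== Heading ==" lines. The function names and the User-Agent string are made up for illustration.

```python
import json
import re
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"

# Matches wiki-style section headings such as "== History ==".
_HEADING = re.compile(r"^(={2,6})\s*(.+?)\s*\1\s*$")


def fetch_extract(title):
    """Fetch the plain-text extract of one article (hypothetical helper)."""
    params = urllib.parse.urlencode({
        "action": "query",
        "prop": "extracts",
        "explaintext": 1,
        "redirects": 1,
        "format": "json",
        "titles": title,
    })
    request = urllib.request.Request(
        f"{API_URL}?{params}",
        # Placeholder User-Agent; replace with your own contact details.
        headers={"User-Agent": "wikindle-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    # The API keys pages by page id; take the first (only) page returned.
    page = next(iter(data["query"]["pages"].values()))
    return page.get("extract", "")


def extract_to_markdown(title, extract):
    """Turn a plain-text extract into Markdown, text only, no images."""
    lines = [f"# {title}", ""]
    for raw in extract.splitlines():
        match = _HEADING.match(raw)
        if match:
            # "== X ==" becomes "## X", "=== X ===" becomes "### X", etc.
            level = len(match.group(1))
            lines.append("#" * level + " " + match.group(2))
        else:
            lines.append(raw)
    return "\n".join(lines)
```

Something like extract_to_markdown("Markdown", fetch_extract("Markdown")) would then give a single text-only Markdown document that could be written out to a file.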

Tags: internet, archiving

"archive.today: On the trail of the mysterious guerrilla archivist of the Internet"

8/5/2023 10:36 am

Archive.today is an interesting site, and one I do fall back on from time to time. Most people use it for archiving or for sharing paywalled news articles. I have occasionally shared those links here, but normally I don't, as I want to support the news sites making revenue.

This was an interesting journey down the rabbit hole of trying to figure out who runs or owns this website. There aren't any massive revelations, though a likely identity for the site's owner is uncovered.

Tags: internet, archiving

87% of classic video games are inaccessible

7/10/2023 9:52 am

Quick facts from the study (pulled from the linked page):

87% of classic games are not in release, and are considered critically endangered

Availability is low across every platform and time period tracked in the study

Libraries and archives can digitally preserve, but not digitally share video games, and can provide on-premises access only

Libraries and archives are allowed to digitally share other media types, such as books, film, and audio, and are not restricted to on-premises access

The Entertainment Software Association, the video game industry's lobbying group, has consistently fought against expanding video game preservation within libraries and archives

Tags: history, archiving, video game

News Homepages

5/27/2022 6:53 am

Ben Welsh is a journalist, but he has also set up an automated tool which captures and archives the homepages of newspapers' websites.

Tags: archiving, newspaper

2/26/2022 7:40 am

Eureka! After far more time than I expected, I have finally succeeded in getting my Twitter archive from the JSON export they send you into a simple MySQL database. Nearly fifteen years of varying levels of inanity are now fully in my control.
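
As a rough illustration of that kind of import (not the script I actually used), here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL so that it runs without a database server. It also assumes the export's tweets file is a JavaScript assignment wrapping a JSON array of {"tweet": {...}} objects with id_str, created_at and full_text fields. The function names and table layout are hypothetical.

```python
import json
import sqlite3


def parse_tweets_js(text):
    """Strip the JavaScript assignment prefix and parse the JSON array."""
    # Assumed shape: 'window.YTD.tweets.part0 = [ {...}, {...} ]'
    start = text.index("[")
    items = json.loads(text[start:])
    # Each item is assumed to wrap the tweet under a "tweet" key.
    return [item.get("tweet", item) for item in items]


def load_tweets(conn, tweets):
    """Insert parsed tweets into a simple table; returns the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "id TEXT PRIMARY KEY, created_at TEXT, full_text TEXT)"
    )
    rows = [
        (t["id_str"], t.get("created_at"), t.get("full_text", ""))
        for t in tweets
    ]
    # INSERT OR REPLACE keeps re-runs from failing on duplicate ids.
    conn.executemany("INSERT OR REPLACE INTO tweets VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

With a real export, you would read the tweets file into a string, pass it through parse_tweets_js, and hand the result to load_tweets with a connection from sqlite3.connect("tweets.db"). Swapping in MySQL would mean using a MySQL driver and its placeholder style instead.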

Tags: twitter, data, archiving