nixos-config/modules/nixos/tasks/multimedia-archive

More like offline delivery, really. Just wait for the task to complete and you have your videos, pictures, music, and whatever questionable files you want to download. It's a nice offline repository, especially since my internet connection tends to drop at random, so it's good to still have something to work with, yeah?

Project structure

The following listing block shows the files and folders that this project should have.

./modules/nixos/tasks/multimedia-archive/
├── data/
├── scripts/
├── default.nix
└── README.adoc

Some points of interest include…

Integrating with Newpipe subscriptions

In this task, I mostly download videos from YouTube. Rather than noting every preferred creator by hand, I can automate it by pulling the list of subscriptions from my Newpipe config, which I use surprisingly more often than I thought. This is done by running the ./scripts/create-jobs-from-newpipe-db.py script and passing it the exported Newpipe database (as a ZIP file).

Caution

Please don't run the task with all of your subscriptions. Select only a few categories and clean them up first.

./scripts/create-jobs-from-newpipe-db.py ~/Downloads/NewPipeData-20220714_185126.zip

You can run the script with the -h flag for more information. There are nifty things you can do with it, such as the following pipeline, which lets you interactively select which categories to export.

./scripts/create-jobs-from-newpipe-db.py ~/Downloads/NewPipeData-20220714_185126.zip --list-categories \
  | fzf --multi --prompt "Choose which categories to export " \
  | ./scripts/create-jobs-from-newpipe-db.py ~/Downloads/NewPipeData-20220714_185126.zip -o ./newpipe-db.json

Remember: the larger the list, the higher the chance of getting throttled. It is therefore heavily encouraged that you clean up your list (and/or get good at organizing your categories) before activating the updated version.
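
For a rough idea of what pulling subscriptions out of the export involves, here is a minimal sketch. It is not the actual script. It assumes the ZIP contains a SQLite file named newpipe.db with subscriptions, feed_group, and feed_group_subscription_join tables; those names, along with the list_subscriptions function itself, are assumptions made for illustration, so check them against your own export.

```python
import sqlite3
import tempfile
import zipfile


def list_subscriptions(export_zip, categories=None):
    """Return (category, channel name, channel URL) rows from a Newpipe export.

    ASSUMPTION: the ZIP holds a SQLite file named `newpipe.db` with the
    tables `subscriptions`, `feed_group`, and `feed_group_subscription_join`.
    """
    query = """
        SELECT feed_group.name, subscriptions.name, subscriptions.url
        FROM subscriptions
        JOIN feed_group_subscription_join AS j
            ON j.subscription_id = subscriptions.uid
        JOIN feed_group ON feed_group.uid = j.group_id
        ORDER BY feed_group.name, subscriptions.name
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Unpack only the database file into a throwaway directory.
        with zipfile.ZipFile(export_zip) as archive:
            db_path = archive.extract("newpipe.db", path=tmp)
        connection = sqlite3.connect(db_path)
        try:
            rows = connection.execute(query).fetchall()
        finally:
            connection.close()
    # Optionally keep only the categories the caller asked for.
    if categories is not None:
        wanted = set(categories)
        rows = [row for row in rows if row[0] in wanted]
    return rows
```

Passing a short list of categories is the code-level equivalent of the advice above: fewer channels in the jobset means fewer requests and less chance of getting throttled.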

Exporting a jobset with OPML

There is also an easy way to export a jobset from OPML with ./scripts/create-jobs-from-rss-opml.py. The script should have a similar interface to the one featured in Integrating with Newpipe subscriptions.

Here's a basic example of using the script.

./scripts/create-jobs-from-rss-opml.py ~/Downloads/MyThunderbirdFeeds-Blogs\ \&\ News\ Feeds.opml

Take note that the script makes some assumptions about the exported subscription list.

  • The folder structure is assumed from the outline.

  • Any <outline> element with a title or text attribute is considered valid and part of the hierarchy. Otherwise, it is assumed they'll be in a fallback category within the outline.

  • The category list is also assumed from the outline, only with the direct children of valid <outline> elements. [1]

  • Categories are also extracted from the category attribute of each RSS node. Similarly, it only extracts the head of each category hierarchy (e.g., Computers for /Computers/Science, World for /World/Countries/Philippines).
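
To make those assumptions a bit more concrete, here is a small sketch of one way to read them. It is not the actual script: category_heads, collect_feeds, and the FALLBACK name are all made up for illustration, and treating the nearest labelled parent outline as the folder is only my reading of the rules above.

```python
import xml.etree.ElementTree as ET

FALLBACK = "Uncategorized"  # hypothetical name for the fallback category


def category_heads(value):
    """Return the first segment of each comma-separated category path."""
    heads = []
    for path in value.split(","):
        segments = [part for part in path.strip().split("/") if part]
        if segments and segments[0] not in heads:
            heads.append(segments[0])
    return heads


def collect_feeds(opml_text):
    """Map each feed URL to the set of categories it falls under."""
    body = ET.fromstring(opml_text).find("body")
    feeds = {}
    if body is None:
        return feeds

    def walk(node, folder):
        for outline in node.findall("outline"):
            label = outline.get("title") or outline.get("text")
            url = outline.get("xmlUrl")
            if url:
                # A feed node: its folder plus the heads of its own categories.
                categories = {folder or FALLBACK}
                categories.update(category_heads(outline.get("category", "")))
                feeds.setdefault(url, set()).update(categories)
            else:
                # A folder node: a labelled outline names the folder for its
                # children; an unlabelled one inherits the parent's folder.
                walk(outline, label or folder)

    walk(body, None)
    return feeds
```

With this reading, a category attribute of /Computers/Science,/World/Countries/Philippines contributes only Computers and World, and a feed sitting directly under the body with no folder lands in the fallback category.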

Similar to the Newpipe database script, you can do some nifty things with the script. Here's the same example from the previous featured script.

./scripts/create-jobs-from-rss-opml.py ~/Downloads/MyThunderbirdFeeds-Blogs\ \&\ News\ Feeds.opml -l \
    | fzf --multi --prompt "Choose which categories to export " \
    | ./scripts/create-jobs-from-rss-opml.py ~/Downloads/MyThunderbirdFeeds-Blogs\ \&\ News\ Feeds.opml
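
The pipeline suggests that the second invocation picks up the chosen category names from standard input, one per line. Here is a minimal sketch of that filtering step, assuming exactly that behavior. The read_selection and filter_jobs helpers and the shape of the jobs mapping are hypothetical and not taken from the actual script.

```python
def read_selection(stream):
    """Return the non-empty, stripped lines of `stream` as a set of names."""
    return {line.strip() for line in stream if line.strip()}


def filter_jobs(jobs, selected):
    """Keep only the categories in `selected`; an empty selection keeps all."""
    if not selected:
        return dict(jobs)
    return {name: feeds for name, feeds in jobs.items() if name in selected}
```

In a script, read_selection(sys.stdin) would consume whatever fzf printed, and filter_jobs would trim the jobset down to those categories before it gets written out.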

1. The category attribute from the RSS nodes is barely taken care of by most of the applications I use. WHY!?!