Meta Data for Clean Media Libraries
This weekend I’ve decided to do some house cleaning and reorganizing.
No, this doesn’t involve vacuuming, but it does involve a process known as “scraping”.
This weekend, I’m cleaning up all of my computer’s media files. Nearly one hundred gigabytes of music (some of which I’m looking to dump) and hundreds of gigabytes of video are sitting on my hard drive now, loosely named and somewhat organized. By the end of the weekend, I will have removed all duplicates, I will have every media file organized in the same file structure, and I will have every file named with proper conventions.
That’s Step 1.
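The duplicate-removal part of Step 1 can be roughed out in a few lines of Python. This is only a minimal sketch, not the tool used here: it assumes "duplicate" means byte-for-byte identical, and the function names `file_digest` and `find_duplicates` are made up for illustration. The same song ripped twice at different bitrates, or tagged differently, will not be caught this way.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose contents are identical."""
    # Group by size first, so only files that could match get hashed.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            by_size[os.path.getsize(path)].append(path)

    # Within each same-size group, group again by content hash.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[file_digest(path)].append(path)

    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Calling `find_duplicates` on a music folder returns groups of identical files. Deciding which copy in each group to keep is still a manual step.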
Step 2 is “scraping”: pulling down additional meta data associated with each file, such as fan photos of recording artists and TV shows, proper names, albums, genres, and so on. These are really two integrated steps. Proper file naming and organizing makes pulling down data from the internet much easier, and pulling down proper data allows for easier file naming and organizing.
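To show how good meta data feeds back into file naming, here is a small Python sketch that turns a file's tags into a consistent path. The `Artist/Album/NN - Title.ext` layout, the tag keys, and the helper names `clean` and `target_path` are all assumptions picked for the example, not a standard. It also assumes the track number is a plain integer.

```python
import re
from pathlib import PurePosixPath


def clean(text):
    """Replace characters that are unsafe in file names and trim whitespace."""
    return re.sub(r'[\\/:*?"<>|]', "_", text).strip()


def target_path(tags, extension):
    """Build 'Artist/Album/NN - Title.ext' from a dict of tags."""
    # Fall back to placeholder names when a tag is missing or empty.
    artist = clean(tags.get("artist") or "Unknown Artist")
    album = clean(tags.get("album") or "Unknown Album")
    title = clean(tags.get("title") or "Unknown Title")
    track = int(tags.get("track") or 0)
    return PurePosixPath(artist) / album / f"{track:02d} - {title}.{extension}"


example = target_path(
    {"artist": "AC/DC", "album": "Back in Black", "title": "Hells Bells", "track": 1},
    "mp3",
)
print(example)  # AC_DC/Back in Black/01 - Hells Bells.mp3
```

The better the tags, the less often the "Unknown" fallbacks kick in, which is why scraping and renaming work best done together.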
So far, the best tool I’ve found for the music side of this process is MusicBrainz’s Picard. MusicBrainz is an open content music database that contains detailed meta data for just about any music you can imagine. It works similarly to programs like Shazam for the iPhone, creating a unique identifier (an acoustic fingerprint) from the audio in the music file that it can match against a user-generated database. Picard taps into MusicBrainz data and looks for matches based on both the existing meta data in your files and these fingerprints. The end result is proper meta data assigned to all of your music.
The one disadvantage of Picard is that it doesn’t also scrape information from sites like Last.fm or AllMusic, which have biographies, more detailed descriptions, and pictures. I’m working on that step next.
There seem to be far more choices for file renaming and meta data scraping on the video front. I’ve tried three solutions so far, and none of them is seamless. I’ll report back when I have found the best tools.
This is all in preparation for an eventual setup with an HTPC or a front-end box like Boxee that taps into network shares (or even better, a NAS) to play my whole collection in high quality on my HDTV and surround sound.