After several frantic weeks of tidying up my music library in accordance with the tagging precepts outlined in previous posts, I’ve finally finished. Doing J. S. Bach alone took over a week, so I am rather relieved to have done with it for a while.
I’ve decided to upload the finished metadata in the form of a CSV, because that way it’s safe from me mucking about with the music files further. Additionally, it’s quite a nice data set that can be imported into a UTF-8 Oracle database, for example. (It needs to be UTF-8 because there are a lot of foreign characters mixed in with the good, old-fashioned English stuff!)
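For anyone who wants to try the import without an Oracle instance to hand, here is a minimal sketch using Python's built-in SQLite as a stand-in. The column names (`artist`, `composer`, `album`, `seconds`), the `library` table and the two sample rows are all assumptions made up for illustration, not the layout of the actual CSV.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for the real CSV export; the column
# names (artist, composer, album, seconds) are assumptions.
SAMPLE_CSV = (
    "artist,composer,album,seconds\n"
    "Antonín Dvořák,Antonín Dvořák,Symphony No. 9,2520\n"
    "Gabriel Fauré,Gabriel Fauré,Requiem,2160\n"
)


def load_library(csv_file, conn):
    """Create a 'library' table and fill it from an open CSV file object."""
    conn.execute(
        "CREATE TABLE library "
        "(artist TEXT, composer TEXT, album TEXT, seconds INTEGER)"
    )
    reader = csv.DictReader(csv_file)
    rows = [
        (r["artist"], r["composer"], r["album"], int(r["seconds"]))
        for r in reader
    ]
    conn.executemany("INSERT INTO library VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # in-memory database, nothing written to disk
loaded = load_library(io.StringIO(SAMPLE_CSV), conn)
print(loaded, "rows loaded")
```

With a real file you would pass `open("library.csv", encoding="utf-8", newline="")` instead of the `StringIO` object (the file name is again hypothetical). Stating the encoding explicitly is what keeps the accented names intact.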
Once it’s sitting in a real database, you can do quite interesting things with the data. I can finally show you the ‘DNA fingerprint’ of my music library, for example, like this:
That’s showing the cumulative number of seconds of music attributed to each ‘artist’ (i.e., composer) in my collection. The astonishing thing (to me, at least) is just how much J. S. Bach I have: he’s the line topping 1.3 million seconds (about 15.5 days’-worth). Compare that to Benjamin Britten: he’s the short line that stands out amongst the Bs (near the Benedetto Marcello label). At a mere 408,000 seconds (4.7 days’-worth), he appears insignificant in comparison. The same goes for what I had fondly thought was an extensive Giuseppe Verdi and Richard Wagner collection. Only Wolfgang Mozart comes close, at just under the million-second mark.
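The figures behind a chart like that come down to a sum of seconds per artist, divided by 86,400 to turn seconds into days. Here is a rough sketch of that calculation, again with SQLite standing in for Oracle. The table layout and the individual rows are invented for illustration; only Britten's total of 408,000 seconds is taken from the numbers above.

```python
import sqlite3

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE library (artist TEXT, seconds INTEGER)")

# Made-up rows: the two Britten entries are chosen only so that they
# add up to the 408,000 seconds quoted in the post.
conn.executemany(
    "INSERT INTO library VALUES (?, ?)",
    [
        ("Benjamin Britten", 200_000),
        ("Benjamin Britten", 208_000),
        ("Benedetto Marcello", 3_600),
    ],
)

# Cumulative seconds per artist, longest first.
totals = conn.execute(
    "SELECT artist, SUM(seconds) AS total "
    "FROM library GROUP BY artist ORDER BY total DESC"
).fetchall()

for artist, total in totals:
    days = total / SECONDS_PER_DAY
    print(f"{artist}: {total:,} seconds = {days:.1f} days")
```

Plotting one bar per artist from `totals` would give the sort of ‘fingerprint’ described here.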
Another thing that you soon discover by putting your metadata into a real database is just how many little mistakes creep in: select from the library where artist<>composer, for example, and you’ll spot 39 occasions on which I’ve labelled music as being by, say, “Gioacchino rossini” instead of “Gioacchino Rossini” (the lower-case ‘r’ makes all the difference).
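As a sketch of that kind of check, the snippet below runs the `artist <> composer` comparison and then picks out the rows that differ only in capitalisation. It assumes a `library` table with `artist` and `composer` columns and uses three invented rows; SQLite stands in for Oracle, and both compare strings case-sensitively by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE library (artist TEXT, composer TEXT)")

# Invented rows: one contains the lower-case 'r' slip mentioned in the post.
conn.executemany(
    "INSERT INTO library VALUES (?, ?)",
    [
        ("Gioacchino Rossini", "Gioacchino Rossini"),
        ("Gioacchino rossini", "Gioacchino Rossini"),
        ("Benjamin Britten", "Benjamin Britten"),
    ],
)

# Every row where the two fields are not byte-for-byte identical.
mismatches = conn.execute(
    "SELECT artist, composer FROM library WHERE artist <> composer"
).fetchall()

# Of those, the ones that match once case is ignored. casefold() is used
# here because it also copes with accented letters.
case_only = [(a, c) for a, c in mismatches if a.casefold() == c.casefold()]

for artist, composer in case_only:
    print(f"{artist!r} should probably be {composer!r}")
```

Anything left in `mismatches` but not in `case_only` is a more substantial disagreement and needs checking by hand.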
So, just as you think you’ve finished, you realise that you haven’t, really. Plus the 10 CDs I bought last month need adding to the catalogue. A metadata technician’s lot is never really done, I guess.