October 20, 2004

Can't find anything

I'm one of those people who installed Google Desktop and loved it, but...

I make images, and only part of the stuff I need to store and retrieve is text. Google Images is fun to play with, but completely useless for retrieving archived images on your own machine, and even its suitability for use on the web is questionable. I wrote about alternatives before.

For a while now, I've been trying to work like a librarian. After all, librarians are professional store-and-retrieve people. I've organized part of my hard drive according to the Dewey Decimal Classification.

Does it work? Errrrmmm... no. One of the problems is that many of the classes are much too broad. All mammals, for example, need to fit into 599, which is at the same level as onychophorans (572). Now I know that onychophorans have their own phylum, which mammals don't, but that would put humans in some even smaller decimal, because we'd have to organize it more like this:

590 Animalia
591 Metazoa
591.1 Onychophora
591.2 Bilateria
591.21 Deuterostomia
591.211 Chordata
591.2111 Craniata
591.21111 Vertebrata
591.211111 Mammalia
591.2111111 Primates
591.21111111 Haplorhini
591.211111111 Hominoidea
591.2111111111 Hominidae
591.21111111111 Homo

To work well, each number should take up approximately the same amount of data, and there is just no way that I can see the Onychophora taking up 10,000,000,000 times as much space as humans. Not on my hard drive anyway.
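For the record, here is that arithmetic as a small Python sketch. It rests on one assumption: every extra decimal digit splits its parent class into ten equal parts. The function name is made up for illustration.

```python
from fractions import Fraction


def share_of_integer_class(number: str) -> Fraction:
    """Fraction of the integer class (e.g. 591) that a decimal number covers.

    Assumption: each extra decimal digit divides its parent into ten equal parts.
    """
    _, _, decimals = number.partition(".")
    return Fraction(1, 10 ** len(decimals))


onychophora = share_of_integer_class("591.1")          # 1 decimal digit
homo = share_of_integer_class("591.21111111111")       # 11 decimal digits

# How many times more room the Onychophora get than Homo.
ratio = onychophora / homo
print(ratio)  # 10000000000
```

Ten levels of difference in depth means a factor of ten to the tenth power, which is where the ten billion comes from.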

The Dewey Decimal Classification, unfortunately, is something you need to buy or get a subscription for, and it's $225/year. The abridged version, which would suit my library of books just fine (I have fewer than 20,000), is $65/year, but wouldn't work for my files (I have hundreds of thousands). The four-volume DDC 22 is $375. Ouch! The only classes you tend to find for free on the web are in the integer range. How am I supposed to know that bears belong in 599.78 and giraffes are 599.638 if these numbers are available to professional librarians only? Do I need to run to the library every time I need to check on something? I mean, it's not as if I couldn't look it up anyway, but I keep thinking that there has got to be an easier way.

The new system has to break up everything into nicely manageable, equally sized, hierarchically structured chunks. Or maybe not. What would Google Desktop do if I created index pages in all the directories with properly annotated links to the to-be-indexed files? And could we create those index files on the fly? Of course we could. More about that next time.
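Purely to illustrate the idea, here is a minimal Python sketch of generating such index pages. Everything in it is an assumption: the `index.html` file name, the function names, and the shape of the `annotations` dictionary (directory path relative to the root, then file name, then a description). Whether Google Desktop would actually pick up these pages is exactly the open question above.

```python
import html
import os
from pathlib import Path
from urllib.parse import quote

INDEX_NAME = "index.html"  # hypothetical name for the generated page


def build_index(directory: Path, notes: dict[str, str]) -> str:
    """Return an HTML page linking to every file in `directory`, with its note."""
    items = []
    for entry in sorted(directory.iterdir()):
        if entry.is_file() and entry.name != INDEX_NAME:
            note = html.escape(notes.get(entry.name, ""))
            link = f'<a href="{quote(entry.name)}">{html.escape(entry.name)}</a>'
            items.append(f"<li>{link} {note}</li>")
    title = html.escape(directory.name)
    return (
        f"<html><head><title>{title}</title></head><body>\n"
        f"<h1>{title}</h1>\n<ul>\n" + "\n".join(items) + "\n</ul>\n</body></html>\n"
    )


def write_index_pages(root: Path, annotations: dict[str, dict[str, str]]) -> int:
    """Write one index page per directory under `root`; return how many were written."""
    written = 0
    for dirpath, _dirnames, _filenames in os.walk(root):
        directory = Path(dirpath)
        # Annotations are keyed by the directory's path relative to the root.
        key = directory.relative_to(root).as_posix()
        page = build_index(directory, annotations.get(key, {}))
        (directory / INDEX_NAME).write_text(page, encoding="utf-8")
        written += 1
    return written
```

Calling `write_index_pages(Path("archive"), {"599": {"bear.jpg": "brown bear, side view"}})` on a hypothetical `archive` folder would leave an `index.html` in every directory, with the description sitting next to the link to `bear.jpg`. Rerunning it simply overwrites the old pages, which is about as "on the fly" as a sketch like this gets.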

Posted by mduvekot at October 20, 2004 10:49 PM