Sunday, March 9, 2008

The catalog isn't broken. Really.

The catalog isn't broken. Really. It finds just what you type in the search box.

And that's the problem.

Try searching for a book on typography. According to LCSH, the term doesn't exist, even though designers and publishers use it every day. Who (except librarians) would search for Type and type-founding or Graphic design (Typography)? And why would Hersey's Hiroshima show up in the results? I'm confused, and I'm a librarian.

The catalog works. What doesn't work is the search.

The thesauri are so outdated that they might as well be chiseled in stone. Who searches for a term that's 30 years outdated?

The thesauri assume too much. You go to the catalog to find out about something; you shouldn't have to know about it before you search. Why even go to the catalog when you have to Google it first?

WorldCat on Google makes things easier to find, but not more accessible. You still have to know the secret code. Sure, breaking the facets makes it better, but still not good, or especially usable. Search for Princeton in the default search box, and the second result is The essential Jung.

Huh? I'm still confused.

So why don't we change the terms? It ain't easy. There's no simple way to update the terms, and when they do get updated, they're usually outdated again. It's not LC's fault, it's not OCLC's fault, it's the whole system. The whole 20th century we-use-one-letter-codes-because-it-saves-a-byte system.

So what's the answer? Tagging, which is fun and helps you find your stuff, but doesn't help anyone else find your stuff? Another thesaurus? Hierarchy? Or dumping it all out and unstructuring it, mixing in the tag clouds, and sorting it with a PageRank-type ranking?

I don't know, but if you do, email me and we'll look for capital, because the demand is out there, and if libraries don't fix it soon, people will get used to going elsewhere.

They could come back. Texting is just the old IRC abbreviations resurrected, so it could happen. And they could fall in love with telnet and Wordstar, too. But I'm not betting on it.

What's my wishlist? The new system should be flexible, updatable in near-real-time (or at least within a year), allow uncontrolled terms as well as controlled vocabularies, and allow relationships between terms (similar to, related term).

Like RDF.

Like the semantic web. Just a corner of it. Just for now.
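
Here's a rough sketch of what that corner might look like, in plain Python rather than real RDF tooling. Everything in it is made up for illustration: the book identifiers, the predicate names (controlledSubject, tag, relatedTerm, similarTo), and the link between "typography" and the LCSH heading. It only shows the idea of storing controlled headings, uncontrolled tags, and term relationships as the same kind of statement, so a search for the word people actually use can reach the heading the catalog uses.

```python
# Each statement is a (subject, predicate, object) triple, in the spirit of RDF.
# All identifiers and predicate names here are hypothetical.
TRIPLES = {
    ("book:1", "controlledSubject", "Type and type-founding"),  # controlled heading
    ("book:2", "tag", "typography"),                            # uncontrolled user tag
    ("typography", "relatedTerm", "Type and type-founding"),    # relationship between terms
    ("typography", "similarTo", "lettering"),
}

LINKS = {"relatedTerm", "similarTo"}       # predicates that connect one term to another
SUBJECTS = {"controlledSubject", "tag"}    # predicates that attach a term to a book


def expand(term, triples):
    """Return the term plus every term one relationship away, in either direction."""
    term = term.lower()
    terms = {term}
    for s, p, o in triples:
        if p in LINKS:
            if s.lower() == term:
                terms.add(o.lower())
            elif o.lower() == term:
                terms.add(s.lower())
    return terms


def search(term, triples):
    """Find books whose heading or tag matches the term or one of its linked terms."""
    wanted = expand(term, triples)
    return sorted({s for s, p, o in triples if p in SUBJECTS and o.lower() in wanted})


print(search("typography", TRIPLES))
```

Searching for "typography" here reaches both the book with the controlled heading and the book someone tagged, and updating the vocabulary is just adding another statement to the set, with no thesaurus revision cycle in between.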

