2013-02-08

Why I prefer Topic Maps to RDF

I enjoy modeling data. As students, we were taught the relational data model (as used by SQL databases) and hierarchical database structures. But the real eye-opener was when our professor started modeling a supposedly simple example: an address book. Very soon, we ran into lots of questions with no easy answers: How are persons and addresses, companies, and other persons actually related? How about several persons sharing the same address? What about the temporal dimension: would you want to keep former addresses or employers? We learned what questions to ask, that there’s no silver bullet for the perfect data model, and how to choose a good compromise.

I did a lot of SQL database modeling, which was fun, powerful, and easy to code against, but still relatively limited and complicated. (Think of multi-valued fields and the need for separate tables for m:n relations.) So when I first read the Topic Maps specification (XTM 1.0 back in the day) and the “TAO of Topic Maps” article, I was thrilled. The data structures immediately made sense to me: Every thing can have names, types, properties, and identifiers. Then there are relations between two or more things, in which each thing plays a certain role. Metadata can have its own metadata, and scopes help qualify it. That’s all.
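To make that list concrete, here is a minimal sketch of those structures in Python. It is my own illustration, not the XTM data model or any real engine's API: the class names, field names, and sample identifiers (such as "country-de" or "lang-de") are all made up, and "properties" are called occurrences, as Topic Maps does.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Name:
    value: str
    scope: frozenset = frozenset()  # themes qualifying the name, e.g. a language


@dataclass(frozen=True)
class Occurrence:
    type: str   # what kind of property this is, e.g. "iso-code"
    value: str
    scope: frozenset = frozenset()


@dataclass
class Topic:
    identifier: str
    types: list = field(default_factory=list)
    names: list = field(default_factory=list)
    occurrences: list = field(default_factory=list)

    def name_for(self, theme):
        """Prefer a name scoped to `theme`; fall back to the first unscoped name."""
        for name in self.names:
            if theme in name.scope:
                return name.value
        for name in self.names:
            if not name.scope:
                return name.value
        return None


@dataclass
class Association:
    type: str
    roles: list  # (role type, topic identifier) pairs; two or more players
    scope: frozenset = frozenset()

    def players(self, role_type):
        """Return the identifiers of all topics playing the given role."""
        return [topic_id for role, topic_id in self.roles if role == role_type]


# A topic with names in two languages and one property (hypothetical data).
germany = Topic(
    identifier="country-de",
    types=["country"],
    names=[Name("Germany"), Name("Deutschland", frozenset({"lang-de"}))],
    occurrences=[Occurrence("iso-code", "DE")],
)

# One relation with three role players, scoped as a former employment.
employment = Association(
    type="employment",
    roles=[
        ("employee", "person-alice"),
        ("employer", "company-acme"),
        ("location", "country-de"),
    ],
    scope=frozenset({"former"}),
)

print(germany.name_for("lang-de"))
print(germany.name_for("lang-fr"))
print(employment.players("employer"))
```

Everything except the identifier is optional here, which is why such a structure can start out as a plain list of countries and grow names, types, and relations later.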

It took a few years before I could sneak a tiny Topic Map engine into our DAM software (see the blog posts). It still isn’t fully standards-conformant but serves us very well: People started using it for simple lists of countries or keywords without even knowing anything about Topic Maps. (This works fine because almost every Topic Map feature is optional.) Some time later, they would notice how powerful and flexible it is: Whether it’s hierarchical thesaurus structures, names in multiple languages, subsets of lists, or custom metadata for a topic, new features are easy to think up and implement. And you don’t have to change database structures or throw away existing data.

When I learned about RDF, it totally didn’t “click” for me. Everything’s a triple? How is this better than “everything’s a row in a table”? Yes, I’m simplifying and probably not getting it, but I know that RDF doesn’t help me think. To me, it’s a low-level abstraction, too technical and too theoretical. There are too many options for implementing basic use cases, which makes interoperability harder. Topic Maps give me a way to think about data structures that makes my work easier and helps me clarify my thinking and communicate it to others.

It’s a bit sad that Topic Maps have never been widely used or even known. In terms of adoption, RDF has certainly won (even though the Semantic Web is failing so far). And I love that RDFa allows embedding data structures into HTML: Now Web service APIs can be built in HTML, browsable by humans and still machine-readable (the ability to “view source” is a pillar of the Web). So I’ll go with keeping the data in a Topic Map, but will probably make it available through RDFa. (I hope these two can be made to play nicely together…)

Update: My experimental TopicBank engine runs this blog – see strehle.de now powered by Topic Maps.

Update: See my blog post “Topic Maps (as a standard) are dead, I’m afraid”.

Update: A must-read in this context is Steve Pepper’s 2002 article “Ten Theses on Topic Maps and RDF”. Sample: “[Topic maps] association roles also make it possible to go beyond binary relationships. In RDF, assertions are always binary.”
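To illustrate the point in that quote, here is a rough sketch in plain Python (no RDF library involved). A relation with three role players fits into one association, while expressing it with binary statements requires an extra node that stands for the relationship itself. The dictionary layout, the `to_triples` helper, and the sample names are assumptions made up for this example; they are not part of either standard.

```python
# One association with three role players (Topic Maps style).
association = {
    "type": "employment",
    "roles": {"employee": "alice", "employer": "acme", "position": "editor"},
}


def to_triples(assoc, node_id):
    """Express an n-ary association as binary (subject, predicate, object)
    triples by introducing an intermediate node for the relationship."""
    triples = [(node_id, "type", assoc["type"])]
    for role, player in assoc["roles"].items():
        triples.append((node_id, role, player))
    return triples


# "_:employment1" is a made-up identifier for the intermediate node.
triples = to_triples(association, "_:employment1")
for triple in triples:
    print(triple)
```

The single three-role association becomes four binary statements, all hanging off the made-up intermediate node.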

Fri, 08 Feb 2013 08:02:42 +0000