2012-01-09

Permanent link How Trello is different

Joel Spolsky – How Trello is different:

“The great horizontal killer applications are actually just fancy data structures.

Spreadsheets are not just tools for doing "what-if" analysis. They provide a specific data structure: a table. Most Excel users never enter a formula. […]

Word processors are not just tools for writing books, reports, and letters. They provide a specific data structure: lines of text which automatically wrap and split into pages.

PowerPoint is not just a tool for making boring meetings. It provides a specific data structure: an array of full-screen images.”

Filed under: Mon, 09 Jan 2012 12:30:38 +0100
2011-12-20

Permanent link Taxonomies don’t matter anymore

Stijn Debrouwere – Taxonomies don’t matter anymore:

“Automated recommendation engines are mainly useful as cute but non-essential pageview drivers and if your journalists are too lazy to add links.

[…] We don't come to topic pages for automatically aggregated sort-of-relevant content with no editorial guidance as to what's important and what's not. Sometimes, you just have to do things by hand, in prose.

[…] There is really no way to sidestep curation unless we don't care that we're annoying our users.

[…] Stepping away from mediocrity, for me, means putting power back in the hands of the newsroom. To make that happen, I'll be building prosthetics, not machines.”

Filed under: Tue, 20 Dec 2011 22:26:35 +0100
2011-11-23

Permanent link W3C Ontology for Media Resources 1.0

W3C Candidate Recommendation Ontology for Media Resources 1.0 (July 2011):

“The intent of this vocabulary is to bridge the different descriptions of media resources, and provide a core set of descriptive properties. This document defines a core set of metadata properties for media resources, along with their mappings to elements from a set of existing metadata formats.”

Mapped metadata standards: CableLabs 1.1, DIG35, Dublin Core, EBUCore, EXIF 2.2, ID3, IPTC NewsML-G2, LOM 2.1, Media RSS, MPEG-7, OGG, QuickTime, DMS-1, TTML, TV-Anytime, TXFeed, XMP, YouTube

Example XML for most standards can be viewed in the testsuite.

(Via Johannes Schmidt.)

Filed under: Wed, 23 Nov 2011 08:47:39 +0100
2011-09-20

Permanent link Triple bypass – What does the death of the semantic web mean for publishers?

Richard Padley of Semantico – Triple bypass – What does the death of the semantic web mean for publishers?:

"For years semantic web purists have been preaching that the future is all about RDF and triples. Yet, in the 12 years that theorists have been working on the semantic web, we've yet to see many convincing practical uses for the technology."

Filed under: Tue, 20 Sep 2011 12:36:42 +0200
2011-04-16

Permanent link Context is not a bolt-on

Stijn Debrouwere – Context is not a bolt-on:

"Topic pages, story trackers and Q&As fail because they’re never an integral part of a news website. They’re Google landing pages, designed to poach traffic from Wikipedia.

[…] What no newspaper, online or offline, seems to have perfected is how this broad, topical information stream should mesh with the daily news that’s presented on our front page.

If somebody clicks on a story and is dazzled by an array of unfamiliar names and places and events, how do we turn that experience around?"

Filed under: Sat, 16 Apr 2011 00:00:05 +0200
2011-04-15

Permanent link Interaction Models for Faceted Search

Tony Russell-Rose – Interaction Models for Faceted Search:

"Note that the facet values examined in the two-stage examples above are disjunctive (multi-select OR), e.g. the selection of a value for a facet such as Make & Model does not preclude the selection of another value from the same facet. In this case, selecting multiple independent facet values has the effect of widening the search. However, if the facet values are conjunctive (multi-select AND), then the choice of which interaction model to apply is quite different. […] In this case, the only meaningful interaction model is the instant update, as this is the only approach which will ensure that facet values and the current result set stay in sync."

(Via Patrick Durusau.)
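
The OR/AND distinction in the quote can be made concrete with a minimal sketch. The records, field names, and helper functions below are hypothetical and only illustrate the two selection semantics; they are not taken from the article.

```python
# Hypothetical records; field names and values are illustrative only.
CARS = [
    {"id": 1, "make": "Ford", "features": {"sunroof", "gps"}},
    {"id": 2, "make": "Audi", "features": {"gps"}},
    {"id": 3, "make": "Ford", "features": {"sunroof"}},
    {"id": 4, "make": "BMW", "features": {"sunroof", "gps"}},
]


def filter_disjunctive(records, facet, selected):
    """Multi-select OR: keep a record whose single value is any selected value."""
    if not selected:
        return list(records)
    return [r for r in records if r[facet] in selected]


def filter_conjunctive(records, facet, selected):
    """Multi-select AND: keep a record only if it carries every selected value."""
    return [r for r in records if set(selected) <= r[facet]]


def ids(records):
    return [r["id"] for r in records]


# Adding a second OR value widens the result set.
or_one = ids(filter_disjunctive(CARS, "make", {"Ford"}))
or_two = ids(filter_disjunctive(CARS, "make", {"Ford", "Audi"}))

# Adding a second AND value narrows it.
and_one = ids(filter_conjunctive(CARS, "features", {"gps"}))
and_two = ids(filter_conjunctive(CARS, "features", {"gps", "sunroof"}))
```

Because each conjunctive selection can shrink the result set, the remaining facet values change after every click, which is presumably why the quote singles out instant update for that case.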

Filed under: Fri, 15 Apr 2011 23:41:07 +0200
2011-04-08

Permanent link Faceted Navigation

A List Apart – Faceted Navigation:

"The distinction between faceted navigation and parametric search is relevant. In parametric search applications, users specify their search parameters up front using a variety of controls such as checkboxes, pull-downs, and sliders to construct what effectively is an advanced Boolean query. Unfortunately, it’s hard for users to set several parameters at once, especially since many combinations will produce zero results. […] It’s a solution that’s hard on people but soft on hardware. In other words, it’s an unfortunate compromise that sacrifices immediate response to reduce the server load."

Filed under: Fri, 08 Apr 2011 14:11:46 +0200

Permanent link A Blogging Lesson For Topic Maps?

Patrick Durusau – A Blogging Lesson For Topic Maps?:

"An emphasis on giving users an immediate sense of accomplishment, with results they can use immediately could lead to a different adoption curve for topic maps."

Filed under: Fri, 08 Apr 2011 13:49:51 +0200
2011-03-28

Permanent link Pub/sub networking for enterprise awareness

Jon Udell – Pub/sub networking for enterprise awareness:

"In theory everyone talks to everyone and everything gets taken care of. In practice, as we know, not so much. Interpersonal messaging alone can’t create a resilient and discoverable web of connections. That’s why interpersonal messaging must be embedded in a pub/sub network where messages flow person-to-person, person-to-topic, topic-to-person, and topic-to-topic."

Filed under: Mon, 28 Mar 2011 12:21:40 +0200
2011-03-10

Permanent link Fixing Metadata (or Let’s Do it Right the First Time)

Kim Schroeder – Fixing Metadata (or Let’s Do it Right the First Time):

"The majority of people do not understand the work that goes into providing quality. In our current era of fast and cheap, people have lost the quality aspect almost completely. When they cannot successfully execute an accurate search in their database, then they call us to fix it. I am absolutely happy to do so, but make no mistake, I wish for that collection to have done it right the first time, rather than to have called us after hundreds of hours of wasted work."

Filed under: Thu, 10 Mar 2011 21:31:45 +0100
2011-03-04

Permanent link Tags don't cut it

Stijn Debrouwere – Tags don't cut it:

"We need to re-engineer tags so that they’ll allow us to represent the rich relationships between our content and the things that content talks about. If we do that, newspapers can infuse the news with necessary context that allows readers to see the broader picture. Quite literally, too: relationship-infused content can easily be enriched with maps and timelines, which goes way beyond what tags have to offer.

Tags have a deceiving simplicity that hides their complexity as a taxonomic concept. Relationships are closer to the way journalists think about their writing. Relationships are a direct answer to the question “what is this story about?” Because they’re more intuitive than tags, they’re actually harder to mess up.

If we re-imagine tags as rich connections that relate content to the persons, organizations, locations, events and themes they talk about, hopefully magic will happen."

Filed under: Fri, 04 Mar 2011 09:40:46 +0100
2011-02-20

Permanent link Integrating taxonomies with search

Richard Padley – Integrating taxonomies with search:

"Alongside a set of search results a search engine can provide a series of drill down categories which allow the user to refine their query and cut down the result set until they find the information they need. If properly structured faceted taxonomies have been used to tag the search documents then the terms from these taxonomies can be used to provide the drill-down categories for the search engine."

(Via all things cataloged.)
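
A minimal sketch of the drill-down idea, assuming documents have already been tagged with terms from two hypothetical facets (`subject` and `doctype`). The data and function names are illustrative, not part of any particular search engine.

```python
from collections import Counter

# Hypothetical search results, each tagged with taxonomy terms per facet.
RESULTS = [
    {"title": "Doc A", "subject": ["chemistry"], "doctype": ["journal"]},
    {"title": "Doc B", "subject": ["chemistry", "physics"], "doctype": ["book"]},
    {"title": "Doc C", "subject": ["physics"], "doctype": ["journal"]},
]


def drill_down_counts(results, facet):
    """Count how many results carry each term of the given facet."""
    return dict(Counter(term for doc in results for term in doc.get(facet, [])))


def refine(results, facet, term):
    """Cut the result set down to documents tagged with the chosen term."""
    return [doc for doc in results if term in doc.get(facet, [])]


subject_counts = drill_down_counts(RESULTS, "subject")
physics_only = refine(RESULTS, "subject", "physics")
doctype_counts_after = drill_down_counts(physics_only, "doctype")
```

The counts are recomputed on the refined set, so the categories offered next to the results always describe what is still reachable.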

Filed under: Sun, 20 Feb 2011 21:48:56 +0100
2011-02-16

Permanent link Serendipity and large video collections

Fran – Serendipity and large video collections:

"Serendipity is rarely of use to the asset manager, who wants to find exactly what they expect to find, but is a delight for the consumer or leisure searcher. People sometimes cite serendipity as being a reason to abandon classification, but in my experience classification often enhances serendipity and can be lost in simple online search systems.

For example, when browsing an alphabetically ordered collection in print, such as an encyclopedia or dictionary, you just can’t help noticing the entries that sit next to the one you were looking for."

(Via Digital Asset Management.)

Filed under: Wed, 16 Feb 2011 10:08:16 +0100
2010-12-16

Permanent link The Content Hub

Stijn Debrouwere – Looking for a co-conspirator:

"Drupal and WordPress are perfectly fine for publishing to the web. What we want to build is a content hub for managing the gloriously messy editorial process. A content hub that loves structured data and semantic annotations. A launch pad for pushing content to any platform you can think of."

Filed under: Thu, 16 Dec 2010 10:19:59 +0100
2010-11-03

Permanent link The Many Forms of a Single Fact

William Kent back in 1988 – The Many Forms of a Single Fact:

"There is an underlying fallacy, namely the assumption that a simple binary fact (relationship or attribute) always maps simply into a pair of fields. While that is the foundation of current data design methodologies, there exist a troublesome number of exceptions in practice."

(Via Johannes Schmidt.)

Filed under: Wed, 03 Nov 2010 10:17:47 +0100
2010-09-22

Permanent link Data, not records

all things cataloged – Data, not records:

"Cataloging huge amounts of 19th century material, I often wonder: what if users had a link to the year of publication (e.g. from Wikipedia) that could provide some background information about what happened that year and could assist them in understanding the historical situation and the context a book fits into? Same for place of publication – which state was Sarajevo part of in 1894?"

Filed under: Wed, 22 Sep 2010 09:59:33 +0200
2010-09-01

Permanent link We’re in the information business

Stijn Debrouwere – We’re in the information business:

"The goal is to make our content management system like a miniature world in a snowglobe. Not just a system that publishes text, but a system that talks like we do: it knows that an interview implies one or more interviewees.

[…] An issue is more than just a number: it has a date of publication, a cover image, a chief editor, it might revolve around a special theme, it has a circulation, it has one or more cover stories. Don’t think too soon that something is just a number or merely a line of text.

[…] We need domain-specific ways of indicating, err, marking up a text. We need to start creating our own little Markdown-like languages for journalism.

[…] A well-architected news website leads to content that will keep on providing value, rather than leaving stories to wither away when their immediate news value has faded. Structured content is the stuff that makes a website malleable."

(Via Jayson Lorenzen.)
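
One way to read the "issue is more than just a number" point is as a typed content model. The sketch below is an assumption about what such a model might look like; every class and field name is hypothetical and chosen only to mirror the properties listed in the quote.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Person:
    name: str


@dataclass
class Story:
    headline: str


@dataclass
class Interview(Story):
    interviewees: list[Person] = field(default_factory=list)

    def __post_init__(self):
        # The model itself knows that an interview implies interviewees.
        if not self.interviewees:
            raise ValueError("an interview implies one or more interviewees")


@dataclass
class Issue:
    number: int
    published: date
    chief_editor: Person
    cover_image: str
    circulation: int
    theme: Optional[str] = None
    cover_stories: list[Story] = field(default_factory=list)


editor = Person("Hypothetical Editor")
talk = Interview("A conversation", interviewees=[Person("Hypothetical Guest")])
issue = Issue(
    number=42,
    published=date(2010, 9, 1),
    chief_editor=editor,
    cover_image="cover-42.jpg",
    circulation=10_000,
    cover_stories=[talk],
)

try:
    Interview("Nobody was asked anything")
    rejected = False
except ValueError:
    rejected = True
```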

Filed under: Wed, 01 Sep 2010 13:05:48 +0200
2010-08-06

Permanent link Pretty Good Semantics

Patrick Durusau with Sam Hunting: "Our goal was to create something as simple, if not simpler than HTML 3.2 to allow users to create and annotate identifiers for entities. The result was Pretty Good Semantics."

Filed under: Fri, 06 Aug 2010 09:34:37 +0200
2009-05-13

Permanent link Google Announces Support for Microformats and RDFa

Timothy M. O'Brien at O'Reilly Radar – Google Announces Support for Microformats and RDFa:

"On Tuesday, Google introduced a feature called Rich Snippets which provides users with a convenient summary of a search result at a glance. They have been experimenting with microformats and RDFa, and are officially introducing the feature and allowing more sites to participate. While the Google announcement makes it clear that this technology is being phased in over time making no guarantee that your site's RDFa or microformats will be parsed, Google has given us a glimpse of the future of indexing."

Filed under: Wed, 13 May 2009 09:09:59 +0200
2009-02-25

Permanent link How Entity Extraction is Fueling the Semantic Web Fire

Dan McCreary at O'Reilly Broadcast – How Entity Extraction is Fueling the Semantic Web Fire:

"I have been very impressed at the scope and depth of some of the new OpenSource entity extraction tools as well as the robustness of commercial products. I thought I would discuss this since these technologies could start to move the semantic web (Web 3.0) up the hockey stick growth curve."

Filed under: Wed, 25 Feb 2009 12:10:09 +0100
2008-08-14

Permanent link Web, meet Semantic Web

Simon St. Laurent – Web, meet Semantic Web:

"The key point of [Sam] Hunting's experience, which emphasized letting users do what they wanted to do, valid or not valid, was that "People really do care about tagging - they really do tag - when they get an immediate positive result." The key phrase there is "immediate positive result." Hunting showed examples of the kinds of features that users could add easily if they were willing to take the time to add some Topic Maps markup to their documents."

Filed under: Thu, 14 Aug 2008 14:05:22 +0200
2008-06-26

Permanent link New York Times API Coming

Josh Catone at ReadWriteWeb – New York Times API Coming:

"An API is a logical next step for newspapers. It will give developers access to their vast amounts of well-researched data, and allows the paper's brand to be spread easily across the web. More access to Times content and the ability to mash it up in new and interesting ways can only be a win for both readers and the paper.

[…] Says Aron Pilhofer, the paper's interactive news editor, the goal of an API is to "make the NYT programmable. Everything we produce should be organized data.""

Filed under: Thu, 26 Jun 2008 12:13:50 +0200
2008-06-03

Permanent link Semantic Search: The Myth and Reality

Alex Iskold – Semantic Search: The Myth and Reality:

"Probably the most striking revelation about the semantic search space is User Interface. First, to go on the tangent, Powerset got it right by realizing that semantics needs to be surfaced in the UI. After a user searches Powerset, a contextual gadget, aware of the semantics of the results, helps the user complete the search experience." 

Filed under: Tue, 03 Jun 2008 22:29:37 +0200

Permanent link Drupal and The Future of News

Kurt Cagle – Drupal and The Future of News:

"The role of editor as arbiter and gate keeper is increasingly becoming automated because the taxonomy systems are becoming too complex for any one person to keep abreast of. However, this is also important because taxonomy is the new navigation, something which I believe Drupal does inordinately well. Most news sites have transcended the level where a human being can reasonably serve to build navigation, search engines face a problem of geometric expansion of content in the long term, and thus it's likely that taxonomic navigation will be the dominant face of finding news moving forward.

Watch the space of stochastic taxonomic analyzers; I suspect it will be a significant growth industry in the comparatively near term. The irony of course is that in building the initial web, the metaphor most commonly used was that of the magazine, but as with any new technology, the metaphors that drove the initial adoption eventually fade away as the capabilities of the new technology shape the parameters of what can be done in that medium. Whether the existing news providers will in fact survive that transition remains to be seen."

Filed under: Tue, 03 Jun 2008 22:24:27 +0200
2008-02-11

Permanent link Calais

Calais powered by Reuters - Frequently Asked Questions:

"From a user perspective it’s pretty simple: You hand the web service unstructured text (like news articles, blog postings, your term paper, etc) and it returns semantic metadata in RDF format. What’s happening in the background is a little more complicated.

Using natural language processing and machine learning techniques, the Calais web service looks inside your text and locates the entities (people, places, products, etc), facts (John Doe works for Acme Corp) and events (Jane Doe was appointed as a Board member of Acme Corp) in the text. Calais then processes the entities, facts and events extracted from the text and returns them to the caller in RDF format."

(Via Slashdot - Semantic Web Getting Real.)

Filed under: Mon, 11 Feb 2008 10:12:01 +0100
2007-11-23

Permanent link Giant Global Graph

Tim Berners-Lee - Giant Global Graph:

"In the long term vision, thinking in terms of the graph rather than the web is critical to us making best use of the mobile web, the zoo of wildly differing devices which will give us access to the system. Then, when I book a flight it is the flight that interests me. Not the flight page on the travel site, or the flight page on the airline site, but the URI (issued by the airlines) of the flight itself. That's what I will bookmark."

Filed under: Fri, 23 Nov 2007 16:16:00 +0100
2007-10-26

Permanent link Entity extraction everywhere

Jon Udell - Entity extraction everywhere:

"Gnosis [a Firefox extension] finds and highlights entities — that is, companies, people, products, and industry terms. Here’s an expanded view of the industry terms, products, and technologies it extracted.

I’d love to see this kind of entity extraction turn into a commodity service that we can wire into our existing email, blogging, social networking, and social bookmarking systems. Being able to easily express, in all those contexts, that twine refers to the company, or the product, not the strong kind of string, would be a huge win."

Filed under: Fri, 26 Oct 2007 09:29:17 +0200
2007-10-19

Permanent link Radar Networks Unveils twine.com

Tim O'Reilly at O'Reilly radar - Web2Summit: Radar Networks Unveils twine.com:

"Nova Spivack of Radar Networks plans to unveil the first application built on their semantic web platform, twine, a new kind of personal and group information manager. I've only seen a demo, and haven't had a chance to play with it hands-on or load in my own documents, but if it delivers what Nova promises, it could be revolutionary.

Underlying twine is Radar's semantic engine, trained to do what is called entity extraction from documents. Put in plain language, the semantic engine auto-tags each document, turning each entity into what looks like a web link as well as a tag in the sidebar. Type a note in twine, and it picks out all of the people, places, companies, books, and other types of information contained in the note, separating them out by type."

Filed under: Fri, 19 Oct 2007 09:47:30 +0200
2007-08-28

Permanent link Invent This Product

Scott Adams at The Dilbert Blog - Invent This Product:

"When the vacation is over, the scrapbook is 85% complete. You just have to check its assumptions and add/correct any descriptors you want.

You could run it as a slide show, with a little icon of a car traveling from location to location on the Google map, while the calendar date appears in the corner. When the icon reaches a destination from which there are photos, it displays them in a slide show. Optionally, the system could bring in pictures from other sources to beef up your scrapbook. For example, if you visited the Grand Canyon, it could bring in some stock pictures to round out your album. It could also capture a screen shot of the hotel or resort’s web site during the period you visited." 

Filed under: Tue, 28 Aug 2007 09:23:18 +0200
2007-08-27

Permanent link The fall of the Desktop and the File and the rise of Topical Interfaces and Topical Documents

Rick Jelliffe at XML.com - The fall of the Desktop and the File and the rise of Topical Interfaces and Topical Documents:

"The rise of Topics represents a great challenge to operating system and desktop suite vendors. When we look at Windows, or Mac or Linux window managers, we see that they really interact with the user at the wrong level. They say that the topic the user is interested in is applications and files. But how many people nowadays start their computer interaction with a web browser pointed to Google? There are still people whose organizing topic of interest in their computer interaction is the file or application, of course, but they have been swamped by people who are interested in the topic."

Filed under: Mon, 27 Aug 2007 22:15:41 +0200
2007-07-04

Permanent link Topic Map Patterns for Thesaurii

Techquila - Thesaurii:

"There are two possible patterns for the representation of a thesaurus in a topic map [...]:

  • Thesaurus Pattern 1: The Topic-Per-Term Pattern
  • Thesaurus Pattern 2: The Topic-Per-Concept Pattern"

Filed under: Wed, 04 Jul 2007 09:38:47 +0200
2007-06-27

Permanent link Resource Description and Classification

Cover Pages - Resource Description and Classification:

"Being a collection of references on matters of Subject Classification, Taxonomies, Ontologies, Indexing, Metadata, Metadata Registries, Controlled Vocabularies, Terminology, Thesauri, Business Semantics. A collection of references and survey based upon links and cribbings from various resources on the Internet."

Filed under: Wed, 27 Jun 2007 14:32:57 +0200
2007-06-04

Permanent link Elastic lists

Moritz Stefaner - Elastic lists:

"Elastic lists enhance traditional facet browsing approaches by

  • visualizing relative proportions (weights) of metadata values by size
  • visualizing unusualness of a metadata weight by brightness
  • and animated filtering transitions."

(Via Ryan Eby.)
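
A rough sketch of the two quantities named in the list above. The `size` value is the share of a metadata value in the current selection; the `unusualness` value is computed here as the difference between that share and the share in the whole collection, which is an assumption made for illustration and not necessarily the measure Stefaner uses. The sample items are hypothetical.

```python
from collections import Counter

# Hypothetical collection with two facets.
ITEMS = [
    {"field": "physics", "decade": "1950s"},
    {"field": "physics", "decade": "1960s"},
    {"field": "peace", "decade": "1950s"},
    {"field": "peace", "decade": "1950s"},
]


def proportions(items, facet):
    """Share of each facet value among the given items."""
    counts = Counter(item[facet] for item in items)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def elastic_weights(all_items, selected_items, facet):
    """Size = local share; unusualness = local share minus global share (assumed)."""
    global_share = proportions(all_items, facet)
    local_share = proportions(selected_items, facet)
    return {
        value: {"size": share, "unusualness": share - global_share[value]}
        for value, share in local_share.items()
    }


fifties = [item for item in ITEMS if item["decade"] == "1950s"]
weights = elastic_weights(ITEMS, fifties, "field")
```

A renderer could then map `size` to row height and the magnitude of `unusualness` to brightness.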

Filed under: Mon, 04 Jun 2007 13:17:23 +0200
2007-05-07

Permanent link Tagging is declarative programming for everybody

Jon Udell - Tagging is declarative programming for everybody:

"Among other things, tagging may become to ordinary folks what attributes are becoming to programmers: a language that doesn’t just describe things, but also invokes and coordinates behaviors."

Filed under: Mon, 07 May 2007 11:00:54 +0200
2007-03-27

Permanent link Like a moth to the Freebase flame

Jon Udell - Like a moth to the Freebase flame:

"I created my first user-defined Freebase type. Because the system is so new, there are some quite fundamental things that (so far as I can see) haven’t yet been defined. I wanted to create entries for some of my personal projects, such as LibraryLookup and elmcity.info, so I created a type called Project and added the properties Goal and Collaborators. That enabled me to add entries for my two personal projects, describe their goals, and associate myself with them as a collaborator."

Filed under: Tue, 27 Mar 2007 00:54:21 +0200
2007-03-09

Permanent link Freebase Will Prove Addictive

Tim O'Reilly - Freebase Will Prove Addictive:

"But once you understand a bit about what metaweb is doing, you realize just how remarkable it is. Metaweb has slurped in the contents of several of the web's freely accessible databases, including much of wikipedia, and song tracks from musicbrainz. It then turns its users loose on not just adding more data items but making connections between them by filling out meta tags that categorize or otherwise connect the data items, using a typology that can be extended by users, wiki-style."

See: freebase.com

Filed under: Fri, 09 Mar 2007 12:28:11 +0100
2007-02-14

Permanent link Introducing RDFa

Bob DuCharme - Introducing RDFa:

"For a long time now, RDF has shown great promise as a flexible format for storing, aggregating, and using metadata. Maybe for too long—its most well-known syntax, RDF/XML, is messy enough to have scared many people away from RDF. The W3C is developing a new, simpler syntax called RDFa (originally called "RDF/a") that is easy enough to create and to use in applications that it may win back a lot of the people who were first scared off by the verbosity, striping, container complications, and other complexity issues that made RDF/XML look so ugly."

Filed under: Wed, 14 Feb 2007 23:58:49 +0100
2007-01-16

Permanent link The Need for Creating Tag Standards

The NeoSmart Files - The Need for Creating Tag Standards:

"Basically, it’s too late for a tagging standard that will be used unanimously throughout the web. A truly semantic web most certainly won’t ever exist because of the reluctance to change and the unwillingness to compromise and accept defeat. A semantic web requires objective analysis of methods and data, culminating in honestly evaluated options, and immediate acceptance of the outcome. But that’s never going to happen."

Filed under: Tue, 16 Jan 2007 00:14:27 +0100
2007-01-03

Permanent link Microformats - Part 3: Introducing Operator

Alex Faaborg - Microformats - Part 3: Introducing Operator:

"Today Mozilla Labs is releasing Operator, a microformat detection extension developed by Michael Kaply at IBM. Operator demonstrates the usefulness of semantic information on the Web, in real world scenarios."

Filed under: Wed, 03 Jan 2007 22:33:34 +0100
2006-12-06

Permanent link Microformats Icons

Wolfgang Bartelme - Microformats Icons:

"As Microformats have gained much popularity over the last year we thought it was time to standardize the way they are represented on a website. So we created the Microformats Icon Set. The starter set contains icons for hCal, hResume, hCard, XFN and a generic TAG icon."

Filed under: Wed, 06 Dec 2006 23:52:04 +0100
2006-11-29

Permanent link We need a universal canvas that doesn't suck

Jon Udell at InfoWorld - We need a universal canvas that doesn't suck:

"While e-mail dissolves barriers to the exchange of data, we need another solvent to dissolve the barriers to collaborative use of that data. Applied in the right ways, that solvent creates what I like to call the “universal canvas” -- an environment in which data and applications flow freely on the Web.

Here’s the best definition of the universal canvas: “Most people would prefer a single, unified environment that adapts to whichever environment they are working in, moves transparently between local and remote services and applications, and is largely device-independent -- a kind of universal canvas for the Internet Age.”

You might expect to find that definition in a Google white paper from 2006. Ironically, it comes from a Microsoft white paper from 2000, announcing a “Next Generation Internet” initiative called .Net."

Filed under: Wed, 29 Nov 2006 15:01:50 +0100
2006-11-13

Permanent link More structured metadata

Jenn Riley - More structured metadata:

"I often encounter people who see my job title (Metadata Librarian) and assume I have an agenda to do away with human cataloging entirely and rely solely on full-text searching and uncontrolled metadata generated by authors and publishers. That’s simply not true; I have no such goal. I am interested in exploring new means of description, not for their own sake, but for the retrieval possibilities they suggest for our users.

[...] I’m a big fan of faceted browsing. The ability to move seamlessly through a system, adding and removing features such as language, date, geography, topic, instrumentation (hey, I’m a musician…), and the like based on what I’m currently seeing in a result set is something I believe our users will be demanding more and more. But we can’t do this if that information isn’t explicitly coded."

Filed under: Mon, 13 Nov 2006 13:01:40 +0100
2006-11-08

Permanent link The Next Web?

Simon St. Laurent at XML.com - The Next Web?:

"Developers who craft smart APIs on their servers for use by AJAX-based web pages can then expose those APIs to other developers, getting the benefits of better interfaces for users who use web browsers to consume the data and for users who have their own custom programs consuming the data. Depending on how carefully the developer models AJAX transactions on traditional web HTTP transactions, these services even look a lot like the REST approach proposed earlier for web services."

Filed under: Wed, 08 Nov 2006 16:32:00 +0100
2006-10-12

Permanent link Topic Maps, Knowledge, and OpenCyc

W. Eliot Kimber - Topic Maps, Knowledge, and OpenCyc:

"I think that topic maps are useful and attractive as far as they go: for the general business problem of managing metadata and associating it with data objects, it's well suited and well thought out.

Why do I think that topic maps (and anything similar, such as RDF) is not suitable for knowledge representation?

For the simple reason that knowledge representation is much more sophisticated and subtle than just topics with associations. "

Filed under: Thu, 12 Oct 2006 00:06:12 +0200
2006-09-22

Permanent link Lucene Summit: NINES Collex

Erik Hatcher - Lucene Summit:

"I really found the Collex interface concept to be very interesting. Everything is a constraint or limit and you can easily add or invert the constraint. It’s also easy to add things to a personal collection and parts of the personal collection then become facets/constraints themselves. He’s really using all of the metadata (archive and user) to its full extent. He also has more plans including “exhibits” where people can “curate collections”. These collections themselves can then become objects in the index and so on."

Filed under: Fri, 22 Sep 2006 16:23:00 +0200
2006-08-23

Permanent link Del.icio.us is a database

Jon Udell - Del.icio.us is a database:

"Although it's intuitively obvious to me, I suspect that most people don't yet appreciate how easily, and powerfully, tagging systems can work as databases for personal (yet shareable) information management.

Del.icio.us isn't simply backed by a database, it can function as a database to which you add (a lot of) queryable columns.

[...] It strikes me that there's a sweet spot somewhere between this shoestring approach and the likes of Dabble DB, an application that offers powerful web-based data management. Consider how dBase and later Access were overkill for most people's recipe lists and address books, and how 1-2-3 and Excel wound up meeting the need instead. Tag systems might turn out to be the spreadsheets of modern information management."
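
A minimal sketch of tags acting as queryable columns. It assumes a hypothetical `key:value` tag convention and made-up bookmarks; it does not use the Del.icio.us API and is not Udell's own code.

```python
# Hypothetical bookmarks; a plain tag is a flag, a "key:value" tag is a column.
BOOKMARKS = [
    {"url": "http://example.com/chili",
     "tags": ["recipe", "cuisine:mexican", "minutes:45"]},
    {"url": "http://example.com/ramen",
     "tags": ["recipe", "cuisine:japanese", "minutes:30"]},
    {"url": "http://example.com/tacos",
     "tags": ["recipe", "cuisine:mexican", "minutes:20"]},
    {"url": "http://example.com/bike-repair",
     "tags": ["howto"]},
]


def as_row(bookmark):
    """Turn a bookmark's tags into a dict that behaves like a table row."""
    row = {"url": bookmark["url"]}
    for tag in bookmark["tags"]:
        key, sep, value = tag.partition(":")
        row[key] = value if sep else True
    return row


def select(bookmarks, **conditions):
    """Return URLs of bookmarks whose tag columns match every condition."""
    rows = (as_row(b) for b in bookmarks)
    return [
        row["url"]
        for row in rows
        if all(row.get(key) == value for key, value in conditions.items())
    ]


mexican_recipes = select(BOOKMARKS, recipe=True, cuisine="mexican")
howtos = select(BOOKMARKS, howto=True)
```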

Filed under: Wed, 23 Aug 2006 00:14:57 +0200
2006-08-17

Permanent link Semantic MediaWiki

"Semantic MediaWiki introduces some additional markup into the wiki-text which allows users to add "semantic annotations" to the wiki. While this first appears to make things more complex, it can also greatly simplify the structure of the wiki, help users to find more information in less time, and improve the overall quality and consistency of the wiki. To illustrate this, we provide some examples from the daily business of Wikipedia:

[...] Inflationary use of categories. The need for better structuring becomes apparent by the enormous use of categories in Wikipedia. While this is generally helpful, it has also led to a number of categories that would be mere query results in SMW. For some examples consider the categories Rivers in Buckinghamshire, Asteroids named for people, and 1620s deaths, all of which could easily be replaced by simple queries that use just a handful of annotations. Indeed, in this example Category:Rivers, Relation:located in, Category:Asteroids, Category:People, Relation:named after, and Attribute:date of death would suffice to create thousands of similar listings on the fly, and to remove hundreds of Wikipedia categories."
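
The idea that such categories reduce to queries over a few annotations can be sketched as follows. This is plain Python over hypothetical page data, not Semantic MediaWiki's own query syntax, and the page names and values are placeholders.

```python
# Hypothetical annotated pages: categories, relations, and attributes.
PAGES = {
    "River A": {"categories": {"Rivers"},
                "relations": {"located in": {"Buckinghamshire"}}},
    "River B": {"categories": {"Rivers"},
                "relations": {"located in": {"Kent"}}},
    "Asteroid X": {"categories": {"Asteroids"},
                   "relations": {"named after": {"Person P"}}},
    "Asteroid Y": {"categories": {"Asteroids"},
                   "relations": {"named after": {"River A"}}},
    "Person P": {"categories": {"People"},
                 "attributes": {"date of death": 1626}},
    "Person Q": {"categories": {"People"},
                 "attributes": {"date of death": 1650}},
}


def in_category(name, category):
    return category in PAGES[name].get("categories", set())


def related(name, relation):
    return PAGES[name].get("relations", {}).get(relation, set())


# "Rivers in Buckinghamshire" as a query instead of a category.
rivers_in_bucks = sorted(
    name for name in PAGES
    if in_category(name, "Rivers")
    and "Buckinghamshire" in related(name, "located in")
)

# "Asteroids named for people": the target of the relation must be a person.
asteroids_named_for_people = sorted(
    name for name in PAGES
    if in_category(name, "Asteroids")
    and any(in_category(target, "People") for target in related(name, "named after"))
)

# "1620s deaths" from a single date attribute.
deaths_1620s = sorted(
    name for name, page in PAGES.items()
    if 1620 <= page.get("attributes", {}).get("date of death", 0) <= 1629
)
```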

Filed under: Thu, 17 Aug 2006 16:47:53 +0200
2006-07-18

Permanent link Why I Hate Microformats

Robert Cooper - Why I Hate Microformats:

"Yay, you have an iCal microformat in your page. You can use Trails, now to stick it right into your Google calendar. Neat.

The problem is, this is a serious abuse of HTML. The way you SHOULD have done this is:

    <html:div xmlns="http://www.w3.org/2002/12/cal/">
       <vevent>
       <dtstart>20060501</dtstart><html:abbr>May 1</html:abbr>
    ...

Then present your iCal entry with CSS. Yes, we have waited years and years and years for Microsoft to get off their rears and implement CSS with namespaces, which everyone else has had for years. However, IE7 is around the proverbial corner, and we should finally get the option to embed actual real data into our HTML pages and style it. There is no reason to use semantically incorrect HTML and beat up on the class attribute."

Filed under: Tue, 18 Jul 2006 12:40:38 +0200
2006-07-05

Permanent link SIMILE Project

"SIMILE is focused on developing robust, open source tools based on Semantic Web technologies that improve access, management and reuse among digital assets."

Filed under: Wed, 05 Jul 2006 22:12:52 +0200
2006-06-09

Permanent link The 7 (f)laws of the Semantic Web

Dan Zambonini at XML.com - The 7 (f)laws of the Semantic Web:

"Creating metadata and classifications is difficult (let’s not get started on Ontologies). People are biased (whether they mean to be or not), and fallible. Metadata, which the Semantic Web relies on, is not always going to be of great quality.

[...] My clients don’t want to create ontologies. They don’t want to map one set of data to another. They want to use something that’s out there and ready for them to use, and will give them the maximum benefit (so if the Imperial War Museum say that they have a tank from “World War One” and the Science Museum has a video of the firing mechanism from a gun from “World War One”, they can both use the same term/URI)."

Filed under: Fri, 09 Jun 2006 23:03:47 +0200
2006-05-30

Permanent link PHPTMAPI

"PHPTMAPI implements a PHP API for manipulating topic maps, based on the TMAPI project."

Filed under: Tue, 30 May 2006 14:28:06 +0200
2006-05-14

Permanent link Topincs

"Topincs is a Topic Map authoring tool, that allows groups to create Topic Maps in Firefox. Even though it is run in an ordinary browser window it feels like an application installed on your computer. [...] It consists of a client, for editing maps and a server, for storing them. [...] The Server requires Apache 2, PHP 4 and MySQL."

Filed under: Sun, 14 May 2006 22:19:28 +0200
2006-05-04

Permanent link Accessing the web of databases

Jon Udell at InfoWorld - Accessing the web of databases:

"I’ve always regarded the Web as a programmable data source as well as a platform for the document/software hybrid that we call a Web page. Early on, programmable access to Web data entailed a lot of screen scraping. Nowadays it often still does, but it’s becoming common to find APIs that serve up the Web’s data.

[...] Free text search is an even more popular access API. Nearly every site provides that service, or outsources it to Google or another engine.

[...] What you can’t typically do, though, is create mashups by running ad hoc queries against remote Web data. There are good reasons to think that it’s just crazy to export open-ended query interfaces over the Web. No responsible enterprise DBA would permit such access to the crown jewels. But there are all kinds of data sources -- or what Idehen likes to call data spaces -- and a range of feasible and appropriate access modes."

Filed under: Thu, 04 May 2006 17:05:43 +0200
2006-04-05

Permanent link Reinventing the intranet

Jon Udell at InfoWorld - Reinventing the intranet:

"Inside the enterprise, teams, tasks, products, and services define metadata vocabularies that the Internet search giants would kill for. Exploiting those vocabularies to deliver search results that are better than what’s available on the open Web is low-hanging fruit. As we roll out SOAs that route well-formed messages through a fabric of intermediaries, it’ll get even easier."

Filed under: Wed, 05 Apr 2006 14:51:00 +0200
2006-04-04

Permanent link Onlife

"Onlife is an application for the Mac OS X that observes your every interaction with apps such as Safari, Mail and iChat and then creates a personal shoebox of all the web pages you visit, emails you read, documents you write and much more. Onlife then indexes the contents of your shoebox, makes it searchable and displays all the interactions between you and your favorite apps over time."

Filed under: Tue, 04 Apr 2006 23:41:43 +0200
2006-03-26

Permanent link Image Annotation on the Semantic Web

W3C - Image Annotation on the Semantic Web:

"The goals of this document are (i) to explain what the advantages are of using Semantic Web languages and technologies for the creation, storage, manipulation, interchange and processing of image metadata, and (ii) to provide guidelines for doing so. The document gives a number of use cases that illustrate ways to exploit Semantic Web technologies for image annotation, an overview of RDF and OWL vocabularies developed for this task and an overview of relevant tools."

Filed under: Sun, 26 Mar 2006 23:29:48 +0200
2006-02-17

Permanent link IBM Initiative to Capture New Growth Opportunities in Information Management

"IBM today announced a company-wide initiative that combines its software and industry consulting expertise to help clients better compete in the global economy through uninhibited access to accurate, reliable and trustworthy business information.

[...] Additionally, IBM is announcing six new solution portfolios and new software products to help clients transform their businesses from an outdated model in which data is managed as an afterthought from within applications, to an environment in which information is set free and managed as a strategic asset and to drive better decision making."

Filed under: Fri, 17 Feb 2006 16:57:20 +0100
2006-01-20

Permanent link Semapedia

Semapedia.org - The Physical Wikipedia: "Our goal is to connect the virtual and physical world by bringing the best information from the internet to the relevant place in physical space. We do this by combining the physical annotation technology of Semacode with high quality information from Wikipedia."

Filed under: Fri, 20 Jan 2006 00:18:49 +0100
2006-01-14

Permanent link Organizing Files

Karl Vogel at ONLamp.com - Organizing Files:

"The problem: the filesystem on my Unix workstation was a mess. I couldn't find anything without grepping all over creation. About half the time, I'd actually find something useful. Usually I'd get no hits at all, or I'd match something like a compiled binary and end up hosing my display beyond belief.

[...] I went so far as to buy a copy of the Abridged Dewey Decimal Catalog, which is actually pretty nifty; if you're looking to organize your paper files, you could do a lot worse than use an existing classification scheme like this.

[...] My job as a system administrator doesn't change every day, but it's much easier to keep track of things via date rather than via subject. I tend to remember things in time order, so I finally stopped trying to change the way I work to fit some hierarchy. Instead, I made a directory structure on the machine to match my work habits."

Filed under: Sat, 14 Jan 2006 23:20:31 +0100
2005-11-30

Permanent link SKOS Core

“SKOS Core provides a model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, ‘folksonomies’, other types of controlled vocabulary, and also concept schemes embedded in glossaries and terminologies.

The SKOS Core Vocabulary is an application of the Resource Description Framework (RDF), that can be used to express a concept scheme as an RDF graph.”

Filed under: Wed, 30 Nov 2005 15:13:00 +0100
2005-11-17

Permanent link Google Base is interesting

Simon Willison - Google Base is interesting:

“Base is a very interesting product for a whole bunch of reasons. The data model is surprisingly simple on the surface: all items have a title, description, (optional) external URL, a “type” and a set of labels (a.k.a. tags) and “attributes”. Attributes are something for tag enthusiasts to get excited by - they’re name/value pairs that are kind of like tags in that you can apply them to anything, but more structured and with a greater level of implied meaning.

[…] There’s definitely a trend towards this kind of loose data model at the moment. JotSpot allows all pages within a wiki to have as many extra name/value attribute pairs as you like (even the wiki body itself is internally implemented as a special attribute), and Ning works along similar lines.”
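
The loose data model described above can be sketched as a small Python data class. This is a minimal sketch based only on the fields named in the quote; the class name `Item`, the helper `with_attributes` and the sample data are hypothetical, and this is not Google Base's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Item:
    """One item in a Google-Base-like store (field names are assumptions)."""
    title: str
    description: str
    item_type: str
    url: Optional[str] = None                                  # optional external URL
    labels: Set[str] = field(default_factory=set)              # free-form tags
    attributes: Dict[str, str] = field(default_factory=dict)   # name/value pairs


def with_attributes(items: List[Item], wanted: Dict[str, str]) -> List[Item]:
    """Return the items whose attributes include every wanted name/value pair."""
    return [
        item for item in items
        if all(item.attributes.get(name) == value for name, value in wanted.items())
    ]


items = [
    Item("Road bike", "Lightly used", "product",
         labels={"bike", "sale"}, attributes={"colour": "red", "condition": "used"}),
    Item("Pancakes", "Family recipe", "recipe",
         labels={"breakfast"}, attributes={"serves": "4"}),
]

print([item.title for item in with_attributes(items, {"colour": "red"})])
```

Labels only say that an item belongs to a loose group, while an attribute also says in what respect, which is the "greater level of implied meaning" the quote mentions.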

Filed under: Thu, 17 Nov 2005 13:19:00 +0100
2005-11-16

Permanent link Dabble DB

“Dabble combines the best of group spreadsheets, custom databases, and intranet web applications into a new way to manage and share your information online.”

Filed under: Wed, 16 Nov 2005 23:38:00 +0100

Permanent link Google Base

“Google Base is a place where you can easily submit all types of online and offline content that we’ll host and make searchable online. You can describe any item you post with attributes, which will help people find it when they search Google Base. In fact, based on the relevance of your items, they may also be included in the main Google search index and other Google products like Froogle, Google Base and Google Local.”

Filed under: Wed, 16 Nov 2005 11:29:00 +0100
2005-11-07

Permanent link Wikidata

“Wikidata is a proposed wiki-like database for various types of content. This project as proposed here requires significant changes to the software (or possibly a completely new software) but has the potential to centrally store and manage data from all Wikimedia projects, and to radically expand the range of content that can be built using wiki principles.”

Filed under: Mon, 07 Nov 2005 22:30:00 +0100
2005-11-01

Permanent link Learning from THE WEB

Adam Bosworth at ACM Queue - Learning from THE WEB:

“Successful systems on the Web are bottom-up. They don’t mandate much in a top-down way. Instead, they control themselves through tipping points. For example, Flickr doesn’t tell its users what tags to use for photos. Far from it. Any user can tag any photo with anything (well, I don’t think you can use spaces). But, and this is a key but, Flickr does provide feedback about the most popular tags, and people seeking attention for their photos, or photos that they like, quickly learn to use that lexicon if it makes sense. It turns out to be amazingly stable.

[…] It is time that the database vendors stepped up to the plate and started to support a native RSS 2.0/Atom protocol and wire format; a simple way to ask very general queries; a way to model data that encompasses trees and arbitrary graphs in ways that humans think about them; far more fluid schemas that don’t require complex joins to model variations on a theme about anything from products to people to places; and built-in linear scaling so that the database salespeople can tell their customers, in good conscience, for this class of queries you can scale arbitrarily with regard to throughput and extremely well even with regard to latency, as long as you limit yourself to the following types of queries. Then we will know that the database vendors have joined the 21st century.”

Filed under: Tue, 01 Nov 2005 22:10:00 +0100
2005-10-27

Permanent link Managing metadata

Jon Udell at InfoWorld - Managing metadata:

"Everyone knows the common definition: Metadata is data about data, a secondary thing that's separate in some way from the primary thing to which it refers. But that definition begs a series of questions. Is metadata something we derive from data, or assign to it? Does it classify things, or enable us to search for things, or govern the behavior of things? If data that is described by metadata also, in turn, refers to other data, does it then qualify as both data and metadata?

These questions can verge on the philosophical, but by working through some examples, we can define various types of metadata, list the benefits that we expect from using it, and identify the challenges associated with maintaining it. Programs, documents, messages, files, Web resources, and Web services are some of the IT constructs often described by metadata. Let's review the roles that metadata can play in these different scenarios."

Filed under: Thu, 27 Oct 2005 16:40:00 +0200

Permanent link Point. Shoot. Kiss It Good-Bye.

David Weinberger at Wired - Point. Shoot. Kiss It Good-Bye.:

"As you pass the locked entrances to rooms - caverns, actually - that encompass entire patent-application warehouses and film libraries, you feel like you're navigating through the brain of a slumbering giant. And there, in one of its farthest recesses, is where the beast stores the 11 million photographs that constitute the Bettmann Archive, perhaps the best-known collection of photos in the world.

Although the photos are kept in one room, their sheer quantity means that locating any one of them requires an elaborate ritual. Suppose you want to find an image of President Coolidge talking with Native Americans. First, researcher Robinya Roberts looks up "Coolidge" in a central card catalog that looks like it's been transplanted from your local library to the Bat Cave. Yellowed and worn, the 3-by-5 cards contain surprisingly little information: only a caption, a brief description, and a reference number.

[…] This process of manual metadata tagging, subjective and labor-intensive, may work for Corbis, but it's a lot to ask of the rest of us. Even when software developers try to make it easy, it's not easy enough. For instance, Adobe Photoshop Album offers a similar type of drag-and-drop labeling. Right now, you have to enter keywords manually; presumably someday you'll be able to upload the names of people, places, and events from your address book and calendar so at least you can drag and drop familiar names. Still, mere mortals don't have a 60,000-term online taxonomy or twin screens. More to the point, we don't want to hire Nick Fraser to do the job."

Filed under: Thu, 27 Oct 2005 15:03:00 +0200
2005-09-08

Permanent link WinFS and social information management

Jon Udell at InfoWorld - WinFS and social information management:

“I saw my first demo of Microsoft’s Cairo OFS (Object File System) back in 1993. It was briefly unveiled at the Professional Developers Conference that year, and then shelved. This week I installed the beta version of its successor, WinFS.”

Filed under: Thu, 08 Sep 2005 16:27:00 +0200
2005-05-31

Permanent link Introduction to XFML

Peter Van Dijck at XML.com - Introduction to XFML:

"XFML is a simple XML format for exchanging metadata in the form of faceted hierarchies, sometimes called taxonomies. Its basic building blocks are topics, also called categories. XFML won't solve all your metadata needs. It's focused on interchanging faceted classification and indexing data."

Filed under: Tue, 31 May 2005 14:24:13 +0200
2005-04-23

Permanent link Bosworth's Web of Data

At ONLamp.com, Daniel H. Steinberg summarizes Adam Bosworth's keynote at the MySQL Users Conference 2005:

"Adam Bosworth suggested that we "do for information what HTTP did for user interface." [...] As a result of a simple, sloppy, standards-based, scalable platform, we have information at our fingertips from Google, Amazon, eBay, and Salesforce. Bosworth's own company, Google, gets hundreds of millions of hard queries a day. He said they see it as putting Ph.Ds in tanks to drive through walls rather than around them.

In addition to the advantages in software, there have been great gains in hardware. Bosworth said that one million dollars buys you five hundred machines with 2TB of in-memory data, a PetaByte of on-disk data, and a reasonable throughput of fifty thousand requests per second. This amounts to one billion requests per day. Having this sort of power changes the way you think."
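
A quick back-of-the-envelope check of those figures, as a Python sketch. It assumes decimal units (1 TB = 1,000 GB, 1 PB = 1,000 TB) and an even spread across machines, neither of which the summary states.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

machines = 500
requests_per_second = 50_000

# Sustained throughput over a full day.
requests_per_day = requests_per_second * SECONDS_PER_DAY

# Per-machine share of memory and disk (assumption: decimal units, even split).
memory_per_machine_gb = 2_000 / machines   # 2 TB in total
disk_per_machine_tb = 1_000 / machines     # 1 PB in total

# Average request rate that adds up to one billion requests per day.
average_rps_for_one_billion = 1_000_000_000 / SECONDS_PER_DAY

print(requests_per_day)
print(memory_per_machine_gb, disk_per_machine_tb)
print(round(average_rps_for_one_billion))
```

Under these assumptions each machine holds about 4 GB of memory and 2 TB of disk. Fifty thousand requests per second sustained around the clock would come to roughly 4.3 billion requests a day, so the quoted one billion a day is reached at an average load of under a quarter of that rate.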

Filed under: Sat, 23 Apr 2005 21:45:31 +0200
2005-04-15

Permanent link MicroFormats

"microformats are:

* a way of thinking about data
* design principles for formats
* adapted to current behaviors and usage patterns
* highly correlated with semantic xhtml, AKA the real world semantics, AKA lowercase semantic web, AKA lossless XHTML"

Take a look at the hCalendar example.

Filed under: Fri, 15 Apr 2005 10:55:18 +0200
2005-03-17

Permanent link High order bits and Ontologies

Robert Kaye - High order bits and Ontologies:

"Then later in the afternoon, Clay Shirky talked about the difference between ontologies and folksonomies in his "Ontology is Overrated: Links, Tags, and Post-hoc Metadata". With his usual flair Clay delivered a great overview of classic ontologies and all the issues that limit their usefulness on the Internet. [...]

Clay went on to outline the conditions under which classical ontologies can thrive:

* Domain: small corpus, formal categories, stable entities, restricted entities, clear edges
* Participants: Coordinated users, expert users, expert catalogers, authoritative sources

In a nutshell, ontologies work best in small and controlled environments where experts are using the system. Unfortunately, the Internet is the exact opposite of all of these. And thus, argues Clay, ontologies are not suited for the Internet. Fortunately, the Internet has brought us a solution to all these problems in the form of Folksonomies."

Filed under: Thu, 17 Mar 2005 23:26:28 +0100
2005-02-10

Permanent link Topic Map Solutions for Kodak Digital Camera Accessories

Nikita Ogievetsky's (Cogitech, Inc.) and Terry Badger's (Eastman Kodak Company) XML Europe 2003 presentation on Topic Map Solutions for Kodak Digital Camera Accessories:

"This presentation shows how Topic Map based solutions are used to build, organize and maintain Kodak digital cameras accessories web site. The chosen approach did not require software investment. Excel, an available and familiar spreadsheet software was used as an affordable and easy to use Topic Map GUI editor and repository. [...] All processing is done with XSLT scripts."

Filed under: Thu, 10 Feb 2005 11:11:09 +0100
2005-02-01

Permanent link Topic Map Relational Query Language (TMRQL)

Graham Moore, Kal Ahmed: "Topic Map Relational Query Language [PDF] (TMRQL) has been designed in order to provide a sound foundation for querying topics maps. To this end it does not define an entire new language but instead presents a core set of abstract relational views. The relational model provides a firm foundation for the development of a topic map query language.

Development in this direction would lead to a more accessible and usable language by a greater number of developers than a new and bespoke language. Developers would be familiar with the concepts and their existing tools would work with the data structures returned. To them, the topic map data model would appear as just another schema or view. In order that the TMRQL language is not bound to a single implementation schema, nor even, bound to a relational database implementation we define a set of Relational Views that provide an abstract relational model of the topic map data model. This abstract data structure is independent of any particular implementation yet provides a foundation to use the full power of the SQL language and helps with portability of TMRQL queries."

Filed under: Tue, 01 Feb 2005 15:11:49 +0100
2005-01-14

Permanent link As We May Think

Vannevar Bush's legendary essay from 1945, As We May Think:

"Consider a future device for individual use, which is a sort of mechanized private file and library. It needs a name, and to coin one at random, "memex" will do. A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.

[...] It affords an immediate step, however, to associative indexing, the basic idea of which is a provision whereby any item may be caused at will to select immediately and automatically another. This is the essential feature of the memex. The process of tying two items together is the important thing.

When the user is building a trail, he names it, inserts the name in his code book, and taps it out on his keyboard. [...]

Thereafter, at any time, when one of these items is in view, the other can be instantly recalled merely by tapping a button below the corresponding code space. Moreover, when numerous items have been thus joined together to form a trail, they can be reviewed in turn, rapidly or slowly, by deflecting a lever like that used for turning the pages of a book. It is exactly as though the physical items had been gathered together to form a new book. It is more than this, for any item can be joined into numerous trails."

Filed under: Fri, 14 Jan 2005 15:26:32 +0100

Permanent link Lifestreams

By Eric Freeman and David Gelernter back in 1997: "Lifestreams is built on a simple storage metaphor --- a time-ordered stream of documents combined with several powerful operators --- that replaces many conventional computer constructs (such as named files, directories, and explicit storage) and in the process provides a unified framework that subsumes many separate desktop applications to accomplish and handle personal communication, scheduling, and search and retrieval tasks. While our current prototype is tailored to managing personal information, a "lifestream" is also a natural framework for managing enterprise information and web sites; we are just beginning to explore such use."

Filed under: Fri, 14 Jan 2005 15:16:25 +0100
2005-01-10

Permanent link TMCore05

"TMCore05 allows developers to take full advantage of the power of topic maps in their applications. The engine provides a robust store for multiple topic maps; an extensive API accessible via any language supported by the Microsoft CLR; and a high-level web services interface that allows both reading and updating of topic maps using SOAP-based web service calls.

The engine makes use of Microsoft SQLServer 2000 to provide scalable, persistent storage and is designed to allow multiple instances to access the same data store simultaneously using an optimistic locking strategy to minimize development overhead."

Filed under: Mon, 10 Jan 2005 17:28:53 +0100
2004-12-06

Permanent link Tyranny of the geeks

Sriram Krishnan - Tyranny of the geeks:

"Nowadays, it is the 'in'-thing to be CSS-aware. If you're dumb enough to use a table tag, you're branded as a clueless moron. However, no one really tells you why table tags are bad. In fact, the equivalent CSS for generating something like your standard sign-up form is downright scary. And with every browser (Opera, Firefox, IE) having a different idea on what 'right' CSS is, you're much safer with table tags. For those using CSS and use divs and floats to build their tables, I ask them why. Why do something that is so un-intuitive? I could teach a kid about rows and columns.

[...] A year ago, I read up a lot on the Semantic Web and RDF. I have to admit that I didn't understand any of it. Any of it. Ontologies, RDF, OWL, what not. However, you see blogs and enclosures getting the same effect with only a fraction of the complexity. I don't need smart agents to find what I want - I just search in Google and it is usually smart enough to give me what I need. I don't have high hopes for the semantic web unless they simplify and do it real soon."

Filed under: Mon, 06 Dec 2004 12:45:32 +0100
2004-11-22

Permanent link Adam Bosworth's ISCOC04 Talk

Adam Bosworth - ISCOC04 Talk:

"That software which is flexible, simple, sloppy, tolerant, and altogether forgiving of human foibles and weaknesses turns out to be actually the most steel cored, able to survive and grow while that software which is demanding, abstract, rich but systematized, turns out to collapse in on itself in a slow and grim implosion.

[...] What is more, in one of the unintended ironies of software history, HTML was intended to be used as a way to provide a truly malleable plastic layout language which never would be bound by 2 dimensional limitations, ironic because hordes of CSS fanatics have been trying to bind it with straight jackets ever since, bad mouthing tables and generations of tools have been layering pixel precise 2 dimensional layout on top of it. And yet, ask any gifted web author, like Jon Udell, and they will tell you that they often use it in the lazy sloppy intuitive human way that it was designed to work. They just pour in content. In 1996 I was at some of the initial XML meetings. The participants' anger at HTML for "corrupting" content with layout was intense. Some of the initial backers of XML were frustrated SGML folks who wanted a better cleaner world in which data was pristinely separated from presentation. In short, they disliked one of the great success stories of software history, one that succeeded because of its limitations, not despite them. I very much doubt that an HTML that had initially shipped as a clean layered set of content (XML, Layout rules - XSLT, and Formatting- CSS) would have had anything like the explosive uptake.

Now as it turns out I backed XML back in 1996, but as it turns out, I backed it for exactly the opposite reason. I wanted a flexible relaxed sloppy human way to share data between programs and compared to the RPC's and DCOM's and IIOP's of that day, XML was an incredibly flexible plastic easy going medium. It still is. And because it is, not despite it, it has rapidly become the most widely used way to exchange data between programs in the world. And slowly, but surely, we have seen the other older systems, collapse, crumple, and descend towards irrelevance.

Consider programming itself. There is an unacknowledged war that goes on every day in the world of programming. It is a war between the humans and the computer scientists. It is a war between those who want simple, sloppy, flexible, human ways to write code and those who want clean, crisp, clear, correct ways to write code. It is the war between PHP and C++/Java. It used to be the war between C and dBase. Programmers at the level of those who attend Columbia University, programmers at the level of those who have made it through the gauntlet that is Google recruiting, programmers at the level of this audience are all people who love precise tools, abstraction, serried ranks of orderly propositions, and deduction. But most people writing code are more like my son. Code is just a hammer they use to do the job. PHP is an ideal language for them. It is easy. It is productive. It is flexible. Associative arrays are the backbone of this language and, like XML, is therefore flexible and self describing. They can easily write code which dynamically adapts to the information passed in and easily produces XML or HTML.

[...] I remember listening many years ago to someone saying contemptuously that HTML would never succeed because it was so primitive. It succeeded, of course, precisely because it was so primitive. Today, I listen to the same people at the same companies say that XML over HTTP can never succeed because it is so primitive. Only with SOAP and SCHEMA and so on can it succeed. But the real magic in XML is that it is self-describing. The RDF guys never got this because they were looking for something that has never been delivered, namely universal truth."

Filed under: Mon, 22 Nov 2004 14:56:47 +0100
2004-10-14

Permanent link Intertwingle

Jamie Zawinski outlines a hypothetical program for managing vast volumes of email:

"There are other interesting data-visualization possibilities here as well; since really what we have is nodes and connections between them, tools like graphers and histogram charts might be applicable as well, to answer questions like

* show me a graph of the age-distribution of my unanswered mail, or,

* show me a graph of people who are known to have directly exchanged mail with each other so that I can see the "clumping" of my correspondents."
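
The second question treats mail as a graph of people and exchanges. A rough Python sketch of one way to find the "clumps": group correspondents into connected components. The function name `clumps` and the sample names are made up, and connected components are only a crude stand-in for whatever clustering such a tool would really use.

```python
from collections import defaultdict


def clumps(exchanges):
    """Group people into connected components of the 'exchanged mail' graph."""
    graph = defaultdict(set)
    for sender, recipient in exchanges:
        graph[sender].add(recipient)
        graph[recipient].add(sender)

    seen = set()
    groups = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first walk collecting everyone reachable from `start`.
        stack, group = [start], set()
        while stack:
            person = stack.pop()
            if person in group:
                continue
            group.add(person)
            stack.extend(graph[person] - group)
        seen |= group
        groups.append(group)
    return groups


# Hypothetical pairs of people known to have exchanged mail directly.
exchanges = [("alice", "bob"), ("bob", "carol"), ("dave", "erin")]
for group in clumps(exchanges):
    print(sorted(group))
```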

Filed under: Thu, 14 Oct 2004 14:44:48 +0200
2004-10-10

Permanent link Isaac Newton, sha1, and the Semantic Web

David Sklar - Isaac Newton, sha1, and the Semantic Web:

"Which made me think: is the Semantic Web the 21st century equivalent of Diderot's Encyclopédie? What lessons have we learned (or not) from previous generations' attempts to taxonomify (and neologize? :) all information?"

Filed under: Sun, 10 Oct 2004 07:50:27 +0200
2004-09-30

Permanent link Drowned out by keywords

Edd Dumbill - Drowned out by keywords:

"So here's a case for the semantic web. It's stupidly difficult to search for news of my hometown.

I live in the beautiful city of York, UK. In most search oriented applications I cannot search for my city. Why?

Because "New York" always matches a search for "York", too."
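
The problem is easy to reproduce. The Python sketch below contrasts a naive substring search with a keyword-level workaround that skips "York" when it directly follows "New". The headlines are invented, and the workaround is not the semantic web answer the post argues for (that would identify the city by something like a URI rather than a string); it only shows why plain keywords fall short.

```python
import re

# Invented sample headlines.
headlines = [
    "Flooding closes roads in York",
    "New York subway fares rise",
    "York Minster restoration continues",
]


def naive_search(term, texts):
    """Plain substring match: 'York' also matches 'New York'."""
    return [text for text in texts if term in text]


# 'York' as a whole word, but not when directly preceded by 'New '.
YORK_ONLY = re.compile(r"(?<!New )\bYork\b")


def york_search(texts):
    """Keep only texts that mention York outside the phrase 'New York'."""
    return [text for text in texts if YORK_ONLY.search(text)]


print(naive_search("York", headlines))
print(york_search(headlines))
```

The pattern still fails on "Yorkshire-adjacent" cases such as people named York, which is the kind of ambiguity that explicit identifiers are meant to remove.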

Filed under: Thu, 30 Sep 2004 22:53:41 +0200
2004-09-06

Permanent link Hypertext '96 Conference Keynote

Interesting hypertext/hypermedia look-back by Randall H. Trigg, Xerox Palo Alto Research Center - Hypermedia as Integration: Recollections, Reflections and Exhortations:

"I had decided that hypertext links needed "types" (really "labels") that could distinguish in what way the link was serving either as a traversible connection, a structuring means, or an argument representation."

"The great thing about the digital library craze is how much we're learning from librarians, not just how much we can teach them about technology."

Filed under: Mon, 06 Sep 2004 13:00:35 +0200
2004-08-31

Permanent link Faceted Navigation

DiamondWiki's Faceted Navigation:

"Faceted navigation lets people browse a website by using FacetedClassification to automatically generate relevant hyperlinks. If you want to see an example of faceted navigation in action, go to the BrowseFacets page and start clicking, paying attention to the categories on the left-hand side. Notice how pages can have both an "Author" and a "Subject", and you can navigate by either one. This may seem obvious to you, but the point is that pages are not restricted to a single position in one hierarchy -- this is what faceted classification is all about. It's nothing earth-shattering.

An essential part of FacetedNavigation is that the interface lets you view items that are in more than one category. In other words you can intersect two sets of items. So for example, you can view "items about diamond wiki that are authored by kim burchett", instead of being restricted to viewing "items about diamond wiki" or "items authored by kim burchett". Most hierarchical categorization systems only let you view one hierarchy at a time."
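
Intersecting facets as described above comes down to set intersection. A minimal Python sketch, using the facet names from the quote; the page-to-facet assignments and the helpers `build_index` and `browse` are hypothetical and not DiamondWiki's implementation.

```python
from collections import defaultdict

# Hypothetical pages, each with a value for the "Author" and "Subject" facets.
pages = {
    "FacetedNavigation": {"Author": "kim burchett", "Subject": "diamond wiki"},
    "BrowseFacets": {"Author": "kim burchett", "Subject": "faceted classification"},
    "InstallNotes": {"Author": "another author", "Subject": "diamond wiki"},
}


def build_index(pages):
    """Map each (facet, value) pair to the set of page titles carrying it."""
    index = defaultdict(set)
    for title, facets in pages.items():
        for facet, value in facets.items():
            index[(facet, value)].add(title)
    return index


def browse(index, **selected):
    """Return the pages that match every selected facet value."""
    sets = [index.get((facet, value), set()) for facet, value in selected.items()]
    return set.intersection(*sets) if sets else set()


index = build_index(pages)

# "items about diamond wiki that are authored by kim burchett"
print(browse(index, Subject="diamond wiki", Author="kim burchett"))
```

Selecting a single facet gives the broader listing, and each extra facet narrows it, which a single hierarchy cannot do without duplicating pages.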

Filed under: Tue, 31 Aug 2004 00:50:01 +0200
2004-07-16

Permanent link Ontology Tools Survey, Revisited

Michael Denny - Ontology Tools Survey, Revisited:

"Reference to taxonomies and ontologies by vendors of mainstream enterprise-application-integration (EAI) solutions are becoming commonplace. Popularly tagged as semantic integration, vendors like Verity, Modulant, Unicorn, Semagix, and many more are offering platforms to interchange information among mutually heterogeneous resources including legacy databases, semi-structured repositories, industry-standard directories and vocabularies like ebXML, and streams of unstructured content as text and media. Ontologies, for example, are being used to guide the extraction of semantic content from collections of plain-text documents describing medical research, consumer products, and business topics."

Filed under: Fri, 16 Jul 2004 01:00:04 +0200
2004-06-21

Permanent link The Google PC generation

Jon Udell - The Google PC generation:

"Job No. 1 for the Google PC would be to vacuum up all available sources of data. Job No. 2 would be to exploit that data to the hilt.

On the Google PC, you wouldn’t need third-party add-ons to index and search your local files, e-mail, and instant messages. It would just happen. The voracious spider wouldn’t stop there, though. The next piece of low-hanging fruit would be the Web pages you visit. These too would be stored, indexed, and made searchable. More ambitiously, the spider would record all your screen activity along with the underlying event streams.

[...] Instead of idly slacking most of the time, our PCs ought to be indexing, analyzing, correlating, and classifying."

Filed under: Mon, 21 Jun 2004 10:42:45 +0200
2004-06-14

Permanent link Metacrap

Found Cory Doctorow's great piece on why the Semantic Web will not exist - Metacrap: Putting the torch to seven straw-men of the meta-utopia:

"2. The problems

  • 2.1 People lie
  • 2.2 People are lazy
  • 2.3 People are stupid
  • 2.4 Mission: Impossible -- know thyself
  • 2.5 Schemas aren't neutral
  • 2.6 Metrics influence results
  • 2.7 There's more than one way to describe something"
Filed under: Mon, 14 Jun 2004 10:26:33 +0200
2004-06-09

Permanent link Rhizome

"Rhizome is a Wiki-like content management and delivery system that exposes the entire site -- content, structure, and metadata as editable RDF. This means that instead of just creating a site with URLs that correspond to a page of HTML, with Rhizome you can create URLs that represent just about anything, such as:

  • structural components of content (such as a bullet point or a definition).
  • abstract entities that can be presented in different ways depending on the context.
  • relationships between entities or content, such as annotations or categories."
Filed under: Wed, 09 Jun 2004 12:36:32 +0200
2004-05-04

Permanent link Here is a How to Topic Maps, Sir!

Alexander Johannesen's essay "Here is a How to Topic Maps, Sir!":

"The truth about relational databases is that they really are Topic Maps that are trying to get out. Think about what your RDBMS is trying to do; you have a lot of tables with information bits, and you create relations between them to represent something vital to your business requirements, write SQL to mirror that and try your best at fixing a user interface on top to make it all work. The more relations you've got, the more complex your model is going to be. And for what? To create an application that both a computer and human can handle well.

Where do you stop expanding your model and when? When it gets too complex? Too slow? Too unmaintainable? Too crazy to keep going? Too often you get bogged down in the design of models; what relations are hogs, which ones are necessary, which ones are not?"

Filed under: Tue, 04 May 2004 13:42:51 +0200

Permanent link Flamenco Search Interface

The Flamenco Search Interface: "We are creating a search interface framework, called Flamenco, whose primary design goal is to allow users to move through large information spaces in a flexible manner without feeling lost. A key property of the interface is the explicit exposure of hierarchical faceted metadata, both to guide the user toward possible choices, and to organize the results of keyword searches. The interface uses metadata in a manner that allows users to both refine and expand the current query, while maintaining a consistent representation of the collection's structure."
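
As a rough illustration of "refine and expand" over faceted metadata, here is a minimal Python sketch. It is not Flamenco's actual implementation: the sample collection, the facet names (`medium`, `period`) and the functions `search`, `facet_counts`, `refine` and `expand` are all hypothetical and invented for this example.

```python
from collections import Counter

# Hypothetical collection; facet names and values are made up for illustration.
COLLECTION = [
    {"id": 1, "medium": "painting", "period": "19th century"},
    {"id": 2, "medium": "painting", "period": "20th century"},
    {"id": 3, "medium": "sculpture", "period": "19th century"},
]


def search(query):
    """Return the items matching every facet=value constraint in the query."""
    return [
        item
        for item in COLLECTION
        if all(item.get(facet) == value for facet, value in query.items())
    ]


def facet_counts(results, facet):
    """Count how many results fall under each value of a facet.

    Showing these counts is one way to guide the user toward possible choices.
    """
    return Counter(item[facet] for item in results)


def refine(query, facet, value):
    """Narrow the query by adding one facet constraint."""
    return {**query, facet: value}


def expand(query, facet):
    """Widen the query by dropping the constraint on one facet."""
    return {f: v for f, v in query.items() if f != facet}


paintings = refine({}, "medium", "painting")
old_paintings = refine(paintings, "period", "19th century")
anything_old = expand(old_paintings, "medium")
```

Because the query is just a set of facet constraints, refining and expanding are symmetric operations, and the same facet list can be shown at every step.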

Filed under: Tue, 04 May 2004 00:11:03 +0200
2004-05-03

Permanent link The Brain Attic

Found an old piece written by Micah Dubinko - "The Brain Attic", where he's asking for Personal Information Management software. (He's written his own software now - using plain text files: "It's the Data, Stupid")

"What we really need is a better way for our computers to be our brain-attics, freeing us up to do whatever it is that we do best.

So, we need to be able to enter text, and shuffle existing content into the system. We also need to be able to store email and web pages and integrate with browser bookmarks. Contacts. Todo lists. Calendars. Anything that we're currently scribbling on yellow notes stuck on our monitors. And it needs to be searchable. Really quickly searchable, as in keystroke-at-a-time results.

Personal Information Managers (PIMs) have already been invented, right? Well, technically true, the late Lotus Agenda, Outlook, and Evolution being the top contenders. But something's still missing: despite these programs, people still have sticky notes, or worse, a physical desktop that looks like mine."

Filed under: Mon, 03 May 2004 12:02:03 +0200
2004-04-27

Permanent link Extreme Markup Languages papers

Looks like a great resource: The Aggregated Proceedings from the Extreme Markup Languages conferences.

Filed under: Tue, 27 Apr 2004 14:23:02 +0200
2004-04-21

Permanent link Topic Maps Visualisation

Old (May 2000), but Kent Fitch's thoughts about Topic Maps Visualisation are still interesting: "This document presents some ideas on how a topic map based structure could be browsed by an end-user. It isn't meant to define a proposed interface - just explore the benefits and drawbacks of one simple option based on the Windows File Explorer navigation paradigm."

Filed under: Wed, 21 Apr 2004 22:18:08 +0200
2004-04-13

Permanent link Metadata? Thesauri? Taxonomies? Topic Maps!

A long paper by Lars Marius Garshol: Metadata? Thesauri? Taxonomies? Topic Maps! - Making sense of it all

"To be faced with a document collection and not to be able to find the information you know exists somewhere within it is a problem as old as the existence of document collections. Information Architecture is the discipline dealing with the modern version of this problem: how to organize web sites so that users actually can find what they are looking for.

Information architects have so far applied known and well-tried tools from library science to solve this problem, and now topic maps are sailing up as another potential tool for information architects. This raises the question of how topic maps compare with the traditional solutions, and that is the question this paper attempts to address.

The paper argues that topic maps go beyond the traditional solutions in the sense that it provides a framework within which they can be represented as they are, but also extended in ways which significantly improve information retrieval."

Filed under: Tue, 13 Apr 2004 15:16:07 +0200

Permanent link Why is it so hard to learn topicmaps?

Peter Van Dijck asks: "Why is it so hard to learn topicmaps?"

"Topicmap tools are at the topicmap level, not at the real-life usefulness level. In other words, topicmap tools let you manage, create, and merge topicmaps. But they don't let you do anything specifically useful outside of the topicmap realm (like create a simple application for managing your CD's, like Access lets you do).

The level of abstraction of topicmaps is higher (which provides great Power and Flexibility) than that of databases. But that means that, to get from a topicmap to a useful application, you need more stuff in between."

Filed under: Tue, 13 Apr 2004 15:01:15 +0200

Permanent link Filling in the margins

Jon Udell on InfoWorld - "Filling in the margins":

"As I watch the students typing at the Dell PCs in the hallway, I realize that none of these kids has ever seen or used a card catalog. That's mostly a good thing. But when I joined a group of librarians on a panel last month, I was reminded that something useful has been lost: a tradition of local annotation. This isn't just old-fogy nostalgia. Librarians could talk to patrons through the medium of the library card, and although they weren't supposed to, patrons could talk back to librarians - and to one another. It was a useful back channel that online catalogs could have supported. But because it wasn't part of the official protocol, they didn't.

The fuzzy intersection of official and unofficial data has never been a comfort zone for information technologists. In chapter 4 of Klaus Kaasgaard's Software Design and Usability, Xerox's Palo Alto Research Center (PARC) alumnus Austin Henderson says that "one of the most brilliant inventions of the paper bureaucracy was the idea of the margin." There was always space for unofficial data, which traveled with the official data, and everybody knew about the relationship between the two."

Filed under: Tue, 13 Apr 2004 11:36:54 +0200
2004-03-29

Permanent link Personal knowledge management

Dave Pollard dreams about personal knowledge management:

"Highlighted on the virtual desktop are the current documents and messages that you last looked at. As it turns out, they consist of a report that you're researching, a web page that you were half finished reading, and a message that you were composing in reply to the web page. When you use the hand-cursor to 'open' these documents again, the Document Annotation Tool opens, and the hand cursor turns into a pencil cursor. The Document Annotation Tool converts each open document into a virtual piece of paper: You no longer have to concern yourself with what application was used to create the document, or what format it is in. You simply use the pencil cursor to highlight, add to, delete from, comment on, and cross-reference to other documents, exactly as you would instinctively use a real pencil to make comparable annotations on real paper. The Document Annotation Tool understands what you are doing and invisibly does the heavy lifting to translate your changes commensurately in the document's 'native' application (MS Word, HTML etc.)"

Filed under: Mon, 29 Mar 2004 15:01:56 +0200

Permanent link IPTC: Metadata for News

IPTC Metadata: Subject Reference System & NewsML Topicsets: "The IPTC creates and maintains for many years sets of terms to be assigned as metadata to news objects like text, photographs, graphics, audio- and video files and streams."

Filed under: Mon, 29 Mar 2004 12:09:42 +0200

Permanent link Metadata, Mark II

Jason Cook wrote an article on GeoURL, SMBmeta, Dublin Core, RDF and FOAF, on webmonkey:

"Well, META's not dead. In the pages that follow, I'll be giving you a bird's eye view of a few independent technologies, each aspiring to get useful metadata back into the Web. Some are homegrown, some corporate, and some academic, but all of them let you enhance your site with useful information and improve the ways your site is associated with other sites."

Filed under: Mon, 29 Mar 2004 11:42:20 +0200
2004-03-18

Permanent link Geeks and the Dijalog Lifestyle

Kendall Grant Clark starts a new column on XML.com, called "Hacking the Library":

"That's the curious space I and others like me inhabit today: Digital, but not purely digital; analog, but not only analog. We live in the space between these two, in the space carved out by their now haphazard, now principled mixture. It is a space worthy, or so I like to think, of its own name. I have taken to calling it "dijalog", that is, "digital plus analog". We're all -- at least all of us of a certain age -- dijaloggers now.

[...] Second, while geeks have lots of tools -- programming languages, data storage mechanisms, exchange formats, and global message passing systems of various kinds -- for managing their personal dijalog collections, we tend to be a bit weak on the details of ordering schemes.

In other words, we're geeks; we're not library or information scientists. But these -- computer and library science -- are kissing cousin fields, parasitic and dependent on one another in important, deep ways. Geeks can learn information and library science easily enough, but especially if they have a real, hackable motivation for doing so. I'm suggesting in this column what I intend to prove in future columns, namely, that the dijalog lifestyle, which is the one most of us are actually living, is uniquely suited to the confluence of geek hackery and certain parts of library science.

That's why I'm calling this series of columns Hacking the Library, because I want to share some of the library science tricks I've picked up in my own efforts to manage my dijalog lifestyle, and I want a motivation to learn new ones and share them with you. Thus, in the coming months, XML.com audience willing and if the creek don't rise, I'll be talking about things like

  • personal libraries as information problems, or why you need a spatial arrangement and information query scheme;
  • how to choose an ordering scheme for your media collection;
  • how to implement the Library of Congress at home;
  • how to use weblogs as a way to catalogue and categorize personal information;
  • how to use big-time metadata standards and techniques, like Dublin Core and faceted metadata, to manage dijalog artifacts;
  • how to manage non-textual artifacts like photos, videos, and music files;
  • why RDF and other Semantic Web technologies are ideal for dijalog management;
  • open source library frameworks, so you can make sure you get back the things that you lend;
  • how personal libraries can be spokes in the Digital Hub;
  • distributed collection management."
Filed under: Thu, 18 Mar 2004 09:55:19 +0100