Category: Tim’s Weblog

  • Amberfish

    “Amberfish is general purpose text retrieval software. Its distinguishing features are indexing/search of semi-structured text (i.e. both free text and multiply nested fields), built-in support for XML documents using the Xerces library, structured queries allowing generalized field/tag paths, hierarchical result sets (XML only), automatic searching across multiple databases (allowing modular indexing), and relatively low memory…

  • Flamenco Search Interface

    The Flamenco Search Interface: “We are creating a search interface framework, called Flamenco, whose primary design goal is to allow users to move through large information spaces in a flexible manner without feeling lost. A key property of the interface is the explicit exposure of hierarchical faceted metadata, both to guide the user toward possible…

  • Here is a How to Topic Maps, Sir!

    Alexander Johannesen’s essay “Here is a How to Topic Maps, Sir!”: “The truth about relational databases is that they really are Topic Maps that are trying to get out. Think about what your RDBMS is trying to do; you have a lot of tables with information bits, and you create relations between them to represent…

  • Bayesian classification using Rainbow

    Fascinating stuff: “Rainbow is a program that performs statistical text classification.” It can use Bayesian classification to automatically categorize documents. Jon Udell tried it out last year: “There’s been some discussion in the blog world about using a Bayesian categorizer to enable a person to discriminate along various interest/non-interest axes. I took a run…

  • The Brain Attic

    I found an old piece by Micah Dubinko, “The Brain Attic”, in which he asks for Personal Information Management software. (He has since written his own software, using plain text files: “It’s the Data, Stupid”.) “What we really need is a better way for our computers to be our brain-attics, freeing us up to do…