2004-05-27

The State of Workflow

Tom Baeyens on The State of Workflow:

"When talking about an RDBMS in a software development team most people will get the picture and shake their heads slightly up and down confirming they understand what you're saying. When using workflow terminology, the same crowds will shake their heads similarly but this time, every person will have a very different picture."

"[...] I really believe a WFMS [workflow management system] is the missing link in enterprise development. First of all, I want to state that the default way of incorporating business process logic into enterprise software is a scattered approach. This means that the logic of the business process is spread over various systems like the EJB-container, database-triggers, message broker and the likes. No need to mention that this leads to unmaintainable software. As a consequence, these organizations will think of changing their business process software only as a last resort. They often prefer to change their processes to the software. This argument also applies to a larger extent to ERP software packages. In a second attempt, suppose we are aware of this problem and really want to centralize all code related to one process. This is fine for one business process, but if you implement multiple processes, you see duplication of code that manages state and process variables. Then, in a third approach, we could extract the duplicated code and put it in a central library. Well, that is actually a WFMS so don't bother implementing it yourself and choose one from the list below."
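
To make the "third approach" concrete, here is a minimal sketch of the shared state-and-variables code a WFMS centralizes. Everything in it (the `ProcessDefinition` and `ProcessInstance` names, the expense-claim process) is a hypothetical illustration, not taken from Baeyens's article or from any real workflow product.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessDefinition:
    """A named process: a start state plus allowed (state, event) -> state moves."""
    name: str
    start: str
    transitions: dict  # {(state, event): next_state}


@dataclass
class ProcessInstance:
    """One running case of a process: its current state and process variables."""
    definition: ProcessDefinition
    state: str = ""
    variables: dict = field(default_factory=dict)

    def __post_init__(self):
        # A new instance begins in the definition's start state.
        if not self.state:
            self.state = self.definition.start

    def signal(self, event, **updates):
        """Apply an event; reject it if the current state does not allow it."""
        key = (self.state, event)
        if key not in self.definition.transitions:
            raise ValueError(f"{event!r} is not allowed in state {self.state!r}")
        self.variables.update(updates)
        self.state = self.definition.transitions[key]
        return self.state


# Hypothetical process used only for illustration.
expense = ProcessDefinition(
    name="expense-claim",
    start="submitted",
    transitions={
        ("submitted", "approve"): "approved",
        ("submitted", "reject"): "rejected",
        ("approved", "pay"): "paid",
    },
)

claim = ProcessInstance(expense, variables={"amount": 120})
claim.signal("approve", approver="alice")
claim.signal("pay")
print(claim.state, claim.variables)
```

A second process would only need another `ProcessDefinition`; the state and variable handling is written once, which is the duplication the quote says a WFMS removes.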

Thu, 27 May 2004 11:44:28 +0000
2004-05-25

hiercat

"hiercat is an automatic text classifier which uses the hierarchical structure of class labels to improve classification performance. The model it uses is that of Gaussier, et al."

Tue, 25 May 2004 11:55:54 +0000

Return-codes vs. Exceptions, Part 129

Doug Ross on Return-codes vs. Exceptions:

"Software quality, in general, sucks. The reason for this is that many developers are too lazy to instrument, monitor and respond to all sorts of strange conditions.

In other words, many of us are undisciplined. We're more worried about "readability" (and I disagree with that contention as well - but I'll get to that) than whether or not our software will kill anyone, debit the wrong account by a million bucks, or screw up the actuarial table for 83 year-old transvestites.

Like it or not, dealing with aberrant conditions is a contract we agree to when we decide to be professional and responsible software developers. Treating "abnormal behaviour inline with normal execution" is a misstatement. We need to deal with the unexpected. And the best place to do it is the place where you can "unwind" the logic that's gone bad."
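
As a rough illustration of handling a failure at the place where the logic can be unwound, here is a small return-code style sketch. The account functions and names are hypothetical and are not from Ross's post; the same unwinding could equally sit in an exception handler.

```python
def debit(accounts, name, amount):
    """Return None on success, or an error string (a return code) on failure."""
    if accounts.get(name, 0) < amount:
        return f"insufficient funds in {name}"
    accounts[name] -= amount
    return None


def credit(accounts, name, amount):
    """Return None on success, or an error string if the account is unknown."""
    if name not in accounts:
        return f"unknown account {name}"
    accounts[name] += amount
    return None


def transfer(accounts, source, target, amount):
    """Move money, checking every step and undoing the debit if the credit fails."""
    error = debit(accounts, source, amount)
    if error:
        return error  # nothing has changed yet, so there is nothing to unwind
    error = credit(accounts, target, amount)
    if error:
        accounts[source] += amount  # unwind the debit here, where we know it happened
        return error
    return None


accounts = {"alice": 100, "bob": 0}
failed = transfer(accounts, "alice", "carol", 50)
succeeded = transfer(accounts, "alice", "bob", 30)
print(failed, succeeded, accounts)
```

The failed transfer leaves the balances untouched because `transfer` is the one place that knows a debit has already been applied.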

Tue, 25 May 2004 07:29:51 +0000
2004-05-24

Scaling Oracle and PHP

George Schlossnagle on Scaling Oracle and PHP: "Learn generic techniques and designs for writing manageable, scalable, and fast PHP code that directly relate to using Oracle Database."

Mon, 24 May 2004 14:01:35 +0000
2004-05-17

What WordPress Does Right

Lauren Wood - What WordPress Does Right:

"So here's Lauren's Product Management 101, using WordPress as the example.

  • It's easy to find out what the software does
  • It does what it claims to
  • It looks like people still work on it
  • There’s some hope of getting help with problems"

Mon, 17 May 2004 05:22:24 +0000
2004-05-13

Taste for Makers

Paul Graham's essay Taste for Makers:

"Relativism is fashionable at the moment, and that may hamper you from thinking about taste, even as yours grows. But if you come out of the closet and admit, at least to yourself, that there is such a thing as good and bad design, then you can start to study good design in detail."

"Intolerance for ugliness is not in itself enough. You have to understand a field well before you develop a good nose for what needs fixing. You have to do your homework. But as you become expert in a field, you'll start to hear little voices saying, What a hack! There must be a better way. Don't ignore those voices. Cultivate them. The recipe for great work is: very exacting taste, plus the ability to gratify it."

Thu, 13 May 2004 08:25:23 +0000
2004-05-07

Workflow versus Process Automation

Martin Roberts on Workflow versus Process Automation: "When a Process fails where do you need to route the fault to? Normally a human - so why do most tools make this a cumbersome task? Why do these so called next generation tools find dealing with people such an alien idea? I believe the answer lies in the fact that most of these emerging tools have been built by people used to handling classes that rarely touch humans directly. They tend to be focused on the J2EE/.Net like frameworks which are low level in the inspirations and have failed to take into account the gains of the 4GL world of the early 1990's."

Jon Udell comments: "I met Martin at XML 2003 and we had a fascinating hour-long conversation. The point he makes here -- that humans are the exception handlers in automated systems, and that we need to design accordingly -- is one I've made too."

Fri, 07 May 2004 09:47:33 +0000
2004-05-06

The Sum of Ant

Ken Arnold on The Sum of Ant:

"Ant is nothing more than the sum of its parts

By this I mean that ant has not learned the basic power of composition, building things out of smaller parts. This was the great insight that our Unix forebears bequeathed us toolsmiths, and it's pretty sad to see it forgotten. In Unix this is done with pipes -- the output of one program can become the input of another. Along with conventions about program output, this allows you to build up something that is more than the sum of its parts. A utility that finds files in the file system can generate a list of files for any purpose: removal, editing, rebuilding, copying, printing, ...

Ant has many kinds of tasks. These are portable because they are written in Java, and these task implementations can rely upon operating system abstractions in the underlying ant platform.

But they almost never work together. There are some common tools for building up lists of files, but if you want a list of files from any other source, good luck. This is why there are two different tasks (one optional) that calculate lists of Java class dependencies, and they are incompatible. And if you want a list of Java class dependencies for a task that one of these two tasks can't handle, good luck.

A common quote in our biz is the observation that a good software tool will do things the author never expected. I doubt that writers of ant tasks are very often surprised."
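
The pipe-style composition Arnold describes can be sketched with generators, where each small stage consumes the output of the previous one. The function names below (`find_files`, `with_suffix`, `basenames`) are hypothetical illustrations and have nothing to do with Ant's own API.

```python
import os
import tempfile


def find_files(root):
    """Yield every file path under root, like a simple file-finding utility."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            yield os.path.join(dirpath, filename)


def with_suffix(paths, suffix):
    """Keep only the paths that end with the given suffix."""
    return (path for path in paths if path.endswith(suffix))


def basenames(paths):
    """Strip the directory part from each path."""
    return (os.path.basename(path) for path in paths)


# Build a throwaway directory so the example is self-contained.
with tempfile.TemporaryDirectory() as root:
    for name in ("Main.java", "Util.java", "notes.txt"):
        open(os.path.join(root, name), "w").close()
    # Each stage takes the previous stage's output, as in a shell pipeline.
    java_sources = sorted(basenames(with_suffix(find_files(root), ".java")))

print(java_sources)
```

Because every stage accepts and returns a plain stream of paths, a list of files from any other source can be fed into `with_suffix` or `basenames` unchanged.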

Thu, 06 May 2004 20:54:36 +0000

Xapian

"Xapian is an Open Source Probabilistic Information Retrieval library, released under the GPL. It's written in C++, and bindings are under development to allow use from other languages (Perl, Python, and PHP are working; Java will be available shortly).

Xapian is designed to be a highly adaptable toolkit to allow developers to easily add advanced indexing and search facilities to their own applications."

Thu, 06 May 2004 15:16:30 +0000

Why is Distributed so Hard?

Dale Asberry on the importance of loose coupling - Why is Distributed so Hard?:

"In some ways, marriage vows do a disservice to the richer subtleties in intimate human interaction. Namely, two people don't come together to become one, they come together to become three! There will always be the self and the other. The third ingredient is the relationship itself. Unless the relationship is tended to by both people, it won't work. The most successful couples:

  • address expectations through rituals (protocols) and communication
  • adjust to the situation
  • are forgiving of transgressions
  • interdependent, neither independent nor dependent"

Thu, 06 May 2004 07:52:25 +0000
2004-05-05

Amberfish

"Amberfish is general purpose text retrieval software. Its distinguishing features are indexing/search of semi-structured text (i.e. both free text and multiply nested fields), built-in support for XML documents using the Xerces library, structured queries allowing generalized field/tag paths, hierarchical result sets (XML only), automatic searching across multiple databases (allowing modular indexing), and relatively low memory requirements during indexing (and the ability to index documents larger than available memory). Other features include standard Boolean queries, right truncation, phrase searching, relevance ranking, support for multiple documents per file, and easy integration with other UNIX tools. The software architecture is also designed to permit proximity queries and incremental indexing; however, they are not fully implemented at present."

Wed, 05 May 2004 12:50:36 +0000
2004-05-04

Bayesian classification using Rainbow

Fascinating stuff: "Rainbow is a program that performs statistical text classification." It can use Bayesian classification to automatically categorize documents.

Jon Udell tried it out last year: "There's been some discussion in the blog world about using a Bayesian categorizer to enable a person to discriminate along various interest/non-interest axes. I took a run at this recently and, although my experiments haven't been wildly successful, I want to report them because I think the idea may have merit."
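
For a feel of what Bayesian text categorization does, here is a minimal multinomial naive Bayes sketch with add-one smoothing. It is not Rainbow's implementation or Udell's experiment; the `NaiveBayes` class and the training sentences are made up for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes text classifier with add-one smoothing."""

    def __init__(self):
        self.doc_counts = Counter()               # documents seen per label
        self.word_counts = defaultdict(Counter)   # word frequencies per label
        self.vocabulary = set()

    def train(self, label, text):
        words = text.lower().split()
        self.doc_counts[label] += 1
        self.word_counts[label].update(words)
        self.vocabulary.update(words)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + len(self.vocabulary)
            # Log prior plus the sum of smoothed log likelihoods per word.
            score = math.log(n_docs / total_docs)
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / denominator)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical training data along an interest/non-interest axis.
classifier = NaiveBayes()
classifier.train("interesting", "python text classification library")
classifier.train("interesting", "bayesian search indexing toolkit")
classifier.train("boring", "cheap watches buy now")
classifier.train("boring", "buy cheap pills now")

print(classifier.classify("bayesian text search"))
print(classifier.classify("buy cheap watches"))
```

With so little training data the result is only a toy, but the structure (count words per label, then compare smoothed log probabilities) is the core of the approach.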

Tue, 04 May 2004 17:24:48 +0000

Here is a How to Topic Maps, Sir!

Alexander Johannesen's essay "Here is a How to Topic Maps, Sir!":

"The truth about relational databases is that they really are Topic Maps that are trying to get out. Think about what your RDBMS is trying to do; you have a lot of tables with information bits, and you create relations between them to represent something vital to your business requirements, write SQL to mirror that and try your best at fixing a user interface on top to make it all work. The more relations you've got, the more complex your model is going to be. And for what? To create an application that both a computer and a human can handle well.

Where do you stop expanding your model and when? When it gets too complex? Too slow? Too unmaintainable? Too crazy to keep going? Too often you get bogged down in the design of models; what relations are hogs, which ones are necessary, which ones are not?"

Tue, 04 May 2004 11:42:51 +0000
2004-05-03

Flamenco Search Interface

The Flamenco Search Interface: "We are creating a search interface framework, called Flamenco, whose primary design goal is to allow users to move through large information spaces in a flexible manner without feeling lost. A key property of the interface is the explicit exposure of hierarchical faceted metadata, both to guide the user toward possible choices, and to organize the results of keyword searches. The interface uses metadata in a manner that allows users to both refine and expand the current query, while maintaining a consistent representation of the collection's structure."
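
A rough sketch of how faceted metadata can both narrow a result set and show the remaining choices. It uses flat facets rather than Flamenco's hierarchical ones, and the items, facet names, and functions are hypothetical, not part of Flamenco.

```python
from collections import Counter

# Hypothetical collection; each item carries a value for every facet.
ITEMS = [
    {"title": "Harbor at Dawn", "medium": "painting", "period": "19th century"},
    {"title": "Seated Figure", "medium": "sculpture", "period": "19th century"},
    {"title": "Grey Composition", "medium": "painting", "period": "20th century"},
]


def refine(items, **selected):
    """Keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items, facets):
    """Count how many of the items fall under each value of each facet."""
    return {facet: Counter(item[facet] for item in items) for facet in facets}


paintings = refine(ITEMS, medium="painting")
print([item["title"] for item in paintings])
print(facet_counts(paintings, ["period"]))
```

Showing the counts next to each remaining facet value is what lets a user see where a refinement leads before choosing it; dropping a selected facet expands the query again.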

Mon, 03 May 2004 22:11:03 +0000

The Brain Attic

Found an old piece written by Micah Dubinko - "The Brain Attic", where he's asking for Personal Information Management software. (He's written his own software now - using plain text files: "It's the Data, Stupid")

"What we really need is a better way for our computers to be our brain-attics, freeing us up to do whatever it is that we do best.

So, we need to be able to enter text, and shuffle existing content into the system. We also need to be able to store email and web pages and integrate with browser bookmarks. Contacts. Todo lists. Calendars. Anything that we're currently scribbling on yellow notes stuck on our monitors. And it needs to be searchable. Really quickly searchable, as in keystroke-at-a-time results.

Personal Information Managers (PIMs) have already been invented, right? Well, technically true, the late Lotus Agenda, Outlook, and Evolution being the top contenders. But something's still missing: despite these programs, people still have sticky notes, or worse, a physical desktop that looks like mine."

Mon, 03 May 2004 10:02:03 +0000