2005-11-30

SKOS Core

“SKOS Core provides a model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, ‘folksonomies’, other types of controlled vocabulary, and also concept schemes embedded in glossaries and terminologies.

The SKOS Core Vocabulary is an application of the Resource Description Framework (RDF), that can be used to express a concept scheme as an RDF graph.”
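To make the "concept scheme as an RDF graph" idea concrete, here is a minimal sketch in plain Python that stores a tiny scheme as a set of (subject, predicate, object) triples. The SKOS namespace and the property names (prefLabel, broader, inScheme) come from SKOS Core; the example.org scheme, the concepts, and the helper functions are made up for illustration, and literals are simplified to plain strings.

```python
# SKOS Core and RDF namespaces (real); the EX namespace below is hypothetical.
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/animals#"

# A tiny concept scheme expressed as a set of RDF-style triples.
triples = {
    (EX + "scheme", RDF_TYPE, SKOS + "ConceptScheme"),
    (EX + "mammals", RDF_TYPE, SKOS + "Concept"),
    (EX + "mammals", SKOS + "prefLabel", "Mammals"),
    (EX + "mammals", SKOS + "inScheme", EX + "scheme"),
    (EX + "cats", RDF_TYPE, SKOS + "Concept"),
    (EX + "cats", SKOS + "prefLabel", "Cats"),
    (EX + "cats", SKOS + "broader", EX + "mammals"),
    (EX + "cats", SKOS + "inScheme", EX + "scheme"),
}


def objects(graph, subject, predicate):
    """Return all objects of triples matching subject and predicate, sorted."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def broader_labels(graph, concept):
    """Return the preferred labels of the concepts directly broader than `concept`."""
    labels = []
    for parent in objects(graph, concept, SKOS + "broader"):
        labels.extend(objects(graph, parent, SKOS + "prefLabel"))
    return labels


if __name__ == "__main__":
    print(broader_labels(triples, EX + "cats"))
```

A real application would more likely use an RDF library and a serialization such as RDF/XML; the set of tuples is only meant to show the shape of the graph.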

Wed, 30 Nov 2005 14:13:00 +0000

Splunk

“How long will it take you to recover? Will you spend hours finding your way through log files and other IT data?

Splunk is the new way to see inside the data center. It’s search software that indexes all your fast moving IT data as it happens.”

Wed, 30 Nov 2005 11:35:00 +0000
2005-11-29

The Truth about Sessions

Chris Shiflett - The Truth about Sessions:

“This article introduces some techniques that can reliably provide statefulness as well as defend against session-based attacks such as impersonation (session hijacking).”

Tue, 29 Nov 2005 12:49:00 +0000

Prince

“Prince is a computer program that converts XML into PDF documents. Prince can read many XML formats, including XHTML and SVG. Prince formats documents according to style sheets written in CSS.”

Tue, 29 Nov 2005 10:46:00 +0000
2005-11-25

Alfresco

“Alfresco is an open source, open-standards content repository built by the most experienced content management team that includes the co-founder of Documentum.”
(Take a look at the tour and the architecture diagram.)

Fri, 25 Nov 2005 11:18:00 +0000
2005-11-24

The fine art of programming

John Lim - The fine art of programming:

“Here’s a list of excellent online programming guides that i am compiling.”

Thu, 24 Nov 2005 11:12:00 +0000
2005-11-23

Minutes PHP Developers Meeting

Derick Rethans’ minutes of the PHP Developers Meeting, held in Paris on November 11th and 12th, 2005, to plan PHP 6. Some interesting excerpts:

1.1 Unicode on/off modes: “[…] We also discussed whether we should even allow Unicode mode to be turned off as current micro benchmarks show that the Unicode implementations of some of the string functions are up to 300% slower, and whole applications up to 25% slower. Disallowing Unicode mode to be turned off is expected to slow down the adoption of PHP 6 too as many ISPs would be reluctant to install a version that immediately slows down the applications of their users.”

2.1 register_globals: “[…] We are going to remove the functionality.”

2.2 magic_quotes: “[…] We remove the magic_quotes feature from PHP.”

3.5 Fileinfo extension in the distribution: “[…] The mime_magic extension doesn’t work very well, and there is an extension in PECL (Fileinfo). We suggest to include this extension into the core, and enable it by default as MIME-type detection is something that most web applications need.”

4.5 Cleanup for {} vs. []: “[…] 1. We will undeprecate [] for accessing characters in strings. 2. {} will be deprecated in PHP 5.1.0 with an E_STRICT and removed in PHP 6. 3. For both strings and arrays, the [] operator will support substr()/array_slice() functionality.”

6.1 Add an opcode cache to the distribution (APC): “[…] 1. We include APC in the core distributions 2. APC will not be turned on by default.”

6.6 E_STRICT on by default: “[…] As we want to expose the language level warnings a bit more, and because of having all error levels in E_ALL, except E_STRICT is confusing we will be adding E_STRICT to E_ALL. As the current default is E_ALL & ~E_NOTICE we will effectively turn on E_STRICT by default.”
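The E_STRICT excerpt is easier to follow with the bitmask arithmetic written out. The sketch below uses Python rather than PHP, with the constant values PHP 5.0/5.1 document (E_NOTICE = 8, E_STRICT = 2048, E_ALL = 2047). The "proposed" values are only my reading of what the minutes describe, not anything taken from a released PHP version, and the helper name is hypothetical.

```python
# Error-level constants as documented for PHP 5.0/5.1.
E_NOTICE = 8
E_STRICT = 2048
E_ALL_PHP5 = 2047  # all levels below E_STRICT; E_STRICT itself is not included


def reports(level_mask, level):
    """Return True if the error_reporting mask has the given level switched on."""
    return bool(level_mask & level)


# Current default per the minutes: E_ALL & ~E_NOTICE.
current_default = E_ALL_PHP5 & ~E_NOTICE

# Assumption: the plan described in the minutes, with E_STRICT folded into E_ALL.
proposed_e_all = E_ALL_PHP5 | E_STRICT
proposed_default = proposed_e_all & ~E_NOTICE

if __name__ == "__main__":
    print("E_STRICT reported today:   ", reports(current_default, E_STRICT))
    print("E_STRICT reported proposed:", reports(proposed_default, E_STRICT))
```

Because the default expression stays E_ALL & ~E_NOTICE, widening E_ALL is enough to switch E_STRICT on without anyone touching their configuration.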

Wed, 23 Nov 2005 10:46:00 +0000

Open Source Invades the Enterprise

Matt Rand at Forbes.com - Open Source Invades the Enterprise:

“Pfizer recently embarked on a $25,000 pilot program, where it set up an open-source LAMP architecture next to BEA’s Weblogic J2EE software. It used each software framework to build an application that pulled data from an Oracle database. What surprised Pfizer’s Martin Brodbeck, the director of architecture for the company’s Global Pharmaceutical Group, was that the LAMP software cut development time dramatically over J2EE.”

Wed, 23 Nov 2005 09:27:00 +0000
2005-11-22

Decoupling Application Logic, Persistence, and Flow: The Model Technique

Michael Nash at developer.com - Decoupling Application Logic, Persistence, and Flow: The Model Technique:

“The next step beyond separation of persistence and business logic can be the separation of the application control flow. Business logic classes in this case are written in such a way that they are unaware of how they were called, or what business logic element will be called next. They still require certain inputs, of course, and produce results in some fashion (again, often using the bean pattern to allow result properties to be accessed), but they are a single link in a chain. Some external mechanism is used to control application flow, either another class, or a driver class that reads the sequencing and navigation information from configuration.

[…] Long-time users of Unix-style operating systems will be familiar with the pattern described here, as it is a lot like the Unix command philosophy: Keep each command simple, make it do one thing and do it well, and provide a powerful means to assemble multiple commands into complex applications. In the Unix world, this is achieved by shell scripts and the pipeline technique. The same ideas can be applied to Java applications, with similar powerful results.

Many developers, of course, will recognize this technique as the beginnings of a full workflow pattern, where application logic steps can be combined in sequences or “flows” as required, and where the decisions at each step as to what the next step should be (or what the choices for next steps are, if there are several), are in fact left up to the workflow engine, driven by a sophisticated configuration file. This configuration file can even in many cases be created graphically, allowing a developer to literally draw the sequence of operations of the application required, drawing more and more from a pool of re-usable business logic components, and inventing each individual wheel only once, instead of repeatedly.”
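A rough sketch of the idea, in Python rather than the article's Java: each business-logic step only reads from and writes to a shared context, and a separate driver decides the order from configuration. The step names, the order data, and the flow list are all invented for illustration; a real setup would load the flow from a file or hand it to a workflow engine.

```python
def load_order(context):
    """Business step: fetch the order (hard-coded here in place of a persistence layer)."""
    context["order"] = {"id": context["order_id"], "amount": 120.0}
    return context


def apply_discount(context):
    """Business step: apply a flat 10% discount; unaware of what runs before or after."""
    context["order"]["amount"] *= 0.9
    return context


def format_receipt(context):
    """Business step: produce a printable result from whatever the context holds."""
    order = context["order"]
    context["receipt"] = f"Order {order['id']}: {order['amount']:.2f}"
    return context


# Registry of reusable steps, looked up by name.
STEPS = {
    "load_order": load_order,
    "apply_discount": apply_discount,
    "format_receipt": format_receipt,
}

# The "configuration": sequencing lives here, not inside the steps.
FLOW = ["load_order", "apply_discount", "format_receipt"]


def run_flow(flow, context):
    """Driver: run the named steps in order, passing the context along like a pipeline."""
    for name in flow:
        context = STEPS[name](context)
    return context


if __name__ == "__main__":
    print(run_flow(FLOW, {"order_id": 7})["receipt"])
```

Dropping "apply_discount" from FLOW changes the application's behaviour without editing any step, which is the shell-pipeline quality the article is after.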

Tue, 22 Nov 2005 15:57:00 +0000
2005-11-18

Analyzing Statistics with GNU R

Kevin Farnham at ONLamp.com - Analyzing Statistics with GNU R:

“Even for people who aren’t expert statisticians, the power of R is alluring. Working interactively or using an R script, with just a few lines of code a user can perform complex analyses of large data sets, produce graphics depicting the features and structure of the data, and perform statistical analyses that can quickly answer questions about the data. This article introduces R and demonstrates a small slice of its capabilities, using data from the stock market and real estate industry as input.”

Fri, 18 Nov 2005 10:21:00 +0000
2005-11-17

Google Base is interesting

Simon Willison - Google Base is interesting:

“Base is a very interesting product for a whole bunch of reasons. The data model is surprisingly simple on the surface: all items have a title, description, (optional) external URL, a “type” and a set of labels (a.k.a. tags) and “attributes”. Attributes are something for tag enthusiasts to get excited by - they’re name/value pairs that are kind of like tags in that you can apply them to anything, but more structured and with a greater level of implied meaning.

[…] There’s definitely a trend towards this kind of loose data model at the moment. JotSpot allows all pages within a wiki to have as many extra name/value attribute pairs as you like (even the wiki body itself is internally implemented as a special attribute), and Ning works along similar lines.”
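The data model Simon describes is small enough to sketch directly. This is not Google's schema or API, just a guess at the shape from the description above: fixed fields for title, description, optional URL and type, plus free-form labels and name/value attributes. The class, field, and function names are my own.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Item:
    """One item in a loose, Base-like data model (hypothetical field names)."""
    title: str
    description: str
    item_type: str
    url: Optional[str] = None                        # external URL is optional
    labels: set = field(default_factory=set)         # plain tags
    attributes: dict = field(default_factory=dict)   # name/value pairs


def find_by_attribute(items, name, value):
    """Return the items whose attribute `name` equals `value`."""
    return [item for item in items if item.attributes.get(name) == value]


if __name__ == "__main__":
    items = [
        Item("Bike", "Red road bike", "product",
             labels={"cycling"}, attributes={"colour": "red"}),
        Item("Flat", "Two rooms, city centre", "housing",
             attributes={"rooms": 2}),
    ]
    print([item.title for item in find_by_attribute(items, "colour", "red")])
```

The attributes are where the extra meaning lives: a label can only say "red", while an attribute says that the colour is red, which is what makes structured queries possible.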

Thu, 17 Nov 2005 12:19:00 +0000

Django

“Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design.

Developed and used over the past two years by a fast-moving online-news operation, Django was designed from scratch to handle two challenges: the intensive deadlines of a newsroom and the stringent requirements of experienced Web developers. It has convenient niceties for developing content-management systems, but it’s an excellent tool for building any Web site.

Django focuses on automating as much as possible and adhering to the DRY principle.”

Thu, 17 Nov 2005 12:16:00 +0000

Grandma’s don’t use computers

Ben Meyer - Grandma’s don’t use computers:

“So who should software developers target for their ease of use testing? Who won’t put up with software that is hard to use? Middle aged men that have kids or commonly referred to as “dads”. They are young enough to know about the latest technology, old enough to have money to buy them, but don’t have time for applications that don’t just work.

Because there are kids in the house dads have very little time. They don’t have time to try out every single option in a program or tweak their system like a twenty two year old college student can. They just want things to work on the first try.”

Thu, 17 Nov 2005 09:14:00 +0000
2005-11-16

Dabble DB

“Dabble combines the best of group spreadsheets, custom databases, and intranet web applications into a new way to manage and share your information online.”

Wed, 16 Nov 2005 22:38:00 +0000

Google Base

“Google Base is a place where you can easily submit all types of online and offline content that we’ll host and make searchable online. You can describe any item you post with attributes, which will help people find it when they search Google Base. In fact, based on the relevance of your items, they may also be included in the main Google search index and other Google products like Froogle, Google Base and Google Local.”

Wed, 16 Nov 2005 10:29:00 +0000
2005-11-08

The Good, the Bad and the Ugly

Richard Davey - The Good, the Bad and the Ugly:

“The ease of developing with PHP has led to the creation of this script gold mine, and while it can be a wonder to explore there are many factors you should take into consideration before going on a downloading frenzy.

Will the script you are about to install bring your server to a halt? Does it open up glaring security issues? Is there a mess of complex and poorly written code behind the sleek HTML exterior? In this article we get our pick axes ready, delve deep and bring back examples that serve one purpose: to show you what to look out for in other people’s code. By looking at good, bad and just downright ugly snippets of code you can gain a far better understanding of the overall quality of a PHP script.”

Tue, 08 Nov 2005 22:59:00 +0000
2005-11-07

Wikidata

“Wikidata is a proposed wiki-like database for various types of content. This project as proposed here requires significant changes to the software (or possibly a completely new software) but has the potential to centrally store and manage data from all Wikimedia projects, and to radically expand the range of content that can be built using wiki principles.”

Mon, 07 Nov 2005 21:30:00 +0000
2005-11-05

Maybe it’s Not Just Ruby on Rails

chromatic - Maybe it’s Not Just Ruby on Rails:

“In my mind, the issue isn’t “Ruby on Rails is more flexible and capable than standard J2EE or .NET for any project under a (very high) threshold of complexity”. The real point is that the simplicity, flexibility, and abstraction possibilities offered by dynamic languages and well-designed libraries – as well as a talent for exploiting radical simplicity, extracting commonalities from actual working code, and knowing when too much flexibility makes you less agile – offer a huge advantage over languages and libraries and frameworks and platforms that assume you need a lot of hand-holding to solve a really hard problem.

Yes, Ruby on Rails does what it does very well. It’s not the only thing that does, though. I wonder perhaps if some of the buzz and glow is that it’s new and shiny (in comparison), so that people haven’t already formed their own opinions about it.”

Sat, 05 Nov 2005 22:55:00 +0000
2005-11-04

Reinventing Email using REST

Paul Prescod - Reinventing Email using REST:

“As an educational tool, this article will describe how to re-engineer a familiar application, email, as a Web Service using HTTP and the principles of Web Architecture and REpresentational State Transfer.”

Fri, 04 Nov 2005 13:08:00 +0000

Selenium

“Selenium is a test tool for web applications. Selenium tests run directly in a browser, just as real users do. And they run in Internet Explorer, Mozilla and Firefox on Windows, Linux and Macintosh. […] Installed with your application webserver, Selenium automatically deploys its JavaScript automation engine – the Browser Bot – to your browser when you point it at the Selenium install point on your webserver.”

Fri, 04 Nov 2005 10:01:00 +0000

Wikipedia Notes

Tim Bray - Wikipedia Notes:

“As of today, there are 124 servers, fairly heterogeneous, although these days they’ve pretty well standardized on dual-Opteron boxes. The MediaWiki software is PHP-based, mostly running on Fedora; I wonder if this is the world’s largest-scale PHP deployment, or would Yahoo top that? They get a pretty good hit rate on their Squid caches; basically, the whole system scales about linearly with the number of servers they deploy.

[…] Having said that, speaking both as a user and contributor, I find that Wikipedia’s performance is mostly pretty terrible; usable, but irritatingly slow. So there’s certainly room for improvement.”

Fri, 04 Nov 2005 09:36:00 +0000
2005-11-03

Preview: Windows Workflow Foundation

Oliver Rist at InfoWorld - Preview: Windows Workflow Foundation:

“WWF creates a class of application that is rarely seen except when created through extraordinary effort: A distributed user-facing application.

From a developer’s perspective, WWF is a toolbox of abstractions for workflow-related activities such as receiving and sending Web services calls, taking conditional branches from an otherwise sequential workflow during the course of its execution, firing and sinking (receiving) asynchronous events and managing nested workflows. WWF abstracts and extends familiar paradigms in ways that change how developers think.”

Thu, 03 Nov 2005 15:49:00 +0000

Hardware Layouts for LAMP Installations

John Allspaw (Flickr) has published a nice set of presentation slides titled Hardware Layouts for LAMP Installations [PowerPoint], covering hardware requirements, MySQL load balancing, and caching for large-scale LAMP installations.

Thu, 03 Nov 2005 12:55:00 +0000
2005-11-02

Oracle 10g XE and PHP

Harry Fuecks at SitePoint - Oracle 10g XE and PHP:

“In case you missed it, yesterday Oracle announced a free (as in beer) version of their database - Oracle 10g Express Edition (XE) - basically a ‘lite’ version - some industry analysis here. Significance of this move aside, more interesting is having a play. Managed to get the equivalent of a ‘Hello World’ from PHP to Oracle up in under 1.5 hours today (ran into a specific glitch that required a re-install otherwise would have been less time). Here’s how…”

Wed, 02 Nov 2005 10:07:00 +0000

Producing Open Source Software

Karl Fogel: “Producing Open Source Software is a book about the human side of open source development. It describes how successful projects operate, the expectations of users and developers, and the culture of free software.

Producing Open Source Software is available in bookstores, and you can browse or download it here.”

Wed, 02 Nov 2005 09:58:00 +0000
2005-11-01

Learning from THE WEB

Adam Bosworth at ACM Queue - Learning from THE WEB:

“Successful systems on the Web are bottom-up. They don’t mandate much in a top-down way. Instead, they control themselves through tipping points. For example, Flickr doesn’t tell its users what tags to use for photos. Far from it. Any user can tag any photo with anything (well, I don’t think you can use spaces). But, and this is a key but, Flickr does provide feedback about the most popular tags, and people seeking attention for their photos, or photos that they like, quickly learn to use that lexicon if it makes sense. It turns out to be amazingly stable.

[…] It is time that the database vendors stepped up to the plate and started to support a native RSS 2.0/Atom protocol and wire format; a simple way to ask very general queries; a way to model data that encompasses trees and arbitrary graphs in ways that humans think about them; far more fluid schemas that don’t require complex joins to model variations on a theme about anything from products to people to places; and built-in linear scaling so that the database salespeople can tell their customers, in good conscience, for this class of queries you can scale arbitrarily with regard to throughput and extremely well even with regard to latency, as long as you limit yourself to the following types of queries. Then we will know that the database vendors have joined the 21st century.”
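The tag feedback loop Bosworth describes for Flickr needs very little machinery: count how many photos carry each tag and show users the leaders. A minimal sketch, with made-up photo data and a hypothetical function name; it says nothing about how Flickr actually computes its popular tags.

```python
from collections import Counter


def popular_tags(photo_tags, top_n=3):
    """Return the top_n (tag, photo_count) pairs, most used first, ties broken by name."""
    counts = Counter()
    for tags in photo_tags:
        counts.update(set(tags))  # count each tag at most once per photo
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))[:top_n]


if __name__ == "__main__":
    photos = [
        ["sunset", "beach"],
        ["sunset", "sky"],
        ["sunset", "beach", "beach"],
    ]
    print(popular_tags(photos, top_n=2))
```

Nothing here tells a user which tags to pick; the list only reports what others already chose, which is the bottom-up control the article is pointing at.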

Tue, 01 Nov 2005 21:10:00 +0000