2005-11-30
“SKOS Core
provides a model for expressing the basic structure and content of
concept schemes such as thesauri, classification schemes, subject
heading lists, taxonomies, ‘folksonomies’, other types of controlled
vocabulary, and also concept schemes embedded in glossaries and
terminologies.
The SKOS Core Vocabulary is an application of
the Resource Description Framework (RDF), that can be used to express a
concept scheme as an RDF graph.”
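A minimal sketch of what "a concept scheme as an RDF graph" can look like, using plain Python tuples instead of an RDF library. The example.org namespace and the two-concept thesaurus are made up for illustration; prefLabel, broader, inScheme, Concept and ConceptScheme are terms from the SKOS Core vocabulary.

```python
# Sketch: a tiny thesaurus as RDF-style (subject, predicate, object) triples.
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/animals#"  # hypothetical namespace, not from the spec

triples = {
    (EX + "scheme", RDF + "type", SKOS + "ConceptScheme"),
    (EX + "mammals", RDF + "type", SKOS + "Concept"),
    (EX + "mammals", SKOS + "prefLabel", "Mammals"),
    (EX + "mammals", SKOS + "inScheme", EX + "scheme"),
    (EX + "cats", RDF + "type", SKOS + "Concept"),
    (EX + "cats", SKOS + "prefLabel", "Cats"),
    (EX + "cats", SKOS + "inScheme", EX + "scheme"),
    (EX + "cats", SKOS + "broader", EX + "mammals"),
}


def objects(subject, predicate):
    """Return all objects stored for a given subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def broader_labels(concept):
    """Preferred labels of the concepts directly broader than `concept`."""
    labels = []
    for parent in objects(concept, SKOS + "broader"):
        labels.extend(objects(parent, SKOS + "prefLabel"))
    return labels


print(broader_labels(EX + "cats"))  # ['Mammals']
```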
Wed, 30 Nov 2005 15:13:00 +0100
“How long will it take you to recover? Will you spend hours finding your way through log files and other IT data?
Splunk is the new way to see inside the data center. It’s search software that indexes all your fast moving IT data as it happens.”
Wed, 30 Nov 2005 12:35:00 +0100
2005-11-29
Chris Shiflett - The Truth about Sessions:
“This
article introduces some techniques that can reliably provide
statefulness as well as defend against session-based attacks such as
impersonation (session hijacking).”
Tue, 29 Nov 2005 13:49:00 +0100
“Prince
is a computer program that converts XML into PDF documents. Prince can
read many XML formats, including XHTML and SVG. Prince formats
documents according to style sheets written in CSS.”
Tue, 29 Nov 2005 11:46:00 +0100
2005-11-25
“Alfresco
is an open source, open-standards content repository built by the most
experienced content management team that includes the co-founder of
Documentum.”
(Take a look at the tour and the architecture diagram.)
Fri, 25 Nov 2005 12:18:00 +0100
2005-11-24
John Lim - The fine art of programming:
“Here’s a list of excellent online programming guides that I am compiling.”
Thu, 24 Nov 2005 12:12:00 +0100
2005-11-23
Derick Rethans’ minutes of the PHP Developers Meeting in Paris November 11th and 12th, 2005, planning for PHP 6 - some interesting excerpts:
1.1 Unicode on/off modes:
“[…] We also discussed whether we should even allow Unicode mode to be
turned off as current micro benchmarks show that the Unicode
implementations of some of the string functions are up to 300% slower,
and whole applications up to 25% slower. Disallowing Unicode mode to be
turned off is expected to slow down the adoption of PHP 6 too as many
ISPs would be reluctant to install a version that immediately slows
down the applications of their users.”
2.1 register_globals: “[…] We are going to remove the functionality.”
2.2 magic_quotes: “[…] We remove the magic_quotes feature from PHP.”
3.5 Fileinfo extension in the distribution:
“[…] The mime_magic extension doesn’t work very well, and there is an
extension in PECL (Fileinfo). We suggest to include this extension into
the core, and enable it by default as MIME-type detection is something
that most web applications need.”
4.5 Cleanup for {} vs. []:
“[…] 1. We will undeprecate [] for accessing characters in strings. 2.
{} will be deprecated in PHP 5.1.0 with an E_STRICT and removed in PHP
6. 3. For both strings and arrays, the [] operator will support
substr()/array_slice() functionality.”
6.1 Add an opcode cache to the distribution (APC): “[…] 1. We include APC in the core distributions 2. APC will not be turned on by default.”
6.6 E_STRICT on by default:
“[…] As we want to expose the language level warnings a bit more, and
because of having all error levels in E_ALL, except E_STRICT is
confusing we will be adding E_STRICT to E_ALL. As the current default
is E_ALL & ~E_NOTICE we will effectively turn on E_STRICT by
default.”
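The arithmetic behind item 6.6 can be sketched with plain bitmasks. This is an illustration in Python, not PHP code; the numeric values are the ones PHP 5.0/5.1 assigned to these constants (an assumption worth checking against the manual for other versions), and default_level and reports are hypothetical helper names.

```python
# Error-level constants with the values PHP 5.0/5.1 used (assumed here).
E_NOTICE = 8
E_STRICT = 2048
E_ALL_OLD = 2047  # every error level except E_STRICT

# Proposed change from the minutes: fold E_STRICT into E_ALL.
E_ALL_NEW = E_ALL_OLD | E_STRICT


def default_level(e_all):
    """The default setting, E_ALL & ~E_NOTICE, for a given E_ALL value."""
    return e_all & ~E_NOTICE


def reports(level, flag):
    """True if `flag` is switched on in the bitmask `level`."""
    return (level & flag) != 0


print(reports(default_level(E_ALL_OLD), E_STRICT))  # False
print(reports(default_level(E_ALL_NEW), E_STRICT))  # True
print(reports(default_level(E_ALL_NEW), E_NOTICE))  # False
```

With the old E_ALL the default mask leaves E_STRICT off; once E_STRICT is part of E_ALL, the unchanged default expression turns it on, which is the "effectively on by default" in the excerpt.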
Wed, 23 Nov 2005 11:46:00 +0100
Matt Rand at Forbes.com - Open Source Invades the Enterprise:
“Pfizer
recently embarked on a $25,000 pilot program, where it set up an
open-source LAMP architecture next to BEA’s Weblogic J2EE software. It
used each software framework to build an application that pulled data
from an Oracle database. What surprised Pfizer’s Martin Brodbeck, the
director of architecture for the company’s Global Pharmaceutical Group,
was that the LAMP software cut development time dramatically over
J2EE.”
Wed, 23 Nov 2005 10:27:00 +0100
2005-11-22
Michael Nash at developer.com - Decoupling Application Logic, Persistence, and Flow: The Model Technique:
“The
next step beyond separation of persistence and business logic can be
the separation of the application control flow. Business logic classes
in this case are written in such a way that they are unaware of how
they were called, or what business logic element will be called next.
They still require certain inputs, of course, and produce results in
some fashion (again, often using the bean pattern to allow result
properties to be accessed), but they are a single link in a chain. Some
external mechanism is used to control application flow, either another
class, or a driver class that reads the sequencing and navigation
information from configuration.
[…] Long-time users of
Unix-style operating systems will be familiar with the pattern
described here, as it is a lot like the Unix command philosophy: Keep
each command simple, make it do one thing and do it well, and provide a
powerful means to assemble multiple commands into complex applications.
In the Unix world, this is achieved by shell scripts and the pipeline
technique. The same ideas can be applied to Java applications, with
similar powerful results.
Many developers, of course, will
recognize this technique as the beginnings of a full workflow pattern,
where application logic steps can be combined in sequences or “flows”
as required, and where the decisions at each step as to what the next
step should be (or what the choices for next steps are, if there are
several), are in fact left up to the workflow engine, driven by a
sophisticated configuration file. This configuration file can even in
many cases be created graphically, allowing a developer to literally
draw the sequence of operations of the application required, drawing
more and more from a pool of re-usable business logic components, and
inventing each individual wheel only once, instead of repeatedly.”
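A rough sketch of the idea in the excerpt, written in Python rather than the article's Java. The step names, the order data and the list-based "configuration" are all hypothetical; the article's own design may differ. Each step only reads and returns a context, and a separate driver decides the sequence.

```python
# Each step is a plain function: it takes a context dict and returns an
# updated copy. No step knows who called it or which step runs next.
def load_order(ctx):
    return {**ctx, "items": [("book", 12.0), ("pen", 3.0)]}


def compute_total(ctx):
    return {**ctx, "total": sum(price for _, price in ctx["items"])}


def apply_discount(ctx):
    return {**ctx, "total": ctx["total"] * (1 - ctx.get("discount", 0.0))}


# Registry of reusable business-logic steps, looked up by name.
STEPS = {
    "load_order": load_order,
    "compute_total": compute_total,
    "apply_discount": apply_discount,
}

# Stand-ins for configuration: the sequencing lives here, not in the steps.
CHECKOUT_FLOW = ["load_order", "compute_total", "apply_discount"]
QUOTE_FLOW = ["load_order", "compute_total"]


def run_flow(flow, ctx):
    """Driver: run the named steps in order, much like a shell pipeline."""
    for name in flow:
        ctx = STEPS[name](ctx)
    return ctx


print(run_flow(CHECKOUT_FLOW, {"discount": 0.25})["total"])  # 11.25
print(run_flow(QUOTE_FLOW, {"discount": 0.25})["total"])     # 15.0
```

Reordering or dropping steps means editing the flow list only, which is the point the excerpt makes about leaving sequencing to an external mechanism.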
Tue, 22 Nov 2005 16:57:00 +0100
2005-11-18
Kevin Farnham at ONLamp.com - Analyzing Statistics with GNU R:
“Even for people who aren’t expert statisticians, the power of R
is alluring. Working interactively or using an R script, with just a
few lines of code a user can perform complex analyses of large data
sets, produce graphics depicting the features and structure of the
data, and perform statistical analyses that can quickly answer
questions about the data. This article introduces R and demonstrates a
small slice of its capabilities, using data from the stock market and
real estate industry as input.”
Fri, 18 Nov 2005 11:21:00 +0100
2005-11-17
Simon Willison - Google Base is interesting:
“Base
is a very interesting product for a whole bunch of reasons. The data
model is surprisingly simple on the surface: all items have a title,
description, (optional) external URL, a “type” and a set of labels
(a.k.a. tags) and “attributes”. Attributes are something for tag
enthusiasts to get excited by - they’re name/value pairs that are kind
of like tags in that you can apply them to anything, but more
structured and with a greater level of implied meaning.
[…]
There’s definitely a trend towards this kind of loose data model at the
moment. JotSpot allows all pages within a wiki to have as many extra
name/value attribute pairs as you like (even the wiki body itself is
internally implemented as a special attribute), and Ning works along
similar lines.”
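The loose data model described above could be sketched like this. It is only an illustration of "fixed core fields plus free-form labels and name/value attributes"; the Item class, the sample listings and with_attribute are invented here and have nothing to do with Google's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Item:
    """Hypothetical item: a few fixed fields plus open-ended extras."""
    title: str
    description: str
    item_type: str
    url: Optional[str] = None                                  # optional external URL
    labels: Set[str] = field(default_factory=set)              # a.k.a. tags
    attributes: Dict[str, str] = field(default_factory=dict)   # name/value pairs


def with_attribute(items: List[Item], name: str, value: str) -> List[str]:
    """Titles of items whose attribute `name` equals `value`."""
    return [i.title for i in items if i.attributes.get(name) == value]


items = [
    Item("Used bike", "Red road bike", "product",
         labels={"bike", "sale"},
         attributes={"colour": "red", "price": "120"}),
    Item("Pancakes", "Sunday recipe", "recipe",
         labels={"breakfast"},
         attributes={"serves": "4"}),
    Item("Fire truck toy", "Wooden toy", "product",
         attributes={"colour": "red"}),
]

print(with_attribute(items, "colour", "red"))  # ['Used bike', 'Fire truck toy']
```

Unlike a plain tag, an attribute carries a name as well as a value, so the same query can distinguish colour "red" from a title that merely mentions red.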
Thu, 17 Nov 2005 13:19:00 +0100
“Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design.
Developed
and used over the past two years by a fast-moving online-news
operation, Django was designed from scratch to handle two challenges:
the intensive deadlines of a newsroom and the stringent requirements of
experienced Web developers. It has convenient niceties for developing
content-management systems, but it’s an excellent tool for building any
Web site.
Django focuses on automating as much as possible and adhering to the DRY principle.”
Thu, 17 Nov 2005 13:16:00 +0100
Ben Meyer - Grandma’s don’t use computers:
“So
who should software developers target for their ease of use testing?
Who won’t put up with software that is hard to use? Middle-aged
men that have kids, commonly referred to as “dads”. They are young
enough to know about the latest technology, old enough to have money to
buy them, but don’t have time for applications that don’t just work.
Because
there are kids in the house dads have very little time. They don’t have
time to try out every single option in a program or tweak their system
like a twenty-two-year-old college student can. They just want things
to work on the first try.”
Thu, 17 Nov 2005 10:14:00 +0100
2005-11-16
“Dabble
combines the best of group spreadsheets, custom databases, and intranet
web applications into a new way to manage and share your information
online.”
Wed, 16 Nov 2005 23:38:00 +0100
“Google Base
is a place where you can easily submit all types of online and offline
content that we’ll host and make searchable online. You can describe
any item you post with attributes, which will help people find it when
they search Google Base. In fact, based on the relevance of your items,
they may also be included in the main Google search index and other
Google products like Froogle, Google Base and Google Local.”
Wed, 16 Nov 2005 11:29:00 +0100
2005-11-08
Richard Davey - The Good, the Bad and the Ugly:
“The
ease of developing with PHP has led to the creation of this script
gold mine, and while it can be a wonder to explore there are many
factors you should take into consideration before going on a
downloading frenzy.
Will the script you are about to install
bring your server to a halt? Does it open up glaring security issues?
Is there a mess of complex and poorly written code behind the sleek
HTML exterior? In this article we get our pick axes ready, delve deep
and bring back examples that serve one purpose: to show you what to
look out for in other people’s code. By looking at good, bad and just
downright ugly snippets of code you can gain a far better understanding
of the overall quality of a PHP script.”
Tue, 08 Nov 2005 23:59:00 +0100
2005-11-07
“Wikidata
is a proposed wiki-like database for various types of content. This
project as proposed here requires significant changes to the software
(or possibly a completely new software) but has the potential to
centrally store and manage data from all Wikimedia projects, and to
radically expand the range of content that can be built using wiki
principles.”
Mon, 07 Nov 2005 22:30:00 +0100
2005-11-05
chromatic - Maybe it’s Not Just Ruby on Rails:
“In
my mind, the issue isn’t “Ruby on Rails is more flexible and capable
than standard J2EE or .NET for any project under a (very high)
threshold of complexity”. The real point is that the simplicity,
flexibility, and abstraction possibilities offered by dynamic languages
and well-designed libraries – as well as a talent for exploiting
radical simplicity, extracting commonalities from actual working code,
and knowing when too much flexibility makes you less agile – offer a
huge advantage over languages and libraries and frameworks and
platforms that assume you need a lot of hand-holding to solve a really
hard problem.
Yes, Ruby on Rails does what it does very well.
It’s not the only thing that does, though. I wonder perhaps if some of
the buzz and glow is that it’s new and shiny (in comparison), so that
people haven’t already formed their own opinions about it.”
Sat, 05 Nov 2005 23:55:00 +0100
2005-11-04
Paul Prescod - Reinventing Email using REST:
“As
an educational tool, this article will describe how to re-engineer a
familiar application, email, as a Web Service using HTTP and the
principles of Web Architecture and REpresentational State Transfer.”
Fri, 04 Nov 2005 14:08:00 +0100
“Selenium
is a test tool for web applications. Selenium tests run directly in a
browser, just as real users do. And they run in Internet Explorer,
Mozilla and Firefox on Windows, Linux and Macintosh. […] Installed with
your application webserver, Selenium automatically deploys its
JavaScript automation engine – the Browser Bot – to your browser when
you point it at the Selenium install point on your webserver.”
Fri, 04 Nov 2005 11:01:00 +0100
Tim Bray - Wikipedia Notes:
“As
of today, there are 124 servers, fairly heterogeneous, although these
days they’ve pretty well standardized on dual-Opteron boxes. The
MediaWiki software is PHP-based, mostly running on Fedora; I wonder if
this is the world’s largest-scale PHP deployment, or would Yahoo top
that? They get a pretty good hit rate on their Squid caches; basically,
the whole system scales about linearly with the number of servers they
deploy.
[…] Having said that, speaking both as a user and
contributor, I find that Wikipedia’s performance is mostly pretty
terrible; usable, but irritatingly slow. So there’s certainly room for
improvement.”
Fri, 04 Nov 2005 10:36:00 +0100
2005-11-03
Oliver Rist at InfoWorld - Preview: Windows Workflow Foundation:
“WWF
creates a class of application that is rarely seen except when created
through extraordinary effort: A distributed user-facing application.
From
a developer’s perspective, WWF is a toolbox of abstractions for
workflow-related activities such as receiving and sending Web services
calls, taking conditional branches from an otherwise sequential
workflow during the course of its execution, firing and sinking
(receiving) asynchronous events and managing nested workflows. WWF
abstracts and extends familiar paradigms in ways that change how
developers think.”
Thu, 03 Nov 2005 16:49:00 +0100
John Allspaw (Flickr) has a nice set of presentation slides titled Hardware Layouts for LAMP Installations [PowerPoint], covering hardware requirements, MySQL load balancing and caching for large-scale LAMP installations.
Thu, 03 Nov 2005 13:55:00 +0100
2005-11-02
Harry Fuecks at SitePoint - Oracle 10g XE and PHP:
“In
case you missed it, yesterday Oracle announced a free (as in beer)
version of their database - Oracle 10g Express Edition (XE) - basically
a ‘lite’ version - some industry analysis here. Significance of this
move aside, more interesting is having a play. Managed to get the
equivalent of a ‘Hello World’ from PHP to Oracle up in under 1.5 hours
today (ran into a specific glitch that required a re-install otherwise
would have been less time). Here’s how…”
Wed, 02 Nov 2005 11:07:00 +0100
Karl Fogel: “Producing Open Source Software
is a book about the human side of open source development. It describes
how successful projects operate, the expectations of users and
developers, and the culture of free software.
Producing Open Source Software is available in bookstores, and you can browse or download it here.”
Wed, 02 Nov 2005 10:58:00 +0100
2005-11-01
Adam Bosworth at ACM Queue - Learning from THE WEB:
“Successful
systems on the Web are bottom-up. They don’t mandate much in a top-down
way. Instead, they control themselves through tipping points. For
example, Flickr doesn’t tell its users what tags to use for photos. Far
from it. Any user can tag any photo with anything (well, I don’t think
you can use spaces). But, and this is a key but, Flickr does provide
feedback about the most popular tags, and people seeking attention for
their photos, or photos that they like, quickly learn to use that
lexicon if it makes sense. It turns out to be amazingly stable.
[…]
It is time that the database vendors stepped up to the plate and
started to support a native RSS 2.0/Atom protocol and wire format; a
simple way to ask very general queries; a way to model data that
encompasses trees and arbitrary graphs in ways that humans think about
them; far more fluid schemas that don’t require complex joins to model
variations on a theme about anything from products to people to places;
and built-in linear scaling so that the database salespeople can tell
their customers, in good conscience, for this class of queries you can
scale arbitrarily with regard to throughput and extremely well even
with regard to latency, as long as you limit yourself to the following
types of queries. Then we will know that the database vendors have
joined the 21st century.”
Tue, 01 Nov 2005 22:10:00 +0100