Tim’s Weblog
Tim Strehle’s links and thoughts on Web apps, managing software development and Digital Asset Management, since 2002.

On the Goodness of Unicode

Tim Bray, On the Goodness of Unicode:

  • "Embrace Unicode, don't fight it; it's probably the right thing to do, and if it weren't you'd probably have to anyhow.
  • Inside your software, store text as UTF-8 or UTF-16; that is to say, pick one of the two and stick with it.
  • Interchange data with the outside world using XML whenever possible; this makes a whole bunch of potential problems go away.
  • Try to make your application browser-based rather than write your own client; the browsers are getting really quite good at dealing with the texts of the world.
  • If you're using someone else's library code (and of course you are), assume its Unicode handling is broken until proved to be correct.
  • If you're doing search, try to hand the linguistic and character-handling problems off to someone who understands them."
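
To make the "pick one and stick with it" advice concrete, here is a minimal sketch of my own (not from Tim Bray's article): decode incoming bytes once at the boundary, work with Unicode text inside the program, and encode to a single encoding on the way out. The function names `to_internal` and `to_external` and the choice of UTF-8 for output are assumptions made for illustration. Python's `str` hides its internal representation, so here the one-encoding decision shows up at the output boundary.

```python
def to_internal(raw: bytes, declared_encoding: str = "utf-8") -> str:
    """Decode incoming bytes once, at the boundary (hypothetical helper)."""
    # Raises UnicodeDecodeError if the bytes do not match the declared encoding.
    return raw.decode(declared_encoding)


def to_external(text: str) -> bytes:
    """Encode outgoing text, always as UTF-8 (hypothetical helper)."""
    return text.encode("utf-8")


# Simulate input that arrives in a legacy encoding (Latin-1).
latin1_input = "Grüße".encode("latin-1")

# Decode once on the way in, encode once on the way out.
text = to_internal(latin1_input, "latin-1")
utf8_output = to_external(text)
```

The same text takes 5 bytes in Latin-1 and 7 bytes in UTF-8, which is why code that treats bytes and characters as interchangeable tends to break on anything outside ASCII.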
Tue, 14 Oct 2003 10:42:45 +0000