UTF8 locale MFC for DragonflyBSD

Joerg Sonnenberger joerg at britannica.bec.de
Mon Mar 29 08:48:30 PST 2004


On Mon, Mar 29, 2004 at 11:09:08AM -0500, Dave Cuthbert wrote:
> My personal opinion: UCS-4 wastes a lot of space given that Unicode 3.1 
> is a ~21-bit set and nobody is really using the >=U+10000 space in a 
> practical manner (yet?).  But if you need to have a one-to-one mapping, 
> you don't have much choice.

IIRC there are already some scripts outside the base plane. Anyway, nobody
forces you to encode anything in UCS-4; use UTF-8 or UTF-16 for that. But
if you need to have a hard-wired assumption about the size of a "character",
4 bytes is much more reasonable than 2 bytes in the mid to long term.
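
To make the size argument concrete, here is a rough sketch in Python (not
part of the locale code under discussion) that counts how many bytes one
code point needs in each encoding. The helper name encoded_sizes and the
sample characters are only illustrative assumptions, and UTF-32 stands in
for UCS-4 here.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes one code point needs in each encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        # The -be variants are used so that no byte order mark is counted.
        "utf-16": len(ch.encode("utf-16-be")),
        "ucs-4": len(ch.encode("utf-32-be")),
    }


# Illustrative samples: ASCII, a base plane character (euro sign),
# and a Gothic letter at U+10330, which lies outside the base plane.
samples = ["A", "\u20ac", "\U00010330"]

for ch in samples:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

The point of interest is the last sample: in UTF-16 it takes two 16-bit
units (a surrogate pair), so a fixed 2-byte "character" cannot hold it,
while a fixed 4-byte one can.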

Joerg

> 
> Unless you have a machine which uses 21-bit bytes, of course. ;-)




