UTF-8 and local charsets in the VFS layer

Joerg Sonnenberger joerg at britannica.bec.de
Fri Mar 12 08:36:39 PST 2004


Hi all,
what is the attitude of the DragonFly community towards choosing UTF-8
as the output character set for all those filesystems which can support
more than one?

We are currently in a situation where the kernel doesn't care about the
character set of local filesystems, and has (or could have) some
conversion hooks for removable and remote filesystems. E.g. ISO-9660 with
the Joliet or RR extension can be the classical ISO-8859-1/15 most of us
are supposedly using, UTF-8 / UCS-2, or even other character sets. The
situation is somewhat different for UDF, which I have a working version
of. UDF natively provides UCS-2 (16 bit Unicode), which needs to be
converted anyway.
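
To make the conversion step concrete, here is a rough sketch of what
turning UCS-2 into UTF-8 involves. The function name ucs2_to_utf8 and its
interface are made up for illustration and are not an existing kernel
routine. It assumes the code units are already in host byte order, and
replacing stray surrogate values with U+FFFD is my choice for the sketch,
not something the filesystem formats prescribe.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not an existing kernel function): convert n UCS-2
 * code units, assumed to be in host byte order already, to a
 * NUL-terminated UTF-8 string.  Returns the number of bytes written, not
 * counting the NUL, or (size_t)-1 if dst is too small.
 */
size_t
ucs2_to_utf8(const uint16_t *src, size_t n, char *dst, size_t dstlen)
{
    size_t i, need, o = 0;
    uint16_t c;

    if (dstlen == 0)
        return ((size_t)-1);

    for (i = 0; i < n; i++) {
        c = src[i];
        /* UCS-2 has no surrogate pairs; map stray ones to U+FFFD. */
        if (c >= 0xD800 && c <= 0xDFFF)
            c = 0xFFFD;

        need = (c < 0x80) ? 1 : (c < 0x800) ? 2 : 3;
        /* Keep one byte in reserve for the terminating NUL. */
        if (need + 1 > dstlen - o)
            return ((size_t)-1);

        if (need == 1) {
            dst[o++] = (char)c;
        } else if (need == 2) {
            dst[o++] = (char)(0xC0 | (c >> 6));
            dst[o++] = (char)(0x80 | (c & 0x3F));
        } else {
            dst[o++] = (char)(0xE0 | (c >> 12));
            dst[o++] = (char)(0x80 | ((c >> 6) & 0x3F));
            dst[o++] = (char)(0x80 | (c & 0x3F));
        }
    }
    dst[o] = '\0';
    return (o);
}
```

A 16 bit code unit never needs more than three UTF-8 bytes, so a caller
can size the output buffer as 3 * n + 1 bytes and skip the error path.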

For the normal English speaking community, nothing will change. Having
UTF-8 as a _semantic_ default doesn't increase the work for the
kernel in the near future, but it can make decisions for filesystems
much more coherent. Choosing UTF-8 over other representations of Unicode
has the advantage of being 100% API compatible; the worst situation that
can happen is an application showing garbage filenames. But that can
happen now too :)
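
The API compatibility comes from the fact that every byte of a multi-byte
UTF-8 sequence is 0x80 or above, so the NUL and '/' bytes only ever show up
as those characters themselves. Raw UCS-2 does not have that property
(U+012F contains the byte 0x2F, and every ASCII character carries a 0x00
byte). A small sketch, with a made-up function name, of bytewise path code
that therefore keeps working on UTF-8 names without any changes:

```c
#include <stddef.h>

/*
 * Hypothetical example: return a pointer to the last component of a
 * path by scanning bytes for '/'.  It knows nothing about UTF-8, but
 * cannot split a multi-byte character, because the bytes of such a
 * character are all >= 0x80 and never equal to '/' or NUL.
 */
const char *
last_component(const char *path)
{
    const char *p, *last = path;

    for (p = path; *p != '\0'; p++) {
        if (*p == '/')
            last = p + 1;
    }
    return (last);
}
```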

To give some hints about the others: Windows NT+ has native Unicode
support, MacOS X has it, and Plan 9 (hehe) has it too.

Joerg