Confusion over encodings, utf-8, etc.

Jonas Trollvik jontro at gmail.com
Wed Aug 30 04:42:28 PDT 2006


I'm having trouble with NFS charset conversion too.
Files with Swedish characters don't show up correctly on my MacBook
when mounting over NFS. I've set up nfsd as explained in the FreeBSD
manuals. Is there a way to set the charset conversion?
//Jonas

On 8/30/06, Bill Hacker <wbh at xxxxxxxxxxxxx> wrote:
Chris Csanady wrote:

> On 8/29/06, Jonathon McKitrick <jcm at xxxxxxxxxxxxxxxxx> wrote:
>
>>
>> I've entered the world of unicode with my mac, and I'd like to make things
>> consistent across my machines.
>
>
> Are you interested in utf-8 for filesystem encoding as well?  This is
> where utf-8 becomes painful.
Yes - it almost justifies Linux.  But not quite. D'ruther have file loss...

>  It turns out that there are several normalization forms for it, where
> Windows uses NFC and MacOS uses NFD.  The difference is that identical
> characters can be stored in different ways.  For example, for a character
> with an umlaut, the umlaut can optionally be stored as a separate
> combining character.  See
> http://www.unicode.org/reports/tr15/index.html for more information.
>
ACK.
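
The NFC/NFD difference described above can be made concrete with a short sketch. This is only an illustration using Python's standard `unicodedata` module; the filename and the helper `same_name` are made up for the example and do not come from anyone's actual setup.

```python
import unicodedata

# Hypothetical filename with Swedish characters, written with escapes so the
# source file's own encoding does not matter.
name_nfc = unicodedata.normalize("NFC", "r\u00e4ksm\u00f6rg\u00e5s.txt")
name_nfd = unicodedata.normalize("NFD", name_nfc)


def same_name(a, b):
    """Compare two names after bringing both to the same normalization form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The two strings render identically but are different code point sequences,
# so a byte-wise filename comparison (as most Unix filesystems do) fails.
print(name_nfc == name_nfd)
print(name_nfc.encode("utf-8"))
print(name_nfd.encode("utf-8"))
print(same_name(name_nfc, name_nfd))
```

In NFD each of the three accented letters becomes a base letter followed by a combining mark, which is why a file written under one form may not be found when looked up under the other.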

> Now, the mac expects NFD, and the Finder is completely incapable of
> dealing with NFC.
Doesn't seem to be a 'deadly' problem on UFS mounts. Finder handles, for example,
Chinese filenames in UTF-8 and displays the correct Chinese characters.
(I haven't had an HFS or HFS+ mount for years).
The 'terminal' CLI shows them, OTOH, as escaped numeric sequences, as does my
editor-of-choice (joe).
My current workaround is to use 'TextEdit' - set to UTF-8, to work with such
files, ELSE a WP (SO/OO, NeoOffice).
> Trying to move those files around over NFS can result
> in them simply disappearing!  Likewise, NFD encoded files visible over
> SMB on a Windows machine are not displayed properly.
>
Never saw the usefulness of NFS.  The only fs I use besides UFS is
HPFS/HPFS-386, so SMB is the better tool for me, as the OS/2 and MSDOS NFS is
even more broken than normal...
> Unfortunately, the only portable option is to use SMB exclusively to export
> NFC encoded files from a unix box.  In this case, the SMB client on the mac
> handles the renormalization.  If you use NFS exclusively with a mac, note
> that some non-native applications like Azureus will still write NFC encoded
> filenames which must be converted manually.
>
ACK, but not an issue here.

> If you have lots of files to translate, there is a convenient tool
> available in pkgsrc: converters/convmv.  It allows you to recursively
> renormalize or transcode a set of files from/to anything supported by
> iconv.  It needs to be used over a fs which does not mangle characters
> though, so something like NFS or a local filesystem.
>
> Chris
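
For readers without convmv at hand, the recursive renormalization it performs can be approximated in a few lines. The following is a minimal sketch, not convmv itself: the function names `plan_renames` and `renormalize_tree` are invented here, it only handles Unicode normalization (no transcoding between charsets), and it assumes the underlying filesystem stores names byte-for-byte.

```python
import os
import unicodedata


def plan_renames(names, form="NFC"):
    """Return (old, new) pairs for the names that change under `form`."""
    pairs = []
    for old in names:
        new = unicodedata.normalize(form, old)
        if new != old:
            pairs.append((old, new))
    return pairs


def renormalize_tree(root, form="NFC", dry_run=True):
    """Walk `root` bottom-up and rename entries not already in `form`.

    With dry_run=True (the default) nothing is renamed; the planned
    (source, destination) paths are only returned for inspection.
    """
    planned = []
    # Bottom-up, so a directory is renamed only after its contents.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for old, new in plan_renames(dirnames + filenames, form):
            src = os.path.join(dirpath, old)
            dst = os.path.join(dirpath, new)
            if os.path.lexists(dst):
                continue  # never overwrite an existing entry
            if not dry_run:
                os.rename(src, dst)
            planned.append((src, dst))
    return planned
```

A cautious way to use it is to call `renormalize_tree(path)` first, review the returned list, and only then rerun with `dry_run=False`.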
Well - we all hope the DFLY project will help make some of these issues less
painful.  On OS X, perhaps ZFS, TVFS, or AFS could provide some relief? We have
begun revisiting certain of the 'Plan 9' features, as it runs well as a guest on
OS X.
Meanwhile we 'share' files when we must via restricted https, rsync, scp, etc.

'Sharity Lite' has helped.  Seems to be SMBFS at one end, NFS at the other,
though it did mangle timestamps when backing up FAT and NTFS Winboxen to a UFS
mount. Mounting the storage partition as MSDOS might have been all we needed to
do to get around that, though hardly a universal solution.
With rsync and scp, binary is binary, so the problem has to do only with
filenames for us. Hong Kong, our HQ, is of course, very much prone to Chinese
encoding - unfortunately seldom in UTF-8 anyway.
YMMV,

Bill






