Confusion over encodings, utf-8, etc.

Bill Hacker wbh at conducive.org
Wed Aug 30 01:12:49 PDT 2006


Jonathon McKitrick wrote:

I've entered the world of unicode with my mac, and I'd like to make things
consistent across my machines.
I naïvely thought utf-8 would solve everything.  But I find a lot of web pages
and people's names only show correctly in iso-8859-1.  I thought utf-8 covered
them all?
Jonathon McKitrick
--
My other computer is your Windows box.

Similar environment here. A couple of gotchas:

- Mac's own Unicode has a couple of broken codes - legacy pre-OS X control-key
compatibility, IIRC. Best to use real UTF-8.

- Websites are *supposed to* call out their encoding, but not all do so 
correctly, consistently, or even at all. Nor are all available pages the same.

One that DOES ('coz I wrote it that way...) is here:

http://precisa.ch/Precisa/vert1_srch?Zone=as&SUBMIT=Submit+Query

Your page should show several dialects of Chinese, Korean, and Thai, as well as 
Western languages.
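
For anyone who wants to check what a given page actually declares, here is a
rough sketch in Python. It is not how any particular browser does it; the
function names are made up for illustration, and the meta-tag regex is a
simplification that only looks at the first 1024 bytes of the markup.

```python
import re
from email.message import Message

def charset_from_header(content_type):
    """Return the charset parameter of a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lower-cased; None when no charset is given

# Simplified pattern (an assumption, not a full HTML parser): matches both
# <meta charset="..."> and <meta http-equiv=... content="...; charset=...">.
META_RE = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE
)

def charset_from_meta(html_bytes):
    """Look for a charset declaration near the top of the raw markup."""
    match = META_RE.search(html_bytes[:1024])
    return match.group(1).decode("ascii").lower() if match else None

def declared_charset(content_type, html_bytes):
    """Prefer the HTTP header, then fall back to the meta tag."""
    return charset_from_header(content_type) or charset_from_meta(html_bytes)
```

If both come back as None, the page is one of those that does not call out
its encoding at all, and the browser is left to guess.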

- Browsers have their own ways.

Taking Mozilla/Firefox as an example, rather than just setting UTF-8 as the 
default, you may need to list it (and others) in the 'auto-detect' options, then 
optionally leave ISO-8859-1 as the default.

I still commonly get allegedly-UTF-8 pages that mis-convert '-' and quotes and
such, but which clean up if ISO-8859-1 is manually selected.

Bad page coding. I just live with it, as it doesn't affect much I can't 
auto-convert well enough with 'wetware'.
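
My guess at what is going on with those pages, sketched in Python: the text
was saved in a single-byte Windows encoding, where the "smart" dashes and
quotes live at bytes 0x91-0x97, and then served with a UTF-8 label. Those bytes
are not valid UTF-8 on their own. The fallback below is an assumption on my
part (cp1252 standing in for what browsers commonly do when ISO-8859-1 is
selected), and decode_lenient is a made-up name, not anyone's real API.

```python
def decode_lenient(raw, declared="utf-8"):
    """Try the declared encoding first; fall back to cp1252 if it fails.

    Assumption: a page that is mislabelled as UTF-8 was most likely written
    in Windows-1252, which is a guess and not a guarantee.
    """
    try:
        return raw.decode(declared)
    except UnicodeDecodeError:
        # errors="replace" covers the few byte values cp1252 leaves undefined
        return raw.decode("cp1252", errors="replace")

# Byte 0x96 is an en dash in cp1252 but a stray continuation byte in UTF-8.
mislabelled = b"1999 \x96 2006"
fixed = decode_lenient(mislabelled)

# Properly encoded UTF-8 passes through untouched.
genuine = decode_lenient("na\u00efve".encode("utf-8"))
```

It only catches pages where the bad bytes make the UTF-8 decode fail outright,
so it is no substitute for the page declaring the right encoding.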

HTH,

Bill