Unicode

Dale Preston

2007-02-12 03:13:00 UTC

Can't Unicode use either little-endian or big-endian?

This is an area where I have no experience to draw on, and it has
definitely been a struggle in my library.

Dale

-----Original Message-----
From: Jud White [mailto:***@cdtag.com]
Sent: Sunday, February 11, 2007 8:32 PM
To: ***@id3.org
Subject: Re: [ID3 Dev] Unicode

Just tested this: if you're writing the BOM reversed (it should be
0xFF 0xFE for little-endian data), you'll get East Asian characters
in iTunes.
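
A minimal sketch of why a reversed BOM produces East Asian glyphs,
assuming the string is written little-endian but prefixed with the
big-endian BOM (0xFE 0xFF). The helper decode_with_bom is hypothetical
and only stands in for a reader that trusts the BOM; it is not taken
from iTunes or any tagging library.

```python
text = "Er\u00e9t"  # "Erét"

# Little-endian UTF-16 payload: 45 00 72 00 E9 00 74 00
payload = text.encode("utf-16-le")

# The same payload, prefixed with the wrong (big-endian) BOM.
mislabelled = b"\xfe\xff" + payload


def decode_with_bom(data: bytes) -> str:
    """Hypothetical reader: pick the byte order from the BOM."""
    if data[:2] == b"\xff\xfe":
        return data[2:].decode("utf-16-le")
    if data[:2] == b"\xfe\xff":
        return data[2:].decode("utf-16-be")
    raise ValueError("no BOM found")


# With the correct BOM the text round-trips.
correct = decode_with_bom(b"\xff\xfe" + payload)

# With the reversed BOM every byte pair is swapped, so 0x0045 ("E")
# becomes U+4500, which lies in a CJK ideograph block.
garbled = decode_with_bom(mislabelled)

print([f"U+{ord(c):04X}" for c in garbled])
# ['U+4500', 'U+7200', 'U+E900', 'U+7400']
```

Three of the four resulting code points are CJK ideographs and the
fourth (U+E900) is in the Private Use Area, which matches the symptom
described.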

Mark,
This isn't a UCS-2 vs UTF-16 issue. The two only differ for code
points above 0xFFFF. It's also not a BOM issue, since iTunes can
cope without a BOM.
I was able to reproduce this behavior by writing a text encoding byte
of "Unicode" (0x01) but writing the actual string in UTF-8. Maybe
your implementation is doing something similar?
-Jud
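
The mismatch described above can be sketched as follows. This is an
illustration only: it assumes the reader interprets the bytes as
UTF-16 little-endian when no BOM is present, which is a guess about
reader behaviour rather than anything stated in this thread.

```python
text = "Er\u00e9t"  # "Erét"

# The frame claims "Unicode" (0x01) but the string bytes are UTF-8.
encoding_byte = b"\x01"
utf8_bytes = text.encode("utf-8")  # 45 72 C3 A9 74 (5 bytes)
frame_body = encoding_byte + utf8_bytes

# A reader that believes the encoding byte treats the rest as UTF-16.
raw = frame_body[1:]

# UTF-16 needs an even number of bytes; drop the odd trailing byte so
# the decode below cannot fail on truncated input.
even = raw[: len(raw) // 2 * 2]

# Assumption: no BOM present, reader falls back to little-endian.
misread = even.decode("utf-16-le")

print([f"U+{ord(c):04X}" for c in misread])
# ['U+7245', 'U+A9C3']
```

The first pair (0x45 0x72) becomes U+7245, a CJK ideograph, so a short
Latin string again shows up as East Asian text.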

I'm getting a bit exasperated with trying to handle Unicode
correctly. In my library, I'm handling all strings as UTF-8
internally, but since the 2.3 spec (as I've understood it) only
allows for ISO 8859-1 and UCS-2 (for the moment I'm treating UCS-2 as
if it were UTF-16), I'm writing out as UTF-16 where necessary.
What I'm finding is that if I write out a TALB frame as "Erét" (that's
E - r - e with acute accent - t, if your mail client displays
something else) as UTF-16, iTunes and the other two tagging apps I've
checked display it as East Asian characters.
So the question is, am I wrong, or are other people just not
bothering to deal with anything but English?
Any insights gratefully received....
Thanks,
Mark
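
For comparison, a minimal sketch of writing the TALB frame with the
encoding byte, BOM, and payload all agreeing. It assumes the ID3v2.3
frame layout (4-byte frame ID, 4-byte big-endian size, 2 flag bytes)
and text encoding 0x01 with a little-endian BOM. The function name
build_text_frame_v23 is hypothetical, the flags are left at zero, and
no string terminator is written.

```python
import struct


def build_text_frame_v23(frame_id: str, text: str) -> bytes:
    """Hypothetical helper: build one ID3v2.3 text frame as UTF-16."""
    # Encoding byte 0x01, then the BOM FF FE, then a little-endian
    # payload, so the BOM and the byte order match.
    body = b"\x01" + b"\xff\xfe" + text.encode("utf-16-le")

    # In v2.3 the frame size is a plain 32-bit big-endian integer that
    # counts the body only (not the 10-byte header).
    header = frame_id.encode("ascii") + struct.pack(">I", len(body))
    header += b"\x00\x00"  # frame flags, none set
    return header + body


frame = build_text_frame_v23("TALB", "Er\u00e9t")
print(frame.hex(" "))
```

Using the explicit "utf-16-le" codec plus a hand-written BOM avoids
depending on the platform's native byte order, which is what Python's
plain "utf-16" codec uses when encoding.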
---------------------------------------------------------------------
