Discussion:
Which byte order should be used when using a UTF-16 BOM with ID3v2.3?
Paul Taylor
2010-04-09 11:48:57 UTC
Hi, ID3v2.3 doesn't support UTF-8, but it does support UTF-16 with a BOM, i.e. 2
bytes per character, which can be either most significant byte (MSB) first or
least significant byte (LSB) first, as indicated by the BOM, which can be
0xFE 0xFF (big endian) or 0xFF 0xFE (little endian).

The trouble is that 0xFF 0xFE matches the MPEG synchronization pattern, and if
you do unsynchronize the tag, many applications don't understand
unsynchronization. Whereas if you use 0xFE 0xFF you don't need
unsynchronization, but I don't think Windows likes this byte order.
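
To make the collision concrete, here is a minimal Python sketch. The helper name `find_false_sync` is hypothetical (it is not taken from any ID3 library), and it assumes the sync pattern is a 0xFF byte followed by a byte whose top three bits are set, which is how the ID3v2.3 unsynchronisation section describes it.

```python
import codecs

def find_false_sync(data: bytes) -> int:
    """Return the offset of the first 0xFF 0b111xxxxx byte pair, or -1 if none."""
    for i in range(len(data) - 1):
        if data[i] == 0xFF and (data[i + 1] & 0xE0) == 0xE0:
            return i
    return -1

text = "Title"
little = codecs.BOM_UTF16_LE + text.encode("utf-16-le")  # begins FF FE
big = codecs.BOM_UTF16_BE + text.encode("utf-16-be")     # begins FE FF

print(find_false_sync(little))  # 0: the little endian BOM itself looks like a sync
print(find_false_sync(big))     # -1: no false sync in this particular string
```

Note that the big endian BOM only avoids the problem for most text, not all of it: if the first character has a high byte of 0xE0 or above (for example U+FF21), the bytes FE FF FF 21 contain a false sync at offset 1.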

Has anybody else had similar problems and come up with the best-supported
solution?

Thanks, Paul
Mathias Kunter
2010-04-10 10:14:48 UTC
Yes, unsynchronized tags aren't supported very well. However, virtually all software and hardware MP3 players support ID3 version 2 tags today, at least to the extent of skipping them when they're present in an MP3 file. It should therefore rarely be necessary to unsynchronize an ID3 tag at all.

If you need to ensure compatibility with (old) software or hardware MP3 implementations which don't support ID3 version 2 tags and therefore actually scan the tag for an MP3 synchronization pattern, I would avoid using the 0xFF 0xFE byte order mark. You could then either omit the byte order mark altogether and encode the string as big endian (as specified by the Unicode standard), or explicitly use the big endian 0xFE 0xFF byte order mark. Most applications which support UTF-16 should also be able to decode a big endian string!
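
The explicit big endian option could look roughly like the Python sketch below. The function name `encode_text_field` is made up for illustration; the sketch assumes the ID3v2.3 convention that a text field starts with an encoding byte of 0x01 for Unicode with BOM, and it builds only the field content (no frame header, no terminator).

```python
import codecs

UNICODE_WITH_BOM = 0x01  # ID3v2.3 text-encoding byte for Unicode with a BOM

def encode_text_field(text: str, big_endian: bool = True) -> bytes:
    """Build encoding byte + BOM + UTF-16 text for an ID3v2.3 text field."""
    if big_endian:
        body = codecs.BOM_UTF16_BE + text.encode("utf-16-be")  # FE FF ...
    else:
        body = codecs.BOM_UTF16_LE + text.encode("utf-16-le")  # FF FE ...
    return bytes([UNICODE_WITH_BOM]) + body

print(encode_text_field("Hi").hex(" "))                    # 01 fe ff 00 48 00 69
print(encode_text_field("Hi", big_endian=False).hex(" "))  # 01 ff fe 48 00 69 00
```

Keeping the BOM in both branches means the output stays within what ID3v2.3 requires, whichever byte order turns out to be better supported.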

Mathias K.





Paul Taylor
2010-04-10 21:16:03 UTC
Hi Mathias
Post by Mathias Kunter
Yes, unsynchronized tags aren't supported very well. However, virtually
all software and hardware MP3 players support ID3 version 2 tags
today, at least to the extent of skipping them when they're present in
an MP3 file. It should therefore rarely be necessary to unsynchronize an
ID3 tag at all.
If you have an APIC frame, it's very likely to contain bytes that fall foul
of the unsynchronization scheme, and if you don't apply unsynchronization
then that image WILL NOT display correctly in iTunes. So this is one
example where unsynchronization is needed for newer software, not for
the music to play okay, but for the metadata to display okay.
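
For reference, a rough Python sketch of the ID3v2.3 unsynchronisation scheme as I understand it from the spec: a zero byte is inserted after any 0xFF that is followed by a byte of 0xE0 or above, and after any 0xFF followed by 0x00, so that the change can be reversed. The function names are hypothetical, and the special case of a tag ending in 0xFF is deliberately left out.

```python
def unsynchronise(data: bytes) -> bytes:
    """Insert 0x00 after each 0xFF that precedes 0b111xxxxx or 0x00."""
    out = bytearray()
    for i, byte in enumerate(data):
        out.append(byte)
        if byte == 0xFF and i + 1 < len(data):
            nxt = data[i + 1]
            if nxt >= 0xE0 or nxt == 0x00:
                out.append(0x00)
    return bytes(out)

def resynchronise(data: bytes) -> bytes:
    """Reverse unsynchronise(): drop the 0x00 that follows each 0xFF."""
    out = bytearray()
    i = 0
    while i < len(data):
        out.append(data[i])
        if data[i] == 0xFF and i + 1 < len(data) and data[i + 1] == 0x00:
            i += 1  # skip the inserted zero byte
        i += 1
    return bytes(out)

# A typical JFIF header starts FF D8 FF E0, so FF E0 is a false sync.
jpeg_start = b"\xff\xd8\xff\xe0\x00\x10JFIF"
encoded = unsynchronise(jpeg_start)
print(encoded.hex(" "))  # ff d8 ff 00 e0 00 10 4a 46 49 46
print(resynchronise(encoded) == jpeg_start)  # True
```

A reader that ignores the unsynchronisation flag would hand the image decoder the bytes with the extra zero still in place, which fits the kind of broken artwork described above.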
Post by Mathias Kunter
If you need to ensure compatibility with (old) software or hardware
MP3 implementations which don't support ID3 version 2 tags and
therefore actually scan the tag for an MP3 synchronization pattern, I
would avoid using the 0xFF 0xFE byte order mark. You could then either
omit the byte order mark altogether and encode the string as big endian
(as specified by the Unicode standard), or explicitly use the big endian
0xFE 0xFF byte order mark. Most applications which support UTF-16
should also be able to decode a big endian string!
OK, I'll take another look at BE (I thought it caused problems for
Windows, but perhaps my diagnosis was wrong), but I don't think you can
just drop the BOM; that's breaking the ID3 standard.
Paul
Mathias Kunter
2010-04-11 15:53:41 UTC
Post by Paul Taylor
If you have an APIC frame, it's very likely to contain bytes that fall foul
of the unsynchronization scheme, and if you don't apply unsynchronization
then that image WILL NOT display correctly in iTunes. So this is one
example where unsynchronization is needed for newer software, not for
the music to play okay, but for the metadata to display okay.
Yes, it's a pain...
Post by Paul Taylor
OK, I'll take another look at BE (I thought it caused problems for
Windows, but perhaps my diagnosis was wrong), but I don't think you can
just drop the BOM; that's breaking the ID3 standard.

Ah yes, ID3 specifies that a BOM must be present (the ISO specification of UTF-16 doesn't; I remembered incorrectly). Well, Windows Media Player stores strings within ID3 tags as little endian strings (as most Unicode strings on the Windows platform are stored), but I'm not aware of problems caused by big endian strings. However, I haven't tested this with all common versions of Windows Media Player.
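
On the reading side, a decoder that honours the BOM handles both byte orders with very little code, which is roughly what I'd expect a well-behaved application to do. A minimal Python sketch, with the hypothetical name `decode_utf16_field`, taking the field bytes after the encoding byte and rejecting a missing BOM since ID3 requires one:

```python
import codecs

def decode_utf16_field(data: bytes) -> str:
    """Decode BOM-prefixed UTF-16 text, choosing the byte order from the BOM."""
    if data.startswith(codecs.BOM_UTF16_LE):   # FF FE
        return data[2:].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):   # FE FF
        return data[2:].decode("utf-16-be")
    raise ValueError("missing UTF-16 byte order mark")

print(decode_utf16_field(b"\xff\xfeH\x00i\x00"))  # Hi
print(decode_utf16_field(b"\xfe\xff\x00H\x00i"))  # Hi
```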

Mathias




