Discussion:
Non-destructive Improvements to ID3v23
Paul Taylor
2010-04-12 15:26:09 UTC
The following features of ID3v24 could be used in ID3v23 without causing
much of a problem. I'm proposing that we should document ID3 V23.1 with
these additions. At the moment developers and users are stuck: the
common consensus is that applications should still use ID3v23 rather
than ID3v24, but using ID3v23 makes certain things more difficult than
they would be with ID3v24.

1. All text information frames support multiple strings, stored as a
null-separated list, where null is represented by the termination code
for the character encoding.

This would allow things like multiple genres, and apps that didn't
understand multiple values would still continue to read just the first
value.
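
To make the idea concrete, here is a rough Python sketch of how a reader might split such a frame. It only covers the two encodings ID3v23 already has ($00 ISO-8859-1 and $01 Unicode with BOM), and the function names are my own invention for illustration, not taken from any existing library.

```python
from typing import List

# Terminator and Python codec for the two text encodings ID3v23 defines.
TERMINATORS = {0: b"\x00", 1: b"\x00\x00"}
CODECS = {0: "latin-1", 1: "utf-16"}  # "utf-16" honours the BOM when decoding


def split_raw(data: bytes, term: bytes) -> List[bytes]:
    """Split on the terminator, scanning in steps of its length.

    Stepping by 2 for UTF-16 keeps the scan aligned to code units, so the
    high byte of one character plus the low byte of the next is never
    mistaken for a $00 00 terminator.
    """
    step = len(term)
    parts, start = [], 0
    for i in range(0, len(data) - step + 1, step):
        if data[i:i + step] == term:
            parts.append(data[start:i])
            start = i + step
    parts.append(data[start:])
    return parts


def read_text_frame(body: bytes) -> List[str]:
    """Return every value in a text frame body (encoding byte + strings)."""
    enc, data = body[0], body[1:]
    parts = split_raw(data, TERMINATORS[enc])
    # Empty parts come from a trailing terminator; drop them.
    return [p.decode(CODECS[enc]) for p in parts if p]


def read_first_value(body: bytes) -> str:
    """What a reader that stops at the first terminator would see."""
    values = read_text_frame(body)
    return values[0] if values else ""


genres = b"\x00Rock\x00Pop\x00Jazz"
print(read_text_frame(genres))   # ['Rock', 'Pop', 'Jazz']
print(read_first_value(genres))  # Rock
```

The second function is the point of the proposal: a reader that knows nothing about multiple values still ends up with "Rock".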

2. Frames that allow different types of text encoding contain a
text encoding description byte. Possible encodings:

  $00   ISO-8859-1 [ISO-8859-1]. Terminated with $00.
  $01   UTF-16 [UTF-16] encoded Unicode [UNICODE] with BOM. All
        strings in the same frame SHALL have the same byteorder.
        Terminated with $00 00.
  $02   UTF-16BE [UTF-16] encoded Unicode [UNICODE] without BOM.
        Terminated with $00 00.
  $03   UTF-8 [UTF-8] encoded Unicode [UNICODE]. Terminated with $00.

This would allow UTF-8 to be used, which is much less problematic for
applications: there is no BOM to consider, it's more space efficient, and
even apps that cannot decode UTF-8 will display a better
approximation of the right value than apps that cannot read UTF-16.
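
As a rough illustration of that last point, the Python sketch below writes the same title with $01 and with $03, then reads both back with a deliberately naive reader. The naive reader is hypothetical (it ignores the encoding byte, assumes ISO-8859-1 and stops at the first $00), and writing the $01 string as little-endian with a BOM is just my choice to keep the example deterministic.

```python
import codecs

# Text encoding byte -> (Python codec, terminator), following the list above.
ENCODINGS = {
    0x00: ("latin-1", b"\x00"),
    0x01: ("utf-16", b"\x00\x00"),     # BOM decides the byte order
    0x02: ("utf-16-be", b"\x00\x00"),  # no BOM, always big-endian
    0x03: ("utf-8", b"\x00"),
}


def encode_text(value: str, enc: int) -> bytes:
    """Build a frame body: encoding byte, encoded string, terminator."""
    codec, term = ENCODINGS[enc]
    if enc == 0x01:
        # Assumption for this sketch: always write little-endian with a BOM.
        raw = codecs.BOM_UTF16_LE + value.encode("utf-16-le")
    else:
        raw = value.encode(codec)
    return bytes([enc]) + raw + term


def decode_text(body: bytes) -> str:
    """Decode a single-string frame body according to its encoding byte."""
    codec, term = ENCODINGS[body[0]]
    data = body[1:]
    if data.endswith(term):
        data = data[:-len(term)]
    return data.decode(codec)


def naive_latin1_read(body: bytes) -> str:
    """Hypothetical legacy reader: ISO-8859-1 only, stops at the first $00."""
    return body[1:].split(b"\x00", 1)[0].decode("latin-1")


title = "Café del Mar"
print(decode_text(encode_text(title, 0x03)))        # Café del Mar
print(naive_latin1_read(encode_text(title, 0x03)))  # CafÃ© del Mar
print(naive_latin1_read(encode_text(title, 0x01)))  # ÿþC
```

With UTF-8 the naive reader garbles only the accented character, whereas with UTF-16 it shows the BOM and a single letter. Real applications may of course fail in other ways.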


Paul
Paul Taylor
2010-04-13 09:42:15 UTC
Post by Paul Taylor
The following features of ID3v24 could be used in ID3v23 without
causing much of a problem. I'm proposing that we should document ID3
V23.1 with these additions. At the moment developers and users are
stuck: the common consensus is that applications should still
use ID3v23 rather than ID3v24, but using ID3v23 makes certain things
more difficult than they would be with ID3v24.
1. All text information frames support multiple strings, stored as a
null-separated list, where null is represented by the termination code
for the character encoding.
Foobar2000 does this already. You add multiple values to any text field
by separating them with ; but they get written separated by a null
char. By default it writes ID3v24 tags, but if you have
AdvancedTagging/MP3/ID3v2 writer compatibility mode set it writes ID3v23
and still separates values with a null char (as in the v24 spec). The
advantage of this method is that it doesn't break applications that only
understand one value; they just take the first one.
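
For anyone wanting to do the same on the writing side, something along these lines should work. This is only my sketch of the behaviour described above, not Foobar2000's actual code: the function name is made up, trimming the whitespace around each value is my own assumption, and it only handles ISO-8859-1 ($00) to keep it short.

```python
import struct


def build_text_frame_v23(frame_id: str, user_input: str, sep: str = ";") -> bytes:
    """Build an ID3v23 text frame whose values are null-separated.

    Hypothetical helper: splits the user's input on `sep`, trims each
    value, and joins the values with the $00 terminator of ISO-8859-1.
    """
    values = [v.strip() for v in user_input.split(sep) if v.strip()]
    # Encoding byte $00, then the values separated by $00.
    # encode("latin-1") raises UnicodeEncodeError for characters outside
    # ISO-8859-1; a real writer would switch to $01 in that case.
    body = b"\x00" + b"\x00".join(v.encode("latin-1") for v in values)
    # ID3v23 frame header: 4-byte frame ID, 4-byte big-endian size
    # (a plain integer in v23, not synchsafe), 2 bytes of flags.
    header = struct.pack(">4sIH", frame_id.encode("ascii"), len(body), 0)
    return header + body


frame = build_text_frame_v23("TCON", "Rock; Pop")
print(frame[:10])  # b'TCON\x00\x00\x00\t\x00\x00'
print(frame[10:])  # b'\x00Rock\x00Pop'
```

An application that only understands one value reads up to the first $00 in the body and gets "Rock".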

Paul
Peter Bennett
2010-04-18 19:26:52 UTC
I recall that some time back I had some frames encoded with $03 UTF-8 in
ID3v23. Unfortunately, Windows Media Player refused to play the song at
all, with some message about an unsupported file type. This is very
unfriendly of Microsoft; you should be able to play an mp3 even if the
tag is not valid. However, it rather nixes the idea of incorporating
this type of change into an existing standard.

Peter