I have a problem with the “emotes” tag in some messages. In the following message (user info edited out), for example, the emote index range for the Kappa emote is 15-19, but the (Unicode) character at index 15 is the letter “a” (the correct indexes should be 14-18):
@badge-info=;badges=bits/100;color=#FF0000;display-name=<snip>;emotes=25:15-19;flags=3-12:S.5;id=70960b2b-d54c-4389-bb71-ea8f574798aa;mod=0;room-id=55926254;subscriber=0;tmi-sent-ts=1596901739989;turbo=0;user-id=<snip>;user-type= :<snip>!<snip>@<snip>.tmi.twitch.tv PRIVMSG #artgameslp :Ya Rазвратник Kappa
Another example (indexes are 9-11 but should be 8-10):
@badge-info=subscriber/1;badges=subscriber/0,premium/1;color=#1E90FF;display-name=<snip>;emotes=425618:9-11;flags=0-6:S.6;id=1ef5ff6f-d235-47f2-aa1e-b9542b1aeff3;mod=0;room-id=98951626;subscriber=1;tmi-sent-ts=1596917331928;turbo=0;user-id=<snip>;user-type= :<snip>!<snip>@<snip>.tmi.twitch.tv PRIVMSG #nekzerr :turluté LUL
A third example (indexes are 53-63 but should be 52-62):
@badge-info=;badges=;color=;display-name=anahi_2334;emotes=58765:53-63;flags=21-25:P.6/S.6;id=c29fb36b-658b-4974-8206-de50af62b336;mod=0;room-id=175779148;subscriber=0;tmi-sent-ts=1596913673211;turbo=0;user-id=<snip>;user-type= :<snip>!<snip>@<snip>.tmi.twitch.tv PRIVMSG #luisormenoa27 :@zthegoin aya xd, yo pensé que era por la sala bebé NotLikeThis
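For reference, here is a minimal Python sketch of the check described above: it parses the emotes tag value, slices the message text by Unicode code point, and compares the result with where the emote name actually sits. The helper names (parse_emotes, slice_by_code_point, check) are made up for this illustration, and it assumes every “é” in the messages is the precomposed code point U+00E9.

```python
def parse_emotes(tag: str) -> list[tuple[str, int, int]]:
    """Parse an emotes tag value such as '25:15-19' into (emote_id, start, end) tuples."""
    result = []
    if not tag:
        return result
    for group in tag.split("/"):          # one group per emote ID
        emote_id, ranges = group.split(":")
        for r in ranges.split(","):       # one range per occurrence
            start, end = r.split("-")
            result.append((emote_id, int(start), int(end)))
    return result


def slice_by_code_point(text: str, start: int, end: int) -> str:
    # Python str indexes by Unicode code point; the tag's end index is inclusive.
    return text[start:end + 1]


# (emotes tag value, message text, emote name) from the three messages above.
# Assumption: "é" is written as the single code point U+00E9.
EXAMPLES = [
    ("25:15-19", "Ya Rазвратник Kappa", "Kappa"),
    ("425618:9-11", "turlut\u00e9 LUL", "LUL"),
    ("58765:53-63",
     "@zthegoin aya xd, yo pens\u00e9 que era por la sala beb\u00e9 NotLikeThis",
     "NotLikeThis"),
]


def check(tag: str, text: str, name: str) -> tuple[str, int, int]:
    """Return (text at the reported range, actual start, actual end) for the first emote."""
    _, start, end = parse_emotes(tag)[0]
    reported = slice_by_code_point(text, start, end)
    actual_start = text.index(name)
    return reported, actual_start, actual_start + len(name) - 1


for tag, text, name in EXAMPLES:
    print(tag, check(tag, text, name))
```

In all three cases the reported range starts one code point too late, so the slice loses the first letter of the emote name.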
The interesting thing is that this only happens in about one message out of every million.
Also, when I copy-paste those messages into my chat, my program receives a message that is byte-identical (so no UTF-8/UTF-16 conversion issues), but the indexes are correct.
What those three messages have in common is that they have a value for the “flags” attribute. When I copy-paste the message and get the correct indexes, the flags attribute is empty. Unfortunately the flags attribute is not documented anywhere, so I don’t know if my problem is related to it.
I can rule out the following:
- My program counts the characters incorrectly. (My program can extract the emotes just fine for all other messages that contain non-ASCII characters)
- This is an issue caused by combining Unicode characters or Unicode normalization. (The first example contains no characters that are changed by normalization)
- This only happens for messages that have non-ASCII characters followed by an emote at the end. (I have several messages containing <cyrillic text> Kappa where the indexes are correct)
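The first two points can be checked with a short Python sketch along these lines. It takes the text of the first example, written with escapes so the code points are unambiguous, and computes the offset of Kappa in code points, UTF-16 code units, and UTF-8 bytes, then tests whether any of the four normalization forms changes the text. The function names are hypothetical and only serve this illustration.

```python
import unicodedata

# Text of the first example: "Ya R" + Cyrillic "азвратник" + " Kappa".
TEXT = "Ya R\u0430\u0437\u0432\u0440\u0430\u0442\u043d\u0438\u043a Kappa"


def offsets(text: str, needle: str) -> dict[str, int]:
    """Return the start offset of needle under three common counting units."""
    cp = text.index(needle)  # Python counts in Unicode code points
    prefix = text[:cp]
    return {
        "code_points": cp,
        "utf16_units": len(prefix.encode("utf-16-le")) // 2,
        "utf8_bytes": len(prefix.encode("utf-8")),
    }


def unchanged_by_normalization(text: str) -> bool:
    """True if NFC, NFD, NFKC and NFKD all leave the text as it is."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return all(unicodedata.normalize(form, text) == text for form in forms)


print(offsets(TEXT, "Kappa"))
print(unchanged_by_normalization(TEXT))
```

None of the three counting units puts Kappa at offset 15, and normalization leaves the text untouched, which is why I don’t think the discrepancy comes from counting or encoding on my side.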
In the examples I found flags=21-25:P.6/S.6. What does this mean? Is this related to the incorrect emote indexes?
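To see which part of each message the flags value points at, the leading numbers can be read as an inclusive start-end range of code points, the same way as in the emotes tag. That reading is purely an assumption, since the tag is undocumented, and so is the comma-separated entry format; the part after the colon (for example P.6/S.6) is left uninterpreted. A rough Python sketch with a hypothetical helper flagged_spans:

```python
def flagged_spans(flags_tag: str, text: str) -> list[str]:
    """Return the substrings covered by each 'start-end' range in a flags value.

    Assumption: entries are comma-separated and shaped like 'start-end:labels'.
    The labels after the colon are ignored because their meaning is unknown.
    """
    spans = []
    for entry in filter(None, flags_tag.split(",")):
        positions, _, _labels = entry.partition(":")
        start, end = (int(p) for p in positions.split("-"))
        spans.append(text[start:end + 1])  # inclusive end, code-point indexing
    return spans


# (flags value, message text) from the three messages above.
# Assumption: "é" is the single code point U+00E9.
MESSAGES = [
    ("3-12:S.5", "Ya Rазвратник Kappa"),
    ("0-6:S.6", "turlut\u00e9 LUL"),
    ("21-25:P.6/S.6",
     "@zthegoin aya xd, yo pens\u00e9 que era por la sala beb\u00e9 NotLikeThis"),
]

for flags, text in MESSAGES:
    print(flags, flagged_spans(flags, text))
```

Under that assumption, the ranges cover exactly one whole word in each message (the Cyrillic word, “turluté”, and “pensé”), so the flags indexes line up with the received text even though the emote indexes do not.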