Can't calculate offset from the "emotes" tag if the message contains emojis

I send this message

👩‍❤️‍💋‍👩 Kappa

And receive this

@ /* ... */ emotes=25:6-10; /* ... */
:dmitryscaletta!dmitryscaletta@dmitryscaletta.tmi.twitch.tv
PRIVMSG #dmitryscaletta :👩❤️💋👩 Kappa

It says that the emote with id 25 starts at position 6.

I can’t just use length because the length of this string in js is 9.

'👩❤️💋👩 '.length === 9

So I found this solution - https://stackoverflow.com/a/54369738/4687416

It works for most of the cases with emojis but not for this one.

fancyCount('🤔 ') === 2 // ok
fancyCount('👩❤️💋👩 ') === 5 // expected 6

However, Twitch chat itself handles this correctly (screenshot from React Dev Tools).

How do I calculate the correct offset in JavaScript?

As far as I know, the positions Twitch gives are counted in Unicode code points. JavaScript uses UTF-16 instead of UTF-32, so it may need 2 code units for one code point, which means one “character” can have a length of 1 or 2. It gets more complicated if you want to count an emoji as a length of 1, because an emoji may be made up of several code points; that is what the Stack Overflow answer you linked is trying to handle. Twitch simply counts the number of code points.
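To make the difference concrete, here is a small sketch comparing the two ways of counting for the message you received. The variable names are mine, and the string is written with escapes so the code points are unambiguous:

```javascript
// woman, heavy black heart, variation selector-16, kiss mark, woman, space, "Kappa"
const message = "\u{1F469}\u2764\uFE0F\u{1F48B}\u{1F469} Kappa";

// .length counts UTF-16 code units.
const codeUnits = message.length;

// Spreading a string iterates it by code point.
const codePoints = [...message].length;

// Position of the "K" in each counting scheme.
const kCodeUnitIndex = message.indexOf("K");
const kCodePointIndex = [...message].indexOf("K");

console.log(codeUnits, codePoints);           // 14 11
console.log(kCodeUnitIndex, kCodePointIndex); // 9 6
```

The code point index of the K is the 6 from the emotes tag, while the code unit index is the 9 that `.length` reported.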

Here’s roughly what I do:

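let get_codepoint_to_codeunit_map = function(string){
    // array[i] = UTF-16 code unit offset at which code point i starts
    let array = [];
    let count = 0;

    // for...of iterates a string by code point, not by code unit
    for(let char of string){
        array.push(count);
        count += char.length; // 1 or 2 code units
    }

    return array;
};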

When you iterate with “for (let X of Y)”, JavaScript gives you one code point at a time, each with a length of 1 or 2 code units. I add those lengths up, so each array entry holds the code unit offset at which that code point starts. Pass in your message and you get that array back. When Twitch says an emote starts or ends at index 6, you look up index 6 in the array, which in this case holds the value 9. The character at index 9 in JavaScript is the K of Kappa.
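Here is a rough sketch of how the lookup could be used to cut the emote text out of the message. `sliceEmote` is a hypothetical helper of mine, not something from Twitch, and I repeat the mapping function so the snippet runs on its own. It assumes the end index in the tag is inclusive (6-10 for the five letters of Kappa), so it appends one extra entry to the array to be able to resolve the offset just past the last code point:

```javascript
// Same mapping as above, plus one trailing entry for the end of the string.
function codePointToCodeUnitMap(text) {
  const map = [];
  let count = 0;
  for (const char of text) {
    map.push(count);
    count += char.length;
  }
  map.push(count); // offset just past the last code point
  return map;
}

// Hypothetical helper: start and end are code point indices, end inclusive.
function sliceEmote(text, start, end) {
  const map = codePointToCodeUnitMap(text);
  return text.slice(map[start], map[end + 1]);
}

// woman, heavy black heart, variation selector-16, kiss mark, woman, space, "Kappa"
const received = "\u{1F469}\u2764\uFE0F\u{1F48B}\u{1F469} Kappa";

// emotes=25:6-10 from the message tags
const emoteText = sliceEmote(received, 6, 10);
console.log(emoteText); // Kappa
```

If a message has many emotes, you would probably want to build the map once per message and reuse it for every range instead of rebuilding it in each call.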

I hope this makes sense and works for you 🙂
