Exact format for WebSocket request header?

Apologies for the in-depth explanation, but there’s a tl;dr at the end :slight_smile:
Ok, so in this lockdown boredom I’m essentially trying to teach myself how to build a twitch bot from scratch using only node.js’s net module (and the tls module… because security!) for the tcp connections. My first foray was to connect using the irc protocol (irc.twitch.tv port 6667), which was nice and clean cut. But I’m trying to figure out how to get more in-depth information from the callbacks as well, such as follows, subs, and point redeem events. This has led down a rabbit hole of other questions, which has me curious about how to connect in other ways.
Now I understand I have a couple of options, WebSockets and Webhooks essentially, though that seems to be subdivided a bit (with… EventSubs being a future project?). I’ll add a second question topic about my confusion between the various connection endpoints (unsure this is the right word here) in order to keep things concise.
However, when it comes to connecting specifically to ws://irc-ws.chat.twitch.tv:80 I can’t seem to get the EXACT format of the header right. Here is what I can do so far in node.js:

const net = require('net');

let data_buf = '';
const host = 'echo.websocket.org';
const port = 80;

var client = new net.Socket().connect({
  port: port,
  host: host
}, function() {
    console.log('Connected');
    client.write('GET / HTTP/1.1\r\n'
                + 'Upgrade:websocket\r\n'
                + 'Connection:Upgrade\r\n'
                + 'Host:' + host + '\r\n'
                + 'Origin:http://www.websocket.org\r\n'
                +  '\r\n');

});
client.on('data', function(chunk) {
    data_buf += chunk.toString('utf-8');
    let data_str = '';
    while(data_buf.indexOf('\n') >= 0) {
        data_str = data_buf.substr(0, data_buf.indexOf('\n'));
        data_buf = data_buf.substr(data_buf.indexOf('\n') + 1);
        console.log(data_str);
    }
});

which is a sample that works for the websocket echo at an external public site.

HTTP/1.1 101 Web Socket Protocol Handshake
Access-Control-Allow-Credentials: true
Access-Control-Allow-Headers: content-type
Access-Control-Allow-Headers: authorization
Access-Control-Allow-Headers: x-websocket-extensions
Access-Control-Allow-Headers: x-websocket-version
Access-Control-Allow-Headers: x-websocket-protocol
Access-Control-Allow-Origin: http://www.websocket.org
Connection: Upgrade
Date: Sun, 17 Jan 2021 07:58:23 GMT
Server: Kaazing Gateway
Upgrade: WebSocket
WebSocket-Location: ws://echo.websocket.org/
WebSocket-Origin: http://www.websocket.org

The problem occurs when I attempt to switch to something similar for twitch:

const net = require('net');

let data_buf = '';
const host = 'irc-ws.chat.twitch.tv';
const port = 80;

var client = new net.Socket().connect({
  port: port,
  host: host
}, function() {
    console.log('Connected');
    client.write('GET / HTTP/1.1\r\n'
                + 'Upgrade:websocket\r\n'
                + 'Connection:Upgrade\r\n'
                + 'Host:' + host + '\r\n'
                + 'Origin:https://www.twitch.tv\r\n'
                + 'Sec-WebSocket-Protocol:irc\r\n'
                + 'Sec-WebSocket-Version:13\r\n'
                +  '\r\n');

});
client.on('data', function(chunk) {
    data_buf += chunk.toString('utf-8');
    let data_str = '';
    while(data_buf.indexOf('\n') >= 0) {
        data_str = data_buf.substr(0, data_buf.indexOf('\n'));
        data_buf = data_buf.substr(data_buf.indexOf('\n') + 1);
        console.log(data_str);
    }
});

and it turns out to be a Bad Request:

HTTP/1.1 400 Bad Request
Date: Sun, 17 Jan 2021 08:05:17 GMT
Content-Type: text/plain; charset=utf-8
Content-Length: 12
Connection: keep-alive
X-Content-Type-Options: nosniff

Bad Request

now at this point I know there’s going to be a LOT more involved with creating a raw websocket client from “scratch”(this ain’t assembler or something, I know :wink: ) but I figure I can probably get to the next step if I can just figure out(or better yet be linked to where in the docs it lists) the precise formatting I need in order to send an entire request header that twitch is happy with :slight_smile: (Bonus thank you’s if you can do this for pubsub-edge.twitch.tv as well!)

tl;dr → just need the precise formatting for an entire request header for the twitch websockets =)

Because it doesn’t use headers. There are no headers documented as required to be sent.

Here is an example for chat and pubsub, using only the nodeJS ws module as a dependency.

You’ve come at this from the complete no library end of this…

Hey Barry, really appreciate you taking the time to answer :slight_smile:

You’ve come at this from the complete no library end of this…

Yea, there really needs to be an elmofire gif inserted right about here →

Ok, so there’s some confusion somewhere and I only just got up and haven’t made coffee yet so bear with me a bit hehe. The confusion I figure is probably on my part but I’ll run through my understanding of the workings of things:

links to the WebSocket wikipedia entry which is the RFC 6455(RFC 6455 - The WebSocket Protocol) specification of the WebSocket protocol itself which says the initial handshake of a WebSocket itself is an “upgraded” http connection I think that is supposed to include a first set of request headers:

Upgrade: websocket
Connection: Upgrade

which should be mandatory, and expects a response of at least a Status Code of 101 (the Switching Protocols response) and this opens the underlying tcp connection to a full-duplex communication mode rather than http’s usual stateless mode. The WebSocket library used in your examples automatically does those for you as part of the spec
(I really hope I got those right lol)
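To make the "expects a 101" part concrete, here is a minimal sketch of checking a raw response head for a successful upgrade. The names parseHandshakeResponse and isUpgradeAccepted are made up for illustration and are not part of Node.js or any library; the sketch also assumes the whole response head has already arrived in one string, and it keeps only the last value when a header is repeated.

```javascript
// Hypothetical helper: split a raw HTTP response head into status code and headers.
function parseHandshakeResponse(rawHead) {
    const lines = rawHead.split('\r\n');
    const statusLine = lines.shift();
    // The status line looks like: HTTP/1.1 101 Switching Protocols
    const match = /^HTTP\/1\.1 (\d{3})/.exec(statusLine);
    if (!match) {
        throw new Error('Not an HTTP/1.1 response: ' + statusLine);
    }
    const headers = {};
    for (const line of lines) {
        if (line === '') break; // a blank line ends the response head
        const colon = line.indexOf(':');
        if (colon === -1) continue;
        // Header names are case-insensitive, so normalise them to lower case.
        const name = line.slice(0, colon).trim().toLowerCase();
        headers[name] = line.slice(colon + 1).trim();
    }
    return { statusCode: Number(match[1]), headers };
}

// Hypothetical helper: true when the server agreed to switch to WebSocket.
function isUpgradeAccepted(response) {
    return response.statusCode === 101
        && (response.headers['upgrade'] || '').toLowerCase() === 'websocket'
        && (response.headers['connection'] || '').toLowerCase().includes('upgrade');
}

const sample = 'HTTP/1.1 101 Switching Protocols\r\n'
    + 'Upgrade: websocket\r\n'
    + 'Connection: Upgrade\r\n'
    + '\r\n';
console.log(isUpgradeAccepted(parseHandshakeResponse(sample)));
```

A 400 Bad Request response like the one from Twitch earlier in the thread would make isUpgradeAccepted return false.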

Now I’m not sure if this is necessary to connect to ws://irc-ws.chat.twitch.tv and now that you say that I’ll check when I get back to my linux box =) only need the get request though? That’d be… odd

OH, but… in retrospect the WebSocket library is probably built on top of the http library and probably includes the request headers in the connection data, I’ll check back in if I discover anything.

The upgrade stuff only applies if you try to connect to a http server first then the server goes “you need to upgrade from http(s) to ws(s) instead”, I’ve not checked as I don’t care, but Twitch PubSub/IRC are both “straight socket servers”.

The connection strings for Chat and IRC are both “ws(s)” and you can’t really upgrade from “ws(s)” to something else.

I wish it were that easy! Hehe, the irc server at irc.twitch.tv port 6667 is a “straight socket”, which would be a generic tcp socket connection and that’s fairly easy to get the data from:

var net = require('net')

var client = net.connect({
  host: 'irc.twitch.tv',
  port: 6667
}).on('ready', () => {
    console.log('ready');
}).on('connect', () => {
    console.log('connect');
    client.write('PASS oauth:nopass\r\n');
    client.write('NICK justinfan123\r\n');

    client.write('CAP REQ :twitch.tv/commands\r\n');
    client.write('CAP REQ :twitch.tv/tags\r\n');
    client.write('JOIN #twitch\r\n');
}).on('data', buf => {
    console.log(buf.toString('utf8'));
});

No WebSocket needed, this is using node’s net module, which is mostly just a TCP connection; actually the http module is built on top of the net module as well.
Now a WebSocket is a completely different beast, and to see that this isn’t the same thing all you need to do is replace it with irc-ws.chat.twitch.tv port 80 (specifically this is a ws connection, NOT wss; wss is secure and requires the tls module instead of the net module, though a client doesn’t need to generate certificates of its own to connect). Notice that I’m leaving out the ws:// in front. In the anatomy of a url the “wss://” or “http://” dictates the protocol used, but if only using the underlying transport layer, which is tcp in this case, it’s up to the programmer to deal with the incoming protocol per its specs.
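As a rough sketch of the ws versus wss point: the scheme only decides which transport to open (net for ws, tls for wss) and which default port applies. The helper names transportFor and openTransport are invented for this example. The sketch assumes the server presents a publicly trusted certificate, so the client relies on Node's default trust store and generates nothing itself.

```javascript
const net = require('net');
const tls = require('tls');

// Hypothetical helper: work out which transport a ws:// or wss:// URL needs.
function transportFor(wsUrl) {
    const url = new URL(wsUrl);
    if (url.protocol !== 'ws:' && url.protocol !== 'wss:') {
        throw new Error('Not a WebSocket URL: ' + wsUrl);
    }
    const secure = url.protocol === 'wss:';
    return {
        secure,
        host: url.hostname,
        // url.port is an empty string when the URL uses the scheme's default port.
        port: url.port ? Number(url.port) : (secure ? 443 : 80),
        path: url.pathname + url.search
    };
}

// Hypothetical helper: opens the raw connection only.
// The HTTP upgrade request still has to be written to the returned socket.
function openTransport(wsUrl, onConnected) {
    const target = transportFor(wsUrl);
    if (target.secure) {
        // servername enables SNI; certificate checks use Node's default trust store.
        return tls.connect(
            { host: target.host, port: target.port, servername: target.host },
            onConnected
        );
    }
    return net.connect({ host: target.host, port: target.port }, onConnected);
}

console.log(transportFor('wss://irc-ws.chat.twitch.tv'));
```

Calling openTransport('wss://irc-ws.chat.twitch.tv', callback) would return a TLS socket that can be written to in the same way as the net socket in the samples above.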
In order for something to actually be a WebSocket it requires an http handshake which itself means that request headers MUST be sent and then receive a response of 101 (the Switching Protocols code). From the specs:

1.2.  Protocol Overview

   _This section is non-normative._

   The protocol has two parts: a handshake and the data transfer.

   The handshake from the client looks as follows:

        GET /chat HTTP/1.1
        Host: server.example.com
        Upgrade: websocket
        Connection: Upgrade
        Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
        Origin: http://example.com
        Sec-WebSocket-Protocol: chat, superchat
        Sec-WebSocket-Version: 13

   The handshake from the server looks as follows:

        HTTP/1.1 101 Switching Protocols
        Upgrade: websocket
        Connection: Upgrade
        Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
        Sec-WebSocket-Protocol: chat

also better at describing than me:

### [1.7](https://tools.ietf.org/html/rfc6455#section-1.7). Relationship to TCP and HTTP

_This section is non-normative._ The WebSocket Protocol is an independent TCP-based protocol. Its only relationship to HTTP is that its handshake is interpreted by HTTP servers as an Upgrade request.

Basically, it uses http requests for the upgrade and then converts to that upgrade.
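The Sec-WebSocket-Key and Sec-WebSocket-Accept values in the RFC sample handshake quoted above are linked: per RFC 6455 the server appends a fixed GUID to the key, takes the SHA-1 hash, and Base64-encodes it. A small sketch of that check, where expectedAccept is just an illustrative name:

```javascript
const crypto = require('crypto');

// Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
const WEBSOCKET_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Hypothetical helper: the accept value a server should send back for a given key.
function expectedAccept(secWebSocketKey) {
    return crypto
        .createHash('sha1')
        .update(secWebSocketKey + WEBSOCKET_GUID)
        .digest('base64');
}

// Key taken from the RFC sample handshake.
console.log(expectedAccept('dGhlIHNhbXBsZSBub25jZQ=='));
// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo=", matching the RFC sample response
```

A from-scratch client can compare this value against the Sec-WebSocket-Accept header it receives to confirm the server really understood the WebSocket handshake.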

I’ll have to browse the package a bit more to see if I can get the request headers, but the response headers are fairly readily available for your connection in your examples. Adding a new listener of

.on('upgrade', response => {
        console.log(response.rawHeaders);
});

to the socket Object in your github examples above will show the response headers that the http connection is sending to confirm the upgrade from http→WebSocket.

Slight misunderstanding there, not looking to upgrade from a WebSocket(wss) to anything, I’m looking for the header format of the request in order to upgrade from HTTP to a WebSocket protocol, though it appears I may have the answer available to me now and I can start testing that. The ws library contains a “.protocol” entry for a client connection, and since as your example shows it’s pretty easy to connect using the library:

const WebSocket = require('ws');
const socket = new WebSocket('wss://irc-ws.chat.twitch.tv');

socket.on('close', () => {
    console.log('Closed');
}).on('open', () => {
    console.log('Opened');
    console.log('Send Conn stuff');

    socket.send('PASS oauth:nopass');
    socket.send('NICK justinfan123');

    socket.send('CAP REQ :twitch.tv/commands');
    socket.send('CAP REQ :twitch.tv/tags');
    socket.send('JOIN #twitch');
}).on('message', (raw_data) => {
    console.log(raw_data);
}).on('upgrade', response => {
    // Response headers of the HTTP -> WebSocket upgrade
    console.log(response.rawHeaders);
});

If all else fails I’ll use the ws library to create a WebSocket server and check what the connection looks like on that end =)

I’ll see how far I can get from here, thanks for your time and patience :slight_smile:

You are really overcomplicating it here.

Just speak IRC if talking to the IRC ports
And speak websockets if talking to websockets.
Anything else you are just spinning wheels for no reason

The real question is why are you trying to over complicate things and mess about with headers for no reason?

Yep, I know I’m over-complicating the entire process here. I could just grab a WebSocket library and be done, or I could use tmi.js and be done (believe me, I referenced it a LOT during my decoding of the IRC messages with my first run through of just using irc). I’ve done all that, and as you can see I have no problem accessing it using any of those methods or even just the net module as I did above. The point is not necessarily to understand how to use someone else’s answer to the problem, but rather for me to deepen my understanding of the underlying layers to speak with it directly.

I know it’s difficult, I know it’s over-complicated and I know it’s a royal pain. That’s also entirely the point of me trying to do it this way! =) I’ve written lots of bots with the above methods. I’m not as interested in that anymore =) That’s why I was hoping someone might’ve had access to a quick copy/paste of a good header request to the address that twitch is happy with right off the bat, in order to save my fooling around with the minor nuances of the specific request(one thing off can easily throw a 400 Bad Request)

Not to my knowledge, since websockets is an accepted standard to the point that WS is a “standard” lib that is just in browsers.

Like fetch, websockets are just built into browsers.

You are trying to “reinvent the wheel” where the people who wrote the spec also wrote the “accepted” library to go with it.

IRC parsing is relatively mature, and regexes or tokenisers have been around for years.

well, yes and no, Browser compatibility can still be spotty though these days it IS significantly more common.

I’m not trying to reinvent the wheel, I’m trying to understand the underlying layers, the entire point is to learn. Is it… a problem for someone to want to do that? It’s sure seeming more like you are upset that I’m asking the question at this point when all I’m doing is asking a simple question on a technical forum: If someone has the syntax for the header in a format that they know Twitch’s WebSockets like.
And note on that: the ws library you’re using is NOT the “accepted” library, node.js specifically does not have an accepted WebSocket library, ws is a 3rd party library from npm that was CLONED from the built-in WebSocket library for browsers.

I’m aware, been on IRC for decades. I am unsure why you bring up the idea of tokenizers and regex. Note that in this case you use regex for the tokenizer and not the PARSER for the irc messages. In my case tmi.js was useful for breaking down twitch’s particulars, but this is still getting away from my original question about the websocket request header, which is unrelated to irc.

So instead of this weird go-around that we’re doing here, I’m going to try to cut through it all and just get back to the question for anyone else coming along around here:

Does anyone happen to have a sample or the exact header format Twitch likes for its WebSocket connections to pubsub-edge.twitch.tv or irc-ws.chat.twitch.tv? :slight_smile:

No, not upset at all. Just unsure as to why you are coming at it from this direction, when people more clever than you or I are building libraries to manage the “basics” of opening a connection, and if I wanted to learn I’d go to the WS code rather than trying to connect to Twitch.

If you want to learn how to essentially build a library that connects to a WebSocket, you’d be better suited looking at how libraries such as WS work and how they follow the specification, as opposed to looking at Twitch.

I was talking in general, not nodeJS specifically.

On the matter of nodeJS, I recall reading there is talk of bringing WS into node “natively” so it’s not a module you have to install separately.

You can continue down this path, but I doubt anyone has an answer. Since they don’t come at this problem (connecting to chat/pubsub via websockets at the rarest/basest level). And my point is that Websocket servers should all work the same, regardless of whom you are connecting to, so you are better looking at how the WS libraries (or similar) work to follow the spec, rather than poking about at Twitch.

EDIT

A quick look in inspector returns the request headers, that the browser WS library (in chrome) called to create the connection(s)

Chat headers:

PubSub headers:

EDIT 2 So I went and looked up the specification.

You can also refer to the specification, section 4 on page 17

Which also covers the required (and optional) headers.

The required header you’re missing is Sec-WebSocket-Key. Adding that you should get 101 Switching Protocols. Origin is not necessary, and unnecessarily misleading for the server.
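Putting that answer into code, here is a minimal sketch of a request head that includes Sec-WebSocket-Key and leaves Origin out unless one is passed in. buildHandshakeRequest is a made-up helper name, and the header set is an assumption based on RFC 6455 plus the answer above; I have not confirmed here what Twitch replies with.

```javascript
const crypto = require('crypto');

// Hypothetical helper: builds the opening handshake as one string.
// Origin is optional and omitted by default for a non-browser client.
function buildHandshakeRequest({ host, path = '/', origin } = {}) {
    // Sec-WebSocket-Key: 16 random bytes, Base64-encoded (24 characters).
    const key = crypto.randomBytes(16).toString('base64');
    const lines = [
        'GET ' + path + ' HTTP/1.1',
        'Host: ' + host,
        'Upgrade: websocket',
        'Connection: Upgrade',
        'Sec-WebSocket-Key: ' + key,
        'Sec-WebSocket-Version: 13'
    ];
    if (origin) {
        lines.push('Origin: ' + origin);
    }
    // Every header line ends with CRLF; an empty line terminates the head.
    return { key, request: lines.join('\r\n') + '\r\n\r\n' };
}

const { key, request } = buildHandshakeRequest({ host: 'irc-ws.chat.twitch.tv' });
console.log(request);
```

The request string would replace the hand-built one passed to client.write() in the Twitch sample from the first post. Keeping the key around lets the client check the Sec-WebSocket-Accept header in the response afterwards.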


Excellent! Direct and to the point, much appreciated =)
Since you seem to know what I’m talking about here I would be curious as to the technical reason why the origin isn’t necessary and is “unnecessarily misleading for the server”. I don’t know enough about the requirements of these headers yet so I’m super curious :slight_smile:

For anyone coming around later and needing to know what the “Sec-WebSocket-Key” header is:
it’s a nonce: 16 random bytes, Base64-encoded. Base64 strings are composed of (a-z, A-Z, 0-9, +, /) and have a length that’s a multiple of 4, with any leftover space padded with “=” (16 bytes always encode to 24 characters, so this one will always end with “==”). Here’s a useful function to generate a nonce for it:

const crypto = require('crypto');

// Returns the Base64 encoding of `byteLength` random bytes.
// Sec-WebSocket-Key needs exactly 16 bytes (RFC 6455, section 4.1),
// which encodes to 24 characters including the "==" padding.
function generatePaddedBase64String(byteLength = 16) {
    return crypto.randomBytes(byteLength).toString('base64');
}

Probably an argument there for performance but that ain’t why I’m here currently! lol

Appreciate the help @3ventic! :+1:

For example, you set the origin to Twitch.tv, but your code isn’t running on the website Twitch.tv it’s running from your Command Prompt.

It’s misleading as basically “you lied (or can lie) about where the code/client is connecting from”. See also section 4.2.1 (subsection 7) page 20.

Some servers will validate the origin header to ensure it’s from an accepted/permitted location (or origin). (4.2.2 sub section 4 page 22, and section 10 (security considerations) page 50).

As per the linked RFC/specification

The request MUST include a header field with the name |Origin| [RFC6454] if the request is coming from a browser client. If the connection is from a non-browser client, the request MAY include this header field if the semantics of that client match the use-case described here for browser clients. The value of this header field is the ASCII serialization of origin of the context in which the code establishing the connection is running. See [RFC6454] for the details of how this header field value is constructed.

It’s required for Browser Clients, but not for “desktop/command line clients”

Please refer to the WebSocket Protocol RFC/Specification I linked. The RFC document covers how a client should construct and connect to an RFC compliant server. And the general meanings/usage of each header in a client to server configuration and describes the responses the server will give.

Item 7 on page 18 of the websocket specification also describes this, but doesn’t provide a code example, since it’s just an RFC.
