Trying to scale a Twitch (tmi.js) bot

Hello, I own a Twitch bot that runs 9 instances: 8 custom instances (1 channel per instance) and one standard instance that has joined 450+ channels. The bot uses tmi.js on Node.js. Every so often, and now more regularly, one or more of the instances fails to answer Twitch’s ping in time and gets disconnected. The bot then has to rejoin every channel, which can take 10 minutes to complete and is frustrating.

I upgraded my DigitalOcean VPS to 2 dedicated CPUs and 8 GB of RAM, but the disconnections still happen; the upgrade did little except increase my bill. I don’t know what the best way to scale my bot efficiently is. I should also note that PM2 shows a very high P95 Event Loop Latency but a decent Event Loop Latency.

When about 3 of my single-channel custom instances were disconnected, even though they were not receiving many events, the P95 Event Loop Latency decreased. That leads me to think these custom instances have a noticeable impact on the overall performance of the application.
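To cross-check the PM2 numbers from inside the process, here is a minimal sketch that uses Node's built-in `perf_hooks.monitorEventLoopDelay`. The helper name `readEventLoopDelay`, the 20 ms resolution and the 10 second reporting interval are my own assumptions for illustration, not part of my bot or of PM2.

```javascript
const { monitorEventLoopDelay } = require('perf_hooks');

// Sample the event loop delay; `resolution` is the sampling interval in ms.
const histogram = monitorEventLoopDelay({ resolution: 20 });
histogram.enable();

// Hypothetical helper: the histogram reports nanoseconds, so convert to ms.
function readEventLoopDelay(h) {
    return {
        meanMs: h.mean / 1e6,
        p95Ms: h.percentile(95) / 1e6,
        maxMs: h.max / 1e6
    };
}

// Assumed reporting interval: log every 10 seconds, then start a fresh window.
const timer = setInterval(() => {
    console.log(readEventLoopDelay(histogram));
    histogram.reset();
}, 10000);

// Do not let the reporting timer alone keep the process alive.
timer.unref();
```

If the P95 value here spikes at the same moments an instance misses its ping, that would at least suggest the message handlers are blocking the event loop rather than the VPS being too small.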

app.js (run on application startup)

const db = require('quick.db'),
    channelNames = new db.table('channelNames'),
    instanceManager = require('./InstanceManager');

instanceManager.CreateInstance(require('./configurations/standard.json'), channelNames.get('channels') ?? [], 'standard');
// 450+ channels running on a single instance

instanceManager.CreateInstance(require('./configurations/cayke.json'), [ '#caykexd' ], 'cayke');
// There are about 8 of these single-channel custom instances.

InstanceManager.js (required by app.js at startup; exports CreateInstance, which constructs the class below)

const tmi = require('tmi.js'),
    chalk = require('chalk'),
    db = require('quick.db'),
    phin = require('phin'),
    instances = new db.table('instances');

let Instances = [],
    FailedAttempts = {};

function getToken (name) {
    return instances.get(`${name}.token`);
}

async function refreshToken (refresh, clientID, clientSecret, name, data) {
    let options = {
        'client_id': clientID,
        'client_secret': clientSecret,
        'refresh_token': refresh,
        'grant_type': 'refresh_token',
    };

    let compiled_options = new URLSearchParams();

    for (let key in options)
        compiled_options.append(key, options[key]);

    await phin({
        url: `https://id.twitch.tv/oauth2/token?${compiled_options.toString()}`,
        method: 'POST',
        parse: 'json'
    }).then(req => {
        instances.set(name, { token: req?.body?.access_token });
        new InstanceManager(data.config, data.channels, data.name);
    }).catch(error => {
        console.error(error);
    });

}

class InstanceManager {
    constructor(config, channels, name) {
        if (!config || !channels || !name)
            throw new Error('You must provide a configuration object, a channels array and a name.');
        else {
            this.config = config;
            this.channels = channels;
            this.name = name;

            const client = new tmi.Client({
                options: {
                    debug: false,
                    messagesLogLevel: "info",
                    skipUpdatingEmotesets: true,
                    skipMembership: true
                },
                connection: {reconnect: false, secure: true},
                identity: {
                    username: this.config.username,
                    password: getToken(name)
                },
                channels: this.channels
            });

            client.connect().catch(async (error) => {
                console.log(`${chalk.gray('[')}${chalk.red('!')}${chalk.gray(']')} ${error}`);
            });

            client.on("disconnected", (reason) => {
                console.error(`Disconnected for ${reason}`);
                if (!FailedAttempts[name])
                    FailedAttempts[name] = 1;
                else FailedAttempts[name]++;

                if (FailedAttempts[name] < 3) {
                    refreshToken(config["refresh-token"], config["client-id"], config["client-secret"], name, this)
                        .catch(error => {
                            console.error(error);
                        });
                }
            });

            client.on('message', async (channel, userstate, message, self) => {
                // require other file to do stuff
            });

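            // Remove any stale entry registered under this name. splice() is used
            // instead of delete so the array is not left with empty slots.
            for (let i = Instances.length - 1; i >= 0; i--) {
                if (Instances[i]?.name === name)
                    Instances.splice(i, 1);
            }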

            Instances.push({name, client});

        }
    }
}

module.exports = {
    Instances,
    CreateInstance: (config, channels, name) => {
        return new InstanceManager(config, channels, name);
    },
};

What could I change in my code or my deployment to make my bot scale? I’ve tried researching clustering with tmi.js, but haven’t had much luck finding out what to do. I am willing to switch to something like AWS, as I have heard it can scale horizontally, but I have never used it.
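One idea I have considered is splitting the 450+ channels of the standard instance into smaller groups, so that a single disconnect only forces a rejoin of one group. Below is a minimal sketch of the splitting step only. `chunkChannels` and the group size of 50 are hypothetical, and I have not verified that this actually helps: every group would still share one event loop unless each runs in its own process, and because `getToken(name)` looks tokens up by instance name, each new name would need its own token entry.

```javascript
// Hypothetical helper: split a channel list into groups of at most `size`.
function chunkChannels(channels, size) {
    if (!Number.isInteger(size) || size < 1)
        throw new RangeError('size must be a positive integer');

    const groups = [];
    for (let i = 0; i < channels.length; i += size)
        groups.push(channels.slice(i, i + size));
    return groups;
}

// Assumed usage with the existing CreateInstance export (group size is a guess):
// const config = require('./configurations/standard.json');
// chunkChannels(channelNames.get('channels') ?? [], 50).forEach((group, index) => {
//     instanceManager.CreateInstance(config, group, `standard-${index}`);
// });
```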

If you’d like more information about my deployment or application, feel free to ask.