502 Bad Gateway on API Calls causing up to 60% failure rate for cron jobs

I run a VOD archive service that is spoiler-free, properly sorted, and feature-rich.

This means I make a lot of API calls to keep the archive constantly updated, even though it only covers about 10 Twitch channels. I check for new VODs frequently and do a full update a few times a day.

Checking for new VODs requires 1 API call. A full update requires many API calls per channel until all VODs are cataloged.

I’ve recently made a statistics tool to see when my cron jobs fail and for what reason. Some of my full update jobs have a 60% failure rate due to Bad Gateway errors from Twitch. By my definition, a failure means the job could not fetch all the data for a specific channel. These channels have anywhere from 50 to 4000 VODs.
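As a rough illustration of why a job that needs many calls can fail so often: if every call in a full update has to succeed, even a small per-call error rate compounds. The sketch below assumes calls fail independently and that nothing is retried. The numbers in the usage note are made up for illustration, not measurements from Twitch.

```python
def job_failure_rate(per_call_failure, calls_per_job):
    """Chance that at least one of `calls_per_job` calls fails.

    Assumes independent failures and no retries (a simplification).
    """
    return 1 - (1 - per_call_failure) ** calls_per_job
```

For example, `job_failure_rate(0.02, 40)` comes out at a little over one half, so a hypothetical 2% per-call error rate would already fail more than half of all 40-call jobs.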

The response I get is: 502 Bad Gateway - The server returned an invalid or incomplete response.

Is this a problem with the Twitch servers or is it some undocumented error? The only API restriction I know of is to not make more than 1 call per second, and I have configured my scripts to stay under that limit as much as possible. I might exceed it now and again if more than one job runs at the same time.
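For reference, keeping calls at least one second apart inside a single script could look roughly like the sketch below. `Throttle` is a hypothetical helper, not part of any Twitch library, and the injectable `clock` and `sleep` parameters are only there to make it testable.

```python
import time


class Throttle:
    """Space calls at least `min_interval` seconds apart within one process."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self.last_call = None  # time of the previous call, None before the first

    def wait(self):
        """Block until at least `min_interval` has passed since the last call."""
        now = self.clock()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now
```

Calling `throttle.wait()` before each request keeps one job under the limit. It does not coordinate separate cron jobs that run at the same time; those would need something shared between processes, such as a lock file.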

During peak hours, Twitch’s API is very unstable. If you take a look around the forums, you’ll see this is a rather common occurrence. It’s not known whether Twitch is actively working to stabilize the API, but it has been getting worse since the start of weekend events such as those run by Riot Games and ESL.

Yes, our API is flaky, and yes, we are working on increasing its stability. Most of our apps have retry logic to minimize the impact of a request failing, and I’d recommend the same approach for your cron jobs. Just make sure to cap the number of retries and use exponential backoff and you should be fine.
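A minimal sketch of capped retries with exponential backoff, assuming your request code raises an exception on a 502. `TransientError`, `call_with_backoff`, and the default values are hypothetical names and numbers chosen for illustration, not anything Twitch provides.

```python
import random
import time


class TransientError(Exception):
    """Raised by the caller's fetch function for retryable failures such as a 502."""


def call_with_backoff(fetch, max_retries=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fetch(), retrying on TransientError with capped exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted, let the cron job record the failure
            # The delay doubles on each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Small random jitter (an addition, not something required above)
            # keeps jobs that failed together from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

Wrap each API request in a function that raises `TransientError` on a 502 and pass it as `fetch`. With the defaults, a request is tried at most six times before the job gives up.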

Ah, thank you for clearing this up.
I have a single retry per failure with a 1 second delay in between at the moment.

Much appreciated!
