I tried to build a list of all currently live streams by walking the paginated https://api.twitch.tv/kraken/streams/ endpoint until I ran out of streams. However, the resulting list has a lot of duplicate streams, and the number of unique streams retrieved is thousands below the total number of live streams.
I expect that’s because the pagination gets me a snapshot of a constantly changing list of streams, so streams move between the pages as I’m stepping through them, and many of them deftly avoid ever being inside the page I’m currently querying.
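To make that suspicion concrete, here is a toy simulation rather than anything that talks to the real API. It assumes the endpoint behaves like a plain offset/limit slice over a ranking that changes between requests; `fetch_page` and the integer stream IDs are made up for illustration.

```python
from typing import List


def fetch_page(ranking: List[int], offset: int, limit: int) -> List[int]:
    """Stand-in for one offset-based page request against the current ranking."""
    return ranking[offset:offset + limit]


def walk_while_stream_goes_offline() -> List[int]:
    ranking = list(range(10))             # ten live streams, IDs 0..9
    seen = fetch_page(ranking, 0, 5)      # first page: 0..4
    ranking.remove(2)                     # stream 2 ends; everything below moves up one slot
    seen += fetch_page(ranking, 5, 5)     # stream 5 slid onto the first page and is skipped
    return seen


def walk_while_stream_goes_live() -> List[int]:
    ranking = list(range(10))
    seen = fetch_page(ranking, 0, 5)      # first page: 0..4
    ranking.insert(0, 42)                 # a new stream enters at the top; everything moves down
    seen += fetch_page(ranking, 5, 5)     # stream 4 slid onto the second page and shows up twice
    return seen
```

In the first walk stream 5 is never returned even though it was live the whole time; in the second, stream 4 is returned twice. With thousands of streams starting, stopping, and changing rank during a long walk, both effects would add up.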
Is there a better way to go about this, or am I missing something? It looks like there is no good way: the API would either need to give me a list of all streams (maybe just the IDs) in one go, or provide a cursor through a temporarily stable set of results. I guess I could repeatedly query the same page and accumulate streams until I feel like I got most of them, but that does not seem like the best use of anyone’s resources.
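For what it's worth, this is roughly what I mean by a cursor through a temporarily stable result set. It is only a sketch of a hypothetical server-side helper, not something the Twitch API offers; `SnapshotPager`, `start`, and `page` are names invented here.

```python
import itertools
from typing import Dict, List


class SnapshotPager:
    """Hypothetical helper: freezes the list of stream IDs when a walk starts."""

    def __init__(self) -> None:
        self._snapshots: Dict[int, List[int]] = {}
        self._tokens = itertools.count(1)

    def start(self, live_ids: List[int]) -> int:
        """Copy the current ID list and hand back a token that identifies the copy."""
        token = next(self._tokens)
        self._snapshots[token] = list(live_ids)  # copy, so later changes do not leak in
        return token

    def page(self, token: int, offset: int, limit: int) -> List[int]:
        """Serve a page from the frozen copy, not from the live list."""
        return self._snapshots[token][offset:offset + limit]
```

Every page of one walk comes from the same frozen list, so nothing is skipped or duplicated. Some of the returned streams may have ended by the time the client processes them, which is fine for my purposes. A real service would also need to expire old snapshots, which this sketch leaves out.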
My first thought is “why” would you need that list in the first place… And yes, it’s very much like asking the question “What are the names of the pedestrians currently walking on this downtown block.” Very quickly, the list becomes obsolete and inaccurate as people move around.
What’s the core “question” you’re trying to solve?
I think it would be nice to get a look at streams that aren’t in the top 100, or whatever, in their respective games.
Well, the approach of “must have it all” isn’t going to work very well in a constantly changing ecosystem like Twitch. You’re going to have to approach it like a surveyor or an advertiser and come up with some criteria/algorithm to take a decent “sample” of what’s going on that represents what you’re looking for the best.
This difficulty is incidental to the task at hand; it’s just an artifact of the interface. I’m not expecting all the streamers in the result set to continue streaming until I am done processing them.
Getting a consistent snapshot of a few tens of thousands of stream IDs from one program to another isn’t some unsolved problem of computer science.
OK then, I’m not going to be of much help, since there’s a disconnect on “what” you’re actually doing. “It’d be nice to have a list of not top 100” isn’t really an answer, and I tried to give a general response to that…
So, I guess… please be extremely specific as to what you’re trying to accomplish.
Sorry if I’m being unclear, but I’m pretty much interested in discussing whether it’s possible with the current API, or should be possible with the API in some ideal state, to get a list of all the streams. It seems like the current implementation falls a bit short of the intention behind the API endpoint.
It isn’t designed around this. I suspect that is to prevent leaking Twitch’s business performance, i.e., you could competitively analyze Twitch’s business.
They’re obviously not going to give you that, so they’ll probably provide no official support for it.
It’s not really hiding anything. There’s an endpoint for the total number of viewers and channels, and getting the first couple hundred streams for each game, where like 90% of the viewers are, isn’t really a problem. And if I were a competitor who for some reason really needed all the detailed data on every stream, I wouldn’t mind repeatedly spamming requests for every page throughout the pagination until I had about as many unique streams as the total number of streams indicates I should get.
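The brute-force approach could look something like the sketch below. It makes several assumptions: `fetch_page(offset, limit)` is a hypothetical wrapper around the streams endpoint that returns a list of stream objects (empty once the offset is past the end), each stream carries a unique `_id` field, and `expected_total` comes from whatever total count the API reports. The `coverage` and `max_passes` values are arbitrary.

```python
from typing import Callable, Dict, List

Stream = dict
FetchPage = Callable[[int, int], List[Stream]]  # (offset, limit) -> streams; hypothetical


def collect_streams(
    fetch_page: FetchPage,
    expected_total: int,
    limit: int = 100,
    coverage: float = 0.95,
    max_passes: int = 5,
) -> Dict[int, Stream]:
    """Walk all pages repeatedly, deduplicating by ID, until enough unique streams are seen."""
    seen: Dict[int, Stream] = {}
    for _ in range(max_passes):
        offset = 0
        while True:
            page = fetch_page(offset, limit)
            if not page:                      # ran off the end of the list
                break
            for stream in page:
                seen[stream["_id"]] = stream  # assumed ID field; later sightings overwrite earlier ones
            offset += limit
        if len(seen) >= coverage * expected_total:
            break                             # close enough to the reported total
    return seen
```

Keying on the ID takes care of the duplicates, and the extra passes pick up streams that dodged an earlier walk. The result can still include streams that ended in the meantime, and each pass costs a full set of requests, so this is not exactly friendly to the API.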
But, yeah, I guess the pagination is really not designed around requesting more than one or two pages.