Devlog: fixing my WebRTC for chrome

posted on 24.02.2025

Welcome back to Morj’s epic battle against web standards and insufficient documentation. You may remember that our journey started with a web page containing a single button, which somehow managed to break in chromium while working perfectly on the first attempt in safari. And now our adventure of complaining about google and about web standards continues!

If a webrtc connection is not being established in chrome in the final app, but we started with a stolen minimal working prototype, the reasonable idea would be to go back to the prototype and see when it stopped working. Right? Turns out, it didn’t actually work in chrome back then either! What’s better, it didn’t work in firefox either! What! I went through several commits starting from the first, and bizarrely, none of them worked. How the hell did I manage to write it?

With this I decided to vibe-debug it: look at suspicious places in the code, change them and see what happens.

My first culprit was the difference in handling of ICE candidates.

That was wrong. The current culprit is this: I remember that with webrtc you can’t negotiate a connection unless it has data streams attached. So what if in chrome you can’t negotiate a connection even with a data stream, and you need something else entirely? This is supported by some stackoverflow answers where they add random bullshit to the peer connection. So I tried adding a microphone stream to the watcher side, then to both sides - still nada.
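Roughly the kind of tweak I mean, as a sketch (not my actual code; assumes a module context for the top-level await):

```ts
// Sketch: make sure the connection has *something* attached before
// negotiating.
const pc = new RTCPeerConnection({
  iceServers: [{ urls: "stun:stun.l.google.com:19302" }],
});

// With no tracks and no data channels, createOffer() produces an SDP
// with no media sections, so there is nothing to negotiate at all.
pc.createDataChannel("keepalive");

// The stackoverflow-flavoured "random bullshit": also attach a
// microphone track on the watcher side and hope chrome likes it.
const mic = await navigator.mediaDevices.getUserMedia({ audio: true });
for (const track of mic.getTracks()) {
  pc.addTrack(track, mic);
}

const offer = await pc.createOffer();
await pc.setLocalDescription(offer);
// ship `offer` to the other side over your signaling channel here
```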

The next step is to lie in despair for two months.

So, it’s about 10 years later and I got the motivation back. Here’s what I tried next: finding other minimal webrtc examples and seeing whether they work in chrome. They don’t. For some reason I also asked chatgpt to write one, and actually - it’s pretty minimal! And I only had to kick it a couple of times for it to start working. But not in ungoogled-chromium, anyway.

I got attached to the chatgpt minimalest example and tried a lot of different tweaks, and in the end I thought: what if it’s ungoogled-chromium’s fault? And you know what? You fucking know what? Yes, it was. Holy shiiiiiiit.

The worst part about it is that while debugging way back then, I did find bug reports saying that webrtc doesn’t work in ungoogled-chromium. But I also found the same kind of bug reports saying webrtc doesn’t work in chrome proper. So nothing gave it away.

And an even worse part? Pokesz (the name I chose for my super app) still doesn’t work; only chatgpt’s example does. But pokesz with some of the random tweaks I made back in the pre-despair months - that works. I don’t remember what they do: one adds a microphone, another changes some ICE handling, and the rest I didn’t document here; nor did I actually commit them, I just have a bunch of changes sitting in my git worktree. So now all that’s left is, as I wanted at the beginning, to figure out the difference between the working example and the non-working one. Finally, I’m on my fucking way to the light at the end of the tunnel.

Remember when this project started as a small step towards testing out erlang and elixir? Oh my.

Still attached to the minimal chatgpt thingy, I found out that it’s flaky itself: it works when chrome offers video first and firefox offers video second - then both parties get the video. I wonder what happens if I try chrome with chrome, but I don’t wonder about it too much.
This tells me the problem is with unidirectional connections, and what do you know: I can google some scarce results about exactly this! Fuck me, why is chrome so difficult.
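For the record, the spec’s way to declare a one-way stream is through transceiver directions. A sketch, assuming the watcher is the side creating the offer (this is the textbook pattern, not what the chatgpt example did):

```ts
// Watcher side: we attach no tracks of our own, but we still want a
// video section in the offer, so declare a receive-only transceiver.
const pc = new RTCPeerConnection();
pc.addTransceiver("video", { direction: "recvonly" });

pc.ontrack = (event) => {
  // The streamer's video lands here once negotiation succeeds.
  const video = document.querySelector("video")!;
  video.srcObject = event.streams[0] ?? new MediaStream([event.track]);
};

const offer = await pc.createOffer();
await pc.setLocalDescription(offer);
// send the offer through the signaling channel as usual
```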

This evidence tells me that, of all the changes to pokesz, the one that actually worked was adding the microphone. Which sucks, because I don’t want the watchers to have to allow the microphone; that might look like a breach of privacy to them, right? Especially since the intended users are people who are not going to read the sources.

So I had a genius idea: what if I create some media locally and add its stream to the connection? That way I avoid asking for a microphone but still have media attached to the connection from this side. Will this work? The answer is: I get fucked, it still doesn’t. Whyy.
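The trick I tried looks roughly like this (a sketch reconstructed from memory, since I never committed the real diff): capture a stream from a tiny canvas instead of asking for a microphone.

```ts
// Sketch: a synthetic local stream, so the connection has media
// attached without triggering any permission prompt.
const canvas = document.createElement("canvas");
canvas.width = 2;
canvas.height = 2;
// captureStream() only produces frames if the canvas gets drawn to.
canvas.getContext("2d")!.fillRect(0, 0, 2, 2);

const fakeStream = canvas.captureStream(1); // 1 fps is plenty
const pc = new RTCPeerConnection();
for (const track of fakeStream.getTracks()) {
  pc.addTrack(track, fakeStream);
}
```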

And finally, in a stroke of luck, I figured it out: turns out the streaming did start, but chrome simply, silently, didn’t play it, because of its autoplay protection. The policy is that it doesn’t play video without user interaction unless the video is muted. And you would think that when I added the microphone, the prompt asking to allow the page to use it is what counted as interaction, right? WRONG, it’s the fact that the microphone was allowed at any point! If you give a wildcard microphone permission to localhost like I did, that alone counts as “having interacted with the page”, even when I haven’t interacted at all.
And the worst part is that the video I’m trying to autoplay completely lacks an audio track, but apparently even that doesn’t count as muted. God damn it.
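So the actual fix is embarrassingly small. A sketch of the pattern (assuming `pc` is the watcher’s peer connection and `#watch` is the video element):

```ts
pc.ontrack = (event) => {
  const video = document.querySelector<HTMLVideoElement>("#watch")!;
  video.srcObject = event.streams[0];
  // A missing audio track is not enough for chrome; the element itself
  // must be muted for autoplay to be allowed.
  video.muted = true;
  video.play().catch(() => {
    // Autoplay still blocked: fall back to a real user gesture.
    video.addEventListener("click", () => void video.play(), { once: true });
  });
};
```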

Also, I forgot to mention: at one point I realized that I handled null ICE messages incorrectly. You have to signal the end of ICE gathering somehow, and you can’t simply send the null message itself, because it’s invalid; you have to craft your own signal, and then a special faux-ICE candidate on the other side. Genius API, right.
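What that looks like in practice, as a sketch (the message shapes are my own invention; `signaling` stands in for whatever channel you have):

```ts
// Sender side: a null candidate means gathering is finished, but you
// can't serialize null as a candidate, so wrap it in your own message.
pc.onicecandidate = (event) => {
  if (event.candidate) {
    signaling.send(JSON.stringify({ type: "ice", candidate: event.candidate }));
  } else {
    signaling.send(JSON.stringify({ type: "ice-done" }));
  }
};

// Receiver side: translate the custom signal back.
signaling.onmessage = async (msg: MessageEvent) => {
  const data = JSON.parse(msg.data);
  if (data.type === "ice") {
    await pc.addIceCandidate(data.candidate);
  } else if (data.type === "ice-done") {
    // End-of-candidates: per spec, an addIceCandidate() call with no
    // argument (some code crafts a faux { candidate: "" } instead).
    await pc.addIceCandidate();
  }
};
```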

And so I went and tested it out on my friends. The most requested change was a button to stop the stream, and, logically, a button to restart it as well. Fun fact: webrtc itself doesn’t communicate stream stopping, for whatever reason; the other party only learns that you aren’t going to send any more data by waiting for a timeout. The official recommendation is to send this information through your signalling channel. Which is a bit crazy, right? Alternatively, I could establish a data channel within the same webrtc connection and send one bit when I stop streaming, so the receiver can close all the other streams (sketched below). Nah, it’s not important enough, so timeout it is. This means I currently have a race condition in the watcher where a “paused” label may mis-update if the stream was restarted. For some reason it never triggers, even though the timing for it is very generous in theory. Well, whatever. I hate webrtc and I hate webdev, fuck it.
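For completeness, the data-channel variant I decided against would be something like this (a sketch; `stopButton` and `showPausedLabel` are hypothetical UI bits, and `pc` is the peer connection on each side):

```ts
// Streamer side: a control channel next to the media tracks.
const control = pc.createDataChannel("control");
stopButton.onclick = () => {
  for (const sender of pc.getSenders()) sender.track?.stop();
  control.send("stopped"); // the one bit saying we're done
};

// Watcher side: react immediately instead of waiting for a timeout.
pc.ondatachannel = (event) => {
  event.channel.onmessage = (msg) => {
    if (msg.data === "stopped") showPausedLabel();
  };
};
```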

Next I would like to somehow add sound support. Did I already mention that the MediaStreams API says that the sound of the application should be captured? And that MDN explicitly says that no browser follows this part of the specification? I did say that in the last chapter, and I still find it a giant pain in the ass. I hate webdev.
The alternative I have is creating a virtual microphone in linux that loops the program output back as an input and presents this stream to the browser. This is not really clean, and it won’t work for anyone other than me (my customers are computer illiterate, remember [and use windows (which is the same thing)]). So I leave this for chapter three, question mark?

Anyway, you can now go play with it at https://random.test.morj.men. At some point it’ll move to a proper address. Or you can set up your own instance from https://git.morj.men/morj/pokesz. Have fun! I’m off to rewrite the server in Erlang. Finally, a worthy task! My name is morj and you have been webdeveloped.
