Why decode to then turn around and re-encode?
Reading their product page, it seems like Recall captures meetings on whatever platform their customers are using: Zoom, Teams, Google Meet, etc.
Since they don't have API access to all these platforms, the best they can do to capture the A/V streams is simply to join the meeting in a headless browser on a server, then capture the browser's output and re-encode it.
They‘re already hacking Chromium. If the compressed video data is unavailable in JS, they could change that instead.
loading story #42069540
loading story #42068828
I had the same question, but I imagine that the "media pipeline" box with a line that goes directly from "compositor" to "encoder" is probably hiding quite a lot of complexity
Recall's offering allows you to get "audio, video, transcripts, and metadata" from video calls -- again, total conjecture, but I imagine they do need to decode into raw format in order to split out all these end-products (and then re-encode for a video recording specifically.)
my guess is either that video they get use some proprietary encoding format (js might do some magic on the feed) or it's because it's latency optimized stream that consumes a lot of bandwidth