The AV2 Video Standard Has Released (Final v1.0 Specification)

https://av2.aomedia.org
A few things - this is one step in a long, LONG path. AV2 is unusable in its current state (the encoder typically runs at around 1fps on good hardware), and likely will remain so until ~2028 when the first AV2 hardware-accelerated chips start dropping. Even then, I wouldn't expect AV2 streams to be common until 2030.

IMO, if it were just the efficiency gains on the table (which are substantial - ~20-30% over AV1), I'd say that AV2 isn't worth it. The biggest thing it does add though is multi-stream support, which will be a big win for VR and live sports. The other fun thing is you can send an alpha channel as a separate stream, which the file will then composite for proper transparent video support.
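For what it's worth, the separate alpha stream idea boils down to ordinary alpha blending at playback time. A minimal sketch, assuming straight (non-premultiplied) 8-bit alpha and frames stored as plain nested lists. The function names are made up for illustration, and nothing here reflects how AV2 actually signals or decodes the alpha stream:

```python
def composite_pixel(fg, alpha, bg):
    """Blend one foreground RGB pixel over a background RGB pixel.

    fg, bg: (r, g, b) tuples of 0-255 ints.
    alpha:  0-255 int taken from the separate alpha stream (assumed straight alpha).
    """
    a = alpha / 255.0
    return tuple(round(a * f + (1.0 - a) * b) for f, b in zip(fg, bg))


def composite_frame(color_frame, alpha_frame, background):
    """Apply composite_pixel to every pixel of three equally sized frames."""
    return [
        [composite_pixel(fg, a, bg) for fg, a, bg in zip(c_row, a_row, b_row)]
        for c_row, a_row, b_row in zip(color_frame, alpha_frame, background)
    ]
```

A fully opaque alpha sample returns the foreground pixel unchanged, and a zero alpha sample returns the background pixel.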

Based on AV1's trajectory, hardware encode isn't necessary (though it is nice). The current encoder is a reference encoder. Now that the spec is finalized, expect significant speed improvements from production encoders (realtime likely won't happen until we get it in hardware though)
Hardware encode is required if you want things like video calls, camera recording and such to use it.

It isn’t required for content distribution platforms, which aren’t realtime and where the cost of encoding is easily made up across hundreds of thousands of streams.
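The "made up by hundreds of thousands of streams" point is just a break-even calculation. A rough sketch, where every number is a made-up placeholder (encode cost, per-stream data volume, bitrate saving and bandwidth price are all assumptions, not figures from any provider):

```python
import math


def break_even_streams(encode_cost, gb_per_stream, bitrate_saving, cost_per_gb):
    """Return how many streams it takes for bandwidth savings to cover one encode.

    encode_cost:    one-off cost of encoding the title (currency units)
    gb_per_stream:  data delivered per stream with the old codec (GB)
    bitrate_saving: fraction of bitrate saved by the new codec (0.25 = 25%)
    cost_per_gb:    delivery cost per GB (currency units)
    """
    saving_per_stream = gb_per_stream * bitrate_saving * cost_per_gb
    if saving_per_stream <= 0:
        raise ValueError("the new codec must save something per stream")
    return math.ceil(encode_cost / saving_per_stream)


# Hypothetical inputs: a 50-unit encode, 2 GB per stream, 25% saving, 0.01 per GB.
streams_needed = break_even_streams(50.0, 2.0, 0.25, 0.01)
print(streams_needed)
```

With these placeholder inputs the encode pays for itself after roughly ten thousand streams, which a popular title clears easily and a one-off video call never does.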

One of the interesting uses of AV1 was specifically for low bitrate calls, and software encoding was perfectly fine, even on mobile.

With low enough resolution, framerate and bitrate, you can get a quality stream without significant encoding artifacts compared to any other codec. It is in production right now and has been for a while.

The CPU/bandwidth tradeoff is quite advantageous in situations like this. And no, AV1 HW encoders usually cannot be used: they are not designed for tight bitrate control or realtime communications the way software encoders usually are.

> One of the interesting uses of AV1 was specifically for low bitrate calls, and software encoding was perfectly fine, even on mobile.

You really want hardware decoding on mobile, otherwise you end up with 40 minutes battery life. Fortunately, for typical videoconference resolutions, VP8 and H.264 are just fine. AV1 is nice to have, though, due to excellent support for synthetic content (screen sharing), and for scalable video coding (a much more elegant solution than simulcast, IMHO).

In the world I live in, the general plan is to stick to VP8 and H.264 for the time being, and to skip to AV1 when it's universally available on mobile. I haven't seen any features of AV2 which would justify waiting for it.

Had you said this about audio codecs, I would have agreed. I do not know of a single smartphone video conferencing app that uses CPU encoding rather than hardware encoding. Neither WhatsApp nor FaceTime, perhaps the two largest real-time video call services, uses AV1.
Yeah, no production or large scale VC system is running software AV1 encoders on smartphones. You will drain a full phone battery in 1-2 hours of calls.

It just doesn’t make sense and will result in extraordinary power/battery drainage at best, or output that’s worse than hardware encoding.

The only way you could get AV1 to software encode in realtime AND low latency on a mid-range Android chip is by disabling or skipping nearly all of the compression/encoding features that make it good at low bitrate.

> Yeah, no production or large scale VC system is running software AV1 encoders on smartphones. You will drain a full phone battery in 1-2 hours of calls.

Yeah but, half jokingly, Zoom does that (draining the battery in an hour) already :P

So, status remains quo, the commons remain tragic, and glory to H.264 forever?
At least until a better codec has widespread enough hardware support, I think.
Anything running on a battery will need hardware acceleration
> The biggest thing it does add though is multi-stream support

I would have thought this would be a part of the container format rather than the video codec?

The way things are going, we can pretty much forget about AV2 hardware encoders in PCs anytime soon. All the newest, best chip capacity is being completely hogged by Apple and AI companies.

Unless chipmakers port the AV2 design to older, cheaper nodes, it’s just not happening for average users. We’ll probably see some Chinese TV chip makers throw in an AV2 decoder just to check a box, but as an actual encoder? I wouldn't count on it anytime soon.

I wouldn't be so pessimistic, Intel and AMD aren't going to stop making CPUs, and if their integrated graphics adds AV2 it will be motivation enough for others to follow.
PCs don’t need hardware encoders. There are no realtime jobs on these. You can let it encode in however long it takes.

You need hardware encoders for things like cameras because they need to encode in real time since the buffer would quickly overflow otherwise.

Use cases for hardware encoders in PCs:

- Video calls

- Screen/webcam recording

- Live streaming

- Real-time transcoding for media servers (don’t know much about this but I’ve heard it’s a thing)

- Game streaming

- Video editing (making exporting less frustrating)

Everything here is really niche, except video calls (and even that...).

In other words, outside of smartphones, don't expect broadly distributed AV2 encoding hardware.

If it does happen on PC, it will most likely be a courtesy of the chip designers.

Looking at how GPU development has been sidetracked for NPUs, I worry that this is a 2035 target at best. Manufacturers will push to maximise matrix-operation silicon area. In the era of trillion-dollar investments into datacenters, traffic cost is an afterthought. The only beneficiaries might be YouTube, Netflix and such, but at their scale, investment into ISP-level caches might be cheaper.
> enabling high-quality video delivery at significantly lower bitrates

> likely will remain so til ~2028 when the first av2 hardware accelerated chips start dropping

This might sound dumb, but what's the point if it's intended for slower devices, but those slower devices don't even exist yet?

It's not for slower devices, it's for lower data transfer bills for providers like YouTube.
It's a win for consumers as well if you can get better video quality or more reliable calls on a slower connection.
so that new devices can adopt it

They can't adopt it if it doesn't exist.

And what about old devices? I'm sure someone out there is still using an s5 as a daily driver... Future proofing is great and all, but 240p on modern devices looks like trash, even worse than tube tv.
240p for someone’s webcam thumbnail in Microsoft teams is perfectly sufficient though.
They can keep using whatever they're currently using.

This doesn't take away anything. It's a new standard.

Based on your argument, one should add new safety standards to cars like seat belts, because old cars might not have them.

one shouldn't*

One shouldn't add new safety standards*

writing before coffee...

To be fair, some municipalities do actually require old cars to be fitted with seatbelts... air bags, not so much, because apparently changing a steering column is too hard?
Well, I just noticed I said 'should' and not 'shouldn't' which I wanted to say

But to your point,

While they might not be required to retrofit, one shouldn't stop defining new safety standards.

In HW-accelerated chips, which part of the calculations do they usually accelerate? Could it be possible to repurpose old HW?
Essentially all of the processing of the video data, barring the container format which the CPU uses to know what part of the data to send to the GPU or the Audio chip or codec.

And HW acceleration is generally a preset baked in version of the encoder or decoder. These are mostly codec specific.

So, no using hardware from previous versions.

Now, you can see some software that tries to use the GPU itself, instead of the dedicated hardware acceleration, to decode, but that isn't HW acceleration, and may not operate in real time.

At the same time, that will consume much more power, eliminating some of the advantages of the pure HW rendition, especially important for mobile.

I could see an argument being made for encoding, if it is 2x or faster than the CPU, but I haven't looked at any in a while, so don't know the speeds.

One of the biggest gains of having dedicated hardware is that the computation doesn’t happen on the general hardware.

This is what makes it viable on mobile devices where system responsiveness and power efficiency are high priority.

Generally these hardware decoders haven’t been retoolable.

Typically things like quantization and motion estimation.
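To give a feel for why motion estimation is worth putting in silicon, here is a toy full-search block matcher using the sum of absolute differences (SAD). It is only a sketch under simple assumptions (single-channel frames as nested lists, integer-pixel motion, made-up function names); real encoders and hardware blocks use far more elaborate searches:

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a block in cur and a block in ref."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, bx, by, size, radius):
    """Try every integer offset within +/-radius and keep the lowest-SAD one.

    Returns ((mx, my), cost), where (mx, my) is the motion vector pointing
    from the block at (bx, by) in cur to its best match in ref.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for my in range(-radius, radius + 1):
        for mx in range(-radius, radius + 1):
            rx, ry = bx + mx, by + my
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (mx, my), cost
    return best_mv, best_cost
```

Every block costs up to (2 * radius + 1)^2 SAD evaluations, each touching size^2 pixels, and this repeats for every block of every frame. That kind of regular, massively repeated arithmetic is what fixed-function hardware is good at.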
I feel like in 2030 it's more likely that we send 480p and just upscale with ai on the other end
480p? More like 200i. There's a race to the bottom driven by those "up to" codecs.
> The biggest thing it does add though is multi-stream support

This was supported in H.264 MVC but only saw real use for 3D movies on physical Blu-rays, with almost no content available outside that.

That's fine and not anything new for codecs, they always take a long time before mass adoption.

Take a look at AV1 itself, you can't even say it's really ubiquitous on all hardware. It's quite well along in adoption compared to early days, but some mobile devices are still lacking hardware acceleration for it.

It is still ironic to me that my Steam Deck has decode AV1 acceleration, on a really old CPU/GPU combo.
AMD added AV1 encoding only in later SoCs though. Next Steam Deck will have both.
>........I'd say that AV2 isn't worth it.

Unless they have the hardware encoder and decoder design done in parallel, it would be 2028 before a hardware block design is done and 2030 for the earliest product to ship with it. In reality I think 2031 or 2032 is more likely.

And I have been saying the same for quite some time that 20-30% for a generational codec improvement isn't worth it. I think they originally aimed at 50%, and then 40% and then 30%.

Where do you see information about the efficiency gains over AV1?
I feel like these encoders that require acceleration are a big reason why good hardware obsoletes so quickly.
> The biggest thing it does add though is multi-stream support, which will be a big win for VR and live sports.

AV1 supports it too ?

Does anyone know what is so costly to calculate in AV2?
In general they just increase the numbers on everything when they go up a generation.

e.g. if you check in 4 directions to see if you can reuse a chunk then make it check in 8 or 16.

Faster encoders will have smart heuristics on when to use these new abilities and when to skip them but the reference encoder will try everything in a dumb way to eke out a tiny win to maximize a theoretical advantage and map out the extreme best case.
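The "try everything" versus "smart heuristics" difference can be shown with a toy search over candidate coding options. This is only an illustration: the candidates, the cost function and the `good_enough` threshold are invented, and no real AV2 encoder is claimed to work this way:

```python
def exhaustive_search(candidates, cost_fn):
    """Reference-encoder style: evaluate every candidate, keep the cheapest."""
    best, best_cost, evaluated = None, None, 0
    for candidate in candidates:
        cost = cost_fn(candidate)
        evaluated += 1
        if best_cost is None or cost < best_cost:
            best, best_cost = candidate, cost
    return best, evaluated


def early_exit_search(candidates, cost_fn, good_enough):
    """Production-encoder style: stop as soon as a candidate is good enough."""
    best, best_cost, evaluated = None, None, 0
    for candidate in candidates:
        cost = cost_fn(candidate)
        evaluated += 1
        if best_cost is None or cost < best_cost:
            best, best_cost = candidate, cost
        if best_cost <= good_enough:
            break  # accept a slightly worse result to save work
    return best, evaluated
```

With 16 candidates, the exhaustive version always pays for 16 evaluations, while the early-exit version may stop after a handful and settle for a slightly more expensive choice. Doubling the candidate count from one codec generation to the next doubles the cost of the first approach, but not necessarily the second.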

What I'm interested in is seeing how this will improve the AVIF image format. AVIF stomps the competition for low-bitrate still images (where chroma subsampling is used). For lossless images, not so much. Lossless JPEG XL and lossless WEBP make lossless AVIF look like a joke.
AV1 is being actively claim-charted by a lot of companies right now, and lawsuits are almost certainly coming. The same process is already starting for AV2, but most players are waiting for the AV1 cases to mature first.

People keep calling the AV-family codecs “royalty free,” but in practice they increasingly look like a legal and financial gamble.

People have been saying this for decades now.

I've never understood why some people seem to cheer this on like a corporation owning some maths was their local sports team.

For a while I assumed some people had put in a lot of effort on H.264 encoders and so the digital sharecroppers were angry and jealous that someone might be advocating for messy freedom.

But some people seem to just enjoy the thought of corporations putting a tax on video distribution.

Luckily those greedy corporations have repeatedly shot themselves in the foot and so their influence is waning.

How long has it been since AV1 was released? About eight years, and there's still no credible patent holder. The vultures are always circling around compression standards. You shouldn't take that too seriously. Even if a lawsuit is filed, there's a legal defence fund to protect against baseless claims.
Most importantly, there is no alternative. As kasabali said, the patent situation for the other codecs is a mess. Additionally, they could be hit by the same "No FRAND" problem. If someone comes forward with a patent that was used unintentionally, the situation is the same for all codecs.
> People keep calling the AV-family codecs “royalty free,” but in practice they increasingly look like a legal and financial gamble.

And the alternative is… ?

For H.265 there are two HEVC licensing pools you have to sign with plus at least two non-pool companies:

* https://en.wikipedia.org/wiki/High_Efficiency_Video_Coding#P...

Going with a non-AVx codec is no less complicated and fraught with lawsuit risk AFAICT.

> in practice they increasingly look like a legal and financial gamble

As opposed to what, like HEVC? Where you need to pay 3 different patent pools to be sure (which all have different terms), and then there are still other patent holders that aren't in any pool and can hit you with royalty requests at any time, under whatever terms they like?

It should not be possible to patent communication standards. The opportunity for abuse through lock-in effects is just too big.
And how long will it take before someone implements this standard and gets sued because Adobe or Dolby or whoever wanted to get slapped down? My knowledge may be out of date but if this is as "open" as AV1, I'm very skeptical that the individual companies will actually allow that. Greed and all that.
Mostly a joke... I've been waiting for the AV1 Apple TV, so now I'm just waiting for AV2 support on Apple TV as well.
My 10 year old iPhone 7 can play 1080p AV1 video in software for more than 200 minutes with VLC. The iPhone 7 was released a year and a half before AV1 was.

So I think it's a safe bet the current Apple TV devices are capable of playing AV1 video in software. There's a VLC release for Apple TV:

https://www.videolan.org/vlc/download-appletv.html

https://apps.apple.com/us/app/vlc-media-player/id650377962?p...

Not especially relevant, as the obvious use of AV1 on the Apple TV is streaming, and the OS frameworks don't request AV1 without hardware decoding. Services which provide their own video decoding (are there any?) don't seem interested in providing their own software decoder for the ATV, despite the bandwidth savings.
Six years ago Apple TV added support for playing 4K YouTube video:

https://9to5mac.com/2020/06/22/tvos-14-brings-support-for-st...

YouTube's 4K videos are only available in VP9 and AV1.

So the YouTube app on tvOS has supported at least VP9 for six years and I wouldn't be surprised if it supports AV1 today.

Apple A17 Pro / A18 include AV1 hardware decode.
The latest Apple TV is on the A15:

https://en.wikipedia.org/wiki/Apple_TV_(device)#4K_(3rd_gene...

There may be a new Apple TV released this year.

> There may be a new Apple TV released this year.

The evergreen prediction in the Apple TV world :p. IIRC, Mark Gurman initially predicted the next model would be out in H1 of 2024.

Outside the apple ecosystem, AV1 is supported nearly everywhere.
They’ve had hardware decoding since M3 and equivalent A cpus. So I’d say it’s pretty well supported.
Neat but the real world still, and for decades to come - lives on h264 https://www.wink.co/documentation/Why-H264-Is-Almost-Always-...
I'm not an expert on video encoding.

But I wonder if the future could depend less on fixed-function compression methods and more on AI networks that recreate the video but weigh much less than a compressed video.

Neural codecs such as github.com/Orange-OpenSource/Cool-Chic

It will probably depend on whether NPUs are universally available in smartphones, and whether we get a standard API for accessing NPUs. But I don't know whether AI-based codecs can have battery usage competitive with fixed-function hardware.
AV1 was already a big leap toward efficient and open video formats. I've been waiting for AV2 for a long time.

Sure, it'll take a while until it's implemented in chips and hardware so that we get efficient and fast hardware encoding/decoding.

But a ~25% higher efficiency sounds very promising in times of increasing storage prices and chip crises.

Dav2d doesn't have the same nice ring to it. I hope there's someone with a decent repo-name punning skill who'll contribute before that.

avi2ude? av2go?

It was difficult to find a nice name, with av2 :(

It works in French d2vid (Deuvid)

At least it's not D4vd.
daviid - or trim to davii and pronounce it "davey". But tbh I quite like dav2d.
I like it - not as punny as the first but pretty straightforward.
Looking forward to a decently speedy encoder coming around. The reference one for AV1 is really not that great, and the same is true here. But as soon as we get SVT-AV2 or whatever, I'll be a very happy camper.
Has anyone performed an encoding and decoding benchmark with the reference codec yet? I'd expect encoding to be dreadful, but maybe the decode is already usable.
Is there an AV1 2.0? I'm not using this codec if they can't do basic semantic versioning right.
It takes a few years for vendors to support hardware decoding for a new standard, so we won't see it in widespread use anytime soon.
Congrats!

How is the case of fighting off Dolby's patent racketeering going? They tried to attack Snapchat for using AV1.

Now the real question: will The Industry™ again need nerds and digital privateers to actually write and graft all the psy coding tools that make their encoders usable (and I mean the word, using x264 as the benchmark) for the quality-conscious, and not just for CPU-efficient VoD blurred to death with marketing yelling "PSNR! PSNR! PSNR!" at the top of their lungs? Will FGS be more usable?

Take a look at https://gitlab.com/AOMediaCodec/SVT-AV1/-/work_items/2269 for details (PS: SVT-AV1 claims to be suitable for AVIF yet doesn't support YUV444, lel)

I’m curious how much AV2 will actually help older hardware in practice.

I’m on a 2019 Intel MacBook Pro: 2.6 GHz 6-core i7, 64 GB RAM. The machine is still more than powerful enough for normal desktop work and software dev, but YouTube in Chrome has become borderline unusable for me. My internet is fine, Safari plays the same videos smoothly, and YouTube “Stats for nerds” shows plenty of buffer, but the decoding makes YouTube unusable in Chrome for me.

Has nothing to do with video codec.

Download the video with yt-dlp & play it in mpv you'll see it even flies on a potato.

Play the same in browser and it'll be dropping frames left and right.

I use Firefox but YouTube has recently started giving me a pop-up occasionally telling me that they are intentionally slowing down the site because they don't like some of the browser extensions I use.
Sounds like a Chrome/YouTube problem. My 2012 MacBook Pro plays 1080p AV1 just fine in VLC (pretty sure YouTube works fine too in Firefox, but I didn't check whether or not it was AV1 or H.264).
For reference: dav1d 0.5 can decode 143 FPS of a 1080p 8-bit video on a third gen core i7.[1] I doubt there's been much in the way of regressions since then. 10-bit and 4k is obviously a lot more heavy, but not really relevant to older devices.

[1] https://www.phoronix.com/news/dav1d-0.5 (it's mislabeled as a core i3, but the 3770K is a core i7).

use the enhanced h264-ify to block the av1 stream, av1 takes a lot of cpu
Unfortunately for you, newer codecs use more CPU than older ones so AV2 would probably be even worse.
Newer codecs have generally introduced more ways a video can be encoded, so the encoder needs to work much harder: to actually achieve the gains the newer codec allows, much more processing is required. Decoding, on the other hand, will mostly stay the same or increase only slightly. It's not likely to decrease though, so if you struggle playing AV1 today, you will also struggle with AV2.

For encoding, you can always write a simple encoder that uses only the features that were present in MPEG-2, and it will be about as efficient as MPEG-2 as well. Newer codecs have more features that allow more efficient encoding, at the cost of more processing.

I gave up using Chrome a decade ago. It’s a power sucking pig. Safari has its own issues, but at least it’s usable. When I need something that isn’t Safari, I use Firefox.
Safari has a terrible developer experience and has been behind in implementing the various browser APIs for years, including AV1 support.
Safari has had AV1 support for a long time. The absolute majority of the "various browser APIs" that it doesn't support are all kinds of "Web Bluetooth"-like crap that I don't need and which only introduce attack and tracking surface when supported.