Old 4th September 2023, 20:09   #1  |  Link
otfuttr
Registered User
 
Join Date: Aug 2023
Posts: 5
Variable resolution codec?

Codecs have evolved to be able to vary the bitrate and frame rate... why not also vary the resolution?

Netflix has already been doing this for a while now with convex hull encoding: https://netflixtechblog.com/per-titl...n-7e99442b62a2. They encode each video at multiple resolutions and bitrates, then their custom solution picks the optimal resolution for a given bitrate, but none of this is built into the codec itself. On the web, you can use HLS or DASH to switch between renditions, but afaik there is no codec or container that has native support for this kind of behavior. Considering the benefits have been known for a while now, isn't it time to start adopting this into codecs?

I know AV1 has a primitive form of this called super resolution, but it is limited to 2x downsampling and it's not really enabled by default like a standard feature. I'm talking about fully arbitrary scaling, from 240p all the way up to 4K, plus everything in between, as a core feature. A new paradigm for codecs, putting the days of picking a resolution in the past. You just choose a quality level, and the encoder figures out the best resolution+bitrate pair on the convex hull, not just for the video, but for each scene and even individual frames.
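
To make the "best resolution+bitrate pair on the convex hull" idea concrete, here is a minimal sketch. It is not Netflix's method or any encoder's API: the encode list, the quality scores, and the helper names (`pareto_front`, `upper_convex_hull`, `pick`) are all hypothetical and only illustrate the selection step once you already have quality measurements per resolution and bitrate.

```python
from typing import List, Optional, Tuple

# (resolution label, bitrate in kbps, quality score); every value is made up
Point = Tuple[str, int, float]

ENCODES: List[Point] = [
    ("540p", 500, 70.0), ("540p", 1000, 78.0), ("540p", 2000, 82.0), ("540p", 4000, 84.0),
    ("720p", 500, 64.0), ("720p", 1000, 79.0), ("720p", 2000, 84.0), ("720p", 4000, 88.0),
    ("1080p", 500, 55.0), ("1080p", 1000, 74.0), ("1080p", 2000, 86.0), ("1080p", 4000, 93.0),
]

def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only encodes that beat every encode at an equal or lower bitrate."""
    front: List[Point] = []
    best = float("-inf")
    for p in sorted(points, key=lambda p: (p[1], -p[2])):
        if p[2] > best:
            front.append(p)
            best = p[2]
    return front

def upper_convex_hull(points: List[Point]) -> List[Point]:
    """Upper convex hull of the (bitrate, quality) points, left to right."""
    hull: List[Point] = []
    for p in pareto_front(points):
        while len(hull) >= 2:
            (_, x1, y1), (_, x2, y2) = hull[-2], hull[-1]
            # cross >= 0 means hull[-1] lies on or below the line hull[-2] -> p
            cross = (x2 - x1) * (p[2] - y1) - (y2 - y1) * (p[1] - x1)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

def pick(hull: List[Point], max_kbps: int) -> Optional[Point]:
    """Highest-quality hull point that fits within the bitrate budget."""
    candidates = [p for p in hull if p[1] <= max_kbps]
    return candidates[-1] if candidates else None

HULL = upper_convex_hull(ENCODES)
print([label for label, _, _ in HULL])
print(pick(HULL, 1500))
```

With these toy numbers the hull moves from 540p to 720p to 1080p as the bitrate grows, which is the switching behavior described above. A codec-native version would have to make the same choice per scene or per frame instead of per title.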

Has anyone experimented with something like this? If not, why? Are gains minimal/not worth the effort? Too computationally demanding? Are there patent issues? Thoughts?
Old 4th September 2023, 22:25   #2  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,344
The benefit of such an idea is not quite as large as it may seem, because you can just drop some detail or encode with larger blocks, and most of the benefit of reducing the resolution disappears. It's not as if modern codecs literally encode each and every pixel on its own.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 5th September 2023, 04:52   #3  |  Link
rwill
Registered User
 
Join Date: Dec 2013
Posts: 343
This is spatial scalability, more or less. Most video standards since MPEG-2 have supported such a thing.

It somehow never took off for MPEG-2, AVC and HEVC, maybe because the coding efficiency losses are too high or because there was no market demand. We will see how it goes with VVC.
Old 5th September 2023, 09:09   #4  |  Link
birdie
Artem S. Tashkinov
 
Join Date: Dec 2006
Posts: 337
Almost all online video delivery websites already do that. You don't need a new codec for it; any existing one will work. Depending on your bandwidth, you get the best resolution for it.

While this is fine for such websites since they have massive storage and money, this will not work for most end users who have limited storage and it doesn't make a lot of sense either: CRF and target bitrate already take care of that.
Old 5th September 2023, 18:55   #5  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,752
Quote:
Originally Posted by birdie View Post
Almost all online video delivery websites already do that. You don't need a new codec for that, any existing one will work. Depending on the bandwidth you're getting the best resolution for it.
Because sometimes you can do 1080p anime at 1 Mbps, but 8 Mbps isn't enough for an intense action scene. The idea is that varying frame size will allow perceptual quality to be more constant at a given bitrate.

Quote:
While this is fine for such websites since they have massive storage and money, this will not work for most end users who have limited storage and it doesn't make a lot of sense either: CRF and target bitrate already take care of that.
The same idea can be helpful for small sites as well. If the limitation is a particular file size or bitrate, figuring out which frame size gives the best bang for the bit can help a lot, particularly if there is a lot of variation in content complexity.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 6th September 2023, 02:07   #6  |  Link
otfuttr
Registered User
 
Join Date: Aug 2023
Posts: 5
Quote:
Originally Posted by nevcairiel View Post
you can just drop some detail or encode with larger blocks
Good point. You can think of blocks as a kind of intra-frame variable resolution, so increasing block size seems equivalent to reducing resolution, but you'd have to increase block size a ton. E.g. going from 480p to 2160p would require a 4.5x increase in block size (per dimension) to achieve equivalent coverage of the frame. That might cause performance issues.
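
The 4.5x figure is just the ratio of frame heights. A quick sketch of the arithmetic, assuming a 16x16 block at 480p as the starting point (the block size is an illustrative choice, not something fixed by the argument):

```python
def linear_scale(src_height: int, dst_height: int) -> float:
    """Per-dimension scale factor between two frame heights."""
    return dst_height / src_height

scale = linear_scale(480, 2160)        # 2160 / 480 = 4.5
block_at_480p = 16                     # assumed block edge length in pixels
equivalent_block = block_at_480p * scale
area_ratio = scale ** 2                # pixels per block grow with the square

print(scale, equivalent_block, area_ratio)  # 4.5 72.0 20.25
```

So a block covering the same fraction of the frame would be 72x72 at 2160p, holding about 20 times as many pixels as the 16x16 block at 480p.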
Old 6th September 2023, 13:14   #7  |  Link
Selur
Registered User
 
Join Date: Oct 2001
Location: Germany
Posts: 7,259
Quote:
why not also vary the resolution?
Don't the vpx and AV1 encoders support spatial resampling? (At least vpxenc and aomenc should support this.)
Quote:
Spatial resampling involves scaling the image down to a smaller size in the encoder (as an alternative method for reducing the number of bits per frame to increasing the quantizer) and then scaling it back up in the decoder. Note that frames can be dropped at any time but the encoder can only change its spatial re-sampling ratio on a key frame.
see for example: https://www.webmproject.org/docs/encoder-parameters/
__________________
Hybrid here in the forum, homepage
Old 6th September 2023, 20:11   #8  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,752
Quote:
Originally Posted by Selur View Post
Don't vpx and av1 encoders support spatial resampling? (At least the ones vpxenc and aomenc should support this,..)

see for example: https://www.webmproject.org/docs/encoder-parameters/
Yeah. It allows resolution adaptability at the stream level instead of at the player heuristics level, at the cost of some flexibility.

Basically, if there's a really hard sequence that would look bad at the current bitrate and resolution, scaling down acts as a sort of "emergency blur" to quadruple bits per pixel, and so push QPs way down.
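
The "quadruple bits per pixel" part follows directly from halving both dimensions. A minimal sketch, where the 4 Mbps bitrate, 24 fps, and the two frame sizes are arbitrary example values:

```python
def bits_per_pixel(bitrate_bps: float, width: int, height: int, fps: float) -> float:
    """Average bits available per pixel per frame at a given bitrate."""
    return bitrate_bps / (width * height * fps)

full = bits_per_pixel(4_000_000, 1920, 1080, 24)   # full-resolution frame
half = bits_per_pixel(4_000_000, 960, 540, 24)     # both dimensions halved
ratio = half / full                                # about 4: a quarter of the pixels

print(round(full, 4), round(half, 4), round(ratio, 2))
```

The same bitrate spread over a quarter of the pixels is what lets the encoder drop QP on a hard sequence, at the cost of the detail lost in the downscale.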

This could be very powerful combined with Film Grain Synthesis, as the grain itself would still be rendered at full resolution. There are plenty of 35mm films, particularly Super35, which are essentially 720p actual detail with a 4K film grain layer on top anyway.

Alas, real-world FGS implementations have shipped with bugs that have prevented broad use of FGS with AV1. Perhaps with AV2, or as a generalized FGS filter that could be triggered by metadata for any codec.
Old 6th September 2023, 23:32   #9  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,988
This is making me think of LC-EVC - where you'd often encode at 1/4 or 1/2 resolution to keep the QP low, and then let their enhancement layer reconstruct high frequencies during upscaling. HE-AAC's SBR for video, basically
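
A toy illustration of the base-plus-enhancement idea, on a 1-D signal in plain Python. This is not the actual LCEVC algorithm: the averaging downsampler, the sample-repeat upsampler, and the lossless residual are simplifying assumptions, and a real enhancement layer would use a better scaler and code the residual lossily.

```python
from typing import List

def downsample2(signal: List[float]) -> List[float]:
    """Halve the resolution by averaging neighbouring pairs (even length assumed)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]

def upsample2(signal: List[float]) -> List[float]:
    """Double the resolution by repeating each sample."""
    return [v for v in signal for _ in range(2)]

original = [10, 12, 50, 54, 20, 18, 90, 94]

base = downsample2(original)        # what the base codec would encode
predicted = upsample2(base)         # decoder-side upscale of the base layer
residual = [o - p for o, p in zip(original, predicted)]       # enhancement layer
reconstructed = [p + r for p, r in zip(predicted, residual)]  # base + enhancement

print(base)
print(residual)
print(reconstructed == original)
```

The base layer carries half the samples, and the residual carries only the high-frequency differences the upscale could not predict.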
__________________
These are all my personal statements, not those of my employer :)
Old 7th September 2023, 18:29   #10  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,752
Quote:
Originally Posted by Blue_MiSfit View Post
This is making me think of LC-EVC - where you'd often encode at 1/4 or 1/2 resolution to keep the QP low, and then let their enhancement layer reconstruct high frequencies during upscaling. HE-AAC's SBR for video, basically
Yeah, it's the same sort of thing, another kind of out-of-loop post processing.

There could be some value in having the scaling be in-loop, so a frame could be predicted from a scaled frame of a different resolution.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
benwaggoner is offline   Reply With Quote
Old 8th September 2023, 01:36   #11  |  Link
otfuttr
Registered User
 
Join Date: Aug 2023
Posts: 5
Quote:
Originally Posted by Selur View Post
Don't vpx and av1 encoders support spatial resampling? (At least the ones vpxenc and aomenc should support this,..)

see for example: https://www.webmproject.org/docs/encoder-parameters/
I was aware of AV1 support, but I was not aware vpx supported this. Is it fully arbitrary scaling, or only a limited range? I can't find any details, or even a mention of it, in the spec here: https://storage.googleapis.com/downl...0331-draft.pdf

AV1 is limited to 2x downsampling, but there are a lot of steps in between: the scaling ratio is 8/d for an integer denominator d, from 8/9 down to 8/16. Source: https://gitlab.com/AOMediaCodec/SVT-...-Resolution.md
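
To see what those steps look like for a 1920-wide frame, here is a small sketch that walks the denominators 9 through 16. The round-to-nearest integer division is my recollection of how the AV1 spec derives the coded width, so treat that detail as an assumption; the function name is hypothetical, and as far as I know super resolution only scales the width, not the height.

```python
SUPERRES_NUM = 8  # fixed numerator of the scaling ratio

def downscaled_width(upscaled_width: int, denom: int) -> int:
    """Coded width for a ratio of 8/denom (rounding rule is an assumption)."""
    if not 9 <= denom <= 16:
        raise ValueError("denominator must be between 9 and 16")
    return (upscaled_width * SUPERRES_NUM + denom // 2) // denom

widths = {d: downscaled_width(1920, d) for d in range(9, 17)}
for d, w in widths.items():
    print(f"8/{d}: {w} px")
```

That gives eight discrete steps between roughly 89% and 50% of the original width, which is a fixed ladder, not the fully arbitrary scaling asked about in the first post.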
Old 8th September 2023, 02:19   #12  |  Link
otfuttr
Registered User
 
Join Date: Aug 2023
Posts: 5
Quote:
Originally Posted by benwaggoner View Post
There could be some value in having the scaling be in-loop, so a frame could be predicted from a scaled frame of a different resolution.
Yea, I was thinking of exactly something like this.

For example, in H.264 you normally have B-frames encoded at a much higher QP than I-frames. At a sufficiently high QP, you may be removing most of the high-frequency detail anyway, so you are effectively decreasing the resolution. But say you effectively decrease the resolution by 2x: since the max macroblock size is 16x16, you're effectively limiting yourself to 8x8 macroblocks, and so you end up in a sub-optimal region below the convex hull.

So what if, instead of increasing QP a lot, you reference an I-frame encoded at full resolution but encode the B-frame at 75% resolution and a lower QP? If you find the optimal point on the convex hull for all B-frames, could there be significant efficiency gains?

Last edited by otfuttr; 8th September 2023 at 02:25.
Old 8th September 2023, 05:59   #13  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,752
Quote:
Originally Posted by otfuttr View Post
Yea, I was thinking of exactly something like this.

For example in h264, you normally have B-frames encoded at much higher QP than I-frames. For sufficiently high QP, you may be removing most of the high frequency detail anyways, so you are effectively decreasing the resolution. But say you effectively decrease resolution by 2x, but since max macroblock size is 16x16, you're effectively limiting yourself to 8x8 macroblocks, and so you enter a sub-optimal region on the convex-hull.

So what if instead of increasing QP a lot, you reference an I-frame encoded at full resolution, but encode the B-frame at 75% resolution but at lower QP. If you find the optimal point on the convex hull for all B-frames, could there be significant efficiency gains?
Yeah, something like that could be interesting.

It would have been a slam-dunk feature for something like MPEG-2 or VC-1, which had really steep quality drop-offs as QP got too high.

However, the trend in codecs is toward better and better in-loop prediction for error concealment. In most cases, HEVC and AV1 aren't going to look much worse at 1080p than at a lower resolution at the same bitrate. Since high QPs don't produce sharp block edges but get kinda soft, the result is not all that different from what lowering the resolution gives you.

VVC does even better in this regard, with much more natural looking motion edges with high QP predicted TUs. So I think for the most part we're solving the problem in other ways.

The key thing for a streaming service is to know that you can use higher resolutions at a given bitrate for some classes of content. Going smaller isn't nearly as valuable.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
benwaggoner is offline   Reply With Quote
Old 12th September 2023, 08:02   #14  |  Link
Wiabol
Registered User
 
Join Date: Aug 2021
Posts: 2
https://bitmovin.com/vvc-open-gop-resolution-switching/
Old 12th September 2023, 19:04   #15  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,752
Awesome! I'll make sure to ask more about this in my meeting with Bitmovin at IBC.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
benwaggoner is offline   Reply With Quote
Old 14th September 2023, 22:36   #16  |  Link
otfuttr
Registered User
 
Join Date: Aug 2023
Posts: 5
Quote:
Originally Posted by Wiabol View Post
https://bitmovin.com/vvc-open-gop-resolution-switching/
Interesting. I didn't realize VVC already had this. I'll need to look into the details some more.
All times are GMT +1.