21st April 2014, 10:56   #1
kidmany2001
cookieman

Join Date: Mar 2014
Posts: 6
Does VP9 have B-frames or P-frames?

Hi everybody,
I have read the official WebM (VP9) site and learned that the VP9 encoder lets you set the interval between I-frames.
But it doesn't say what types of frames lie between the I-frames.
I asked an expert, who told me VP9 has no coding structure like HEVC's hierarchical B.

So, in VP9 encoding, what kinds of frames sit between two I-frames?

Are they just normal reference frames, or does VP9 still use B-frames and P-frames?
Can I encode a sequence as IBBBBB, as in HEVC's random access test condition?

I am quite confused. Can anyone explain?

Thanks.
21st April 2014, 11:10   #2
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,347
VP9 uses only P-frames; it does not have B-frames.

It does, however, offer a few extra features for managing reference frames, called AltRef and Golden frames.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
21st April 2014, 12:51   #3
kidmany2001
cookieman

Join Date: Mar 2014
Posts: 6
Thanks for the answer.

Is there any documentation about VP9's P-frames?

I would like to read about them.
21st April 2014, 12:58   #4
mandarinka
Registered User

Join Date: Jan 2007
Posts: 729
Check this thread for a description, mainly the "Reference frame management" and "Inter prediction" sections: http://forum.doom9.org/showthread.ph...86#post1647086

From that it seems to me that altref and golden frames are just names now and the real scheme works differently this time (closer to H.264's multiple referencing scheme, but with some complications). And there is something, obviously named differently, that mimics B-frames (compound prediction).

Quote:
Originally Posted by pieter3d View Post
However, VP9 does support “compound prediction”, which really is just another word for bi-prediction where there are two motion vectors for each block and the two resulting prediction samples are averaged together. In order to avoid patents on bi-prediction, compound prediction is only enabled in frames that are marked as not-displayable. A frame like this is never output for display, but may be used for reference later. In fact, a later frame may consist of nothing but 64x64 blocks with no residuals and 0,0 motion vectors that point to this non-displayed frame, effectively causing it to be output later using very little data.
So maybe the answer is "yes, it has B-frames, but don't tell anybody, we want to pretend it is not borrowing from MPEG at all"? Well, that might be a bit on the provocative side, but that's how it looks to me.
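
To make the quoted description concrete, here is a minimal Python sketch of compound prediction as pieter3d describes it: two motion-compensated predictions averaged sample by sample. The function name, the tiny 2x2 blocks, and the round-half-up rule are illustrative assumptions, not taken from the VP9 specification or from libvpx.

```python
def compound_predict(pred_a, pred_b):
    """Average two equally sized prediction blocks sample by sample.

    pred_a and pred_b are lists of rows of integer sample values, one block
    per reference frame. Rounding half up is an assumption of this sketch.
    """
    if len(pred_a) != len(pred_b):
        raise ValueError("blocks must have the same height")
    out = []
    for row_a, row_b in zip(pred_a, pred_b):
        if len(row_a) != len(row_b):
            raise ValueError("blocks must have the same width")
        out.append([(a + b + 1) >> 1 for a, b in zip(row_a, row_b)])
    return out


# Hypothetical 2x2 predictions fetched through two different motion vectors.
from_past = [[100, 102], [104, 106]]
from_future = [[110, 111], [100, 107]]
averaged = compound_predict(from_past, from_future)
```

Nothing in the averaging itself cares whether the two references lie in the past or the future; that only depends on which frames the encoder chose to keep around.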

Last edited by mandarinka; 21st April 2014 at 13:00.
21st April 2014, 13:01   #5
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,347
At the very least it doesn't have frame-reordering typically caused by B-Frames, because they wanted to keep that complexity out. Instead it has ref frames which are not displayed and discarded by the decoder.. oh well.
21st April 2014, 20:23   #6
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
Originally Posted by nevcairiel View Post
At the very least it doesn't have frame-reordering typically caused by B-Frames, because they wanted to keep that complexity out. Instead it has ref frames which are not displayed and discarded by the decoder.. oh well.
Is there any encoder that actually does all of this?

One of the great features of B-frames is that they are skippable frames when seeking during random access, which makes things a lot faster. If there was an encoder that could make "skippable" frames and flag them in a way that a decoder could know what isn't needed to be decoded, that could make for much better random access with VP9.

Given that multithreaded decode of each VP9 frame doesn't look that feasible, having to decode each frame serially would be seriously painful. Imagine trying to get to frame 235 in a 240 frame GOP using an ARM software decoder...
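
A back-of-the-envelope Python sketch of the seek cost described above. It assumes a hypothetical two-layer structure in which only frames whose index is a multiple of `step` are referenced by other frames, and it ignores GOP boundaries. The frame numbering and the function name are illustrative and do not correspond to the output of any real encoder.

```python
def frames_to_decode(target, step=1):
    """Count frames decoded to show frame `target` in a GOP starting at 0.

    Assumes frames at multiples of `step` form a reference chain from the
    keyframe, and every other frame is predicted only from the nearest chain
    frame before and after it and is never referenced itself (so it can be
    skipped). step=1 models a purely serial chain.
    """
    if target < 0 or step < 1:
        raise ValueError("target must be >= 0 and step >= 1")
    # Chain frames from the keyframe up to the last one at or before target.
    chain = target // step + 1
    if target % step == 0:
        return chain
    # Add the next chain frame (the "future" reference) and the target itself.
    return chain + 2


serial = frames_to_decode(235, step=1)   # every frame depends on the previous one
layered = frames_to_decode(235, step=8)  # 7 of every 8 frames are skippable
```

Under these assumptions the serial case decodes 236 frames to reach frame 235, while the layered case decodes 32.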
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
23rd April 2014, 13:15   #7
kidmany2001
cookieman

Join Date: Mar 2014
Posts: 6
Thanks for the responses.
Before I jump to conclusions on this issue, I need to clear up some questions about VP9's temporal prediction structure.

The questions:
1. The typical HEVC test scenarios are low delay P, low delay B, and random access.
Can we emulate those HEVC scenarios (inter prediction structures) as closely as possible in VP9?
And which parameters control the temporal prediction structure?
(I think lag-in-frames, arnr-maxframes, and arnr-strength are the ones to look at.
Does anyone have information on these?)


2. We already know VP9's compound mode is similar to B-frames.
Does VP9 still have the concept of a P-frame?
What exactly is VP9's temporal prediction structure: flat or hierarchical?

3. Can we set VP9's temporal prediction to encode IBBBBBB or IPPPPPPP specifically?
kf-max-dist sets the I-frame interval, but can we choose whether a frame is a P-frame or a B-frame?

The parameter guide does a poor job of explaining the implementation.

More detailed explanations are welcome.

I hope more discussion will help clear up the doubts.
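
For anyone who wants to experiment with the parameters named above, here is a hedged Python sketch that only builds `vpxenc` command lines and does not run them. The flags are existing `vpxenc` options, but the mapping to "low delay" and "random access" styles and the specific values (25, 7, 5, 240) are illustrative assumptions, not a reproduction of the HEVC test conditions. None of these flags assigns a P or B type to individual frames.

```python
def vpxenc_args(src, dst, low_delay, keyint=240):
    """Build (but do not run) a vpxenc argument list for VP9."""
    args = ["vpxenc", "--codec=vp9", f"--kf-max-dist={keyint}"]
    if low_delay:
        # No lookahead and no hidden alt-ref frames: the encoder can only
        # predict from frames it has already seen.
        args += ["--lag-in-frames=0", "--auto-alt-ref=0"]
    else:
        # Lookahead lets the encoder insert hidden alt-ref frames built from
        # future content; the ARNR options tune the filtering of those frames.
        args += ["--lag-in-frames=25", "--auto-alt-ref=1",
                 "--arnr-maxframes=7", "--arnr-strength=5"]
    return args + ["-o", dst, src]


# Hypothetical file names; pass the list to subprocess.run() to encode.
low_delay_cmd = vpxenc_args("input.y4m", "low_delay.webm", low_delay=True)
lookahead_cmd = vpxenc_args("input.y4m", "lookahead.webm", low_delay=False)
```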

Thanks.
2nd May 2014, 16:52   #8
pieter3d
Registered User
 
Join Date: Jan 2013
Location: Santa Clara CA
Posts: 114
With the flexible 8-entry reference pool, you can set up B-pyramids for trick mode / fast forwarding. It all depends on your structure. And with hidden frames you can emulate the reordering behavior pretty closely. Compound prediction, aka bipred, is enabled at the frame level using a header flag. One of the requirements is that one or two of the 3 reference frames (last, golden, altref) are marked as future frames, indicating that their content is in the future.

Note that in H.264 and HEVC, frames are not really marked P or B; the type can change per slice. Also, B slices can still code blocks that are P-like, and there are no constraints on where the two reference frames are in time for bipred blocks.
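
A toy Python model of how a hidden frame in the 8-entry pool can stand in for reordering. The dictionary keys (`show`, `refresh`, `show_existing`) are simplified stand-ins for header fields, chosen for this sketch; this is not a decoder and the frame labels are made up.

```python
NUM_SLOTS = 8  # size of the reference pool mentioned above


def display_order(frames):
    """Replay frames in decode order and return labels in output order.

    Each frame is a dict with:
      label         - name used for illustration
      show          - whether the frame is output when decoded
      refresh       - slot indices (0..7) overwritten with this frame
      show_existing - optional slot index: output that slot, decode nothing
    """
    slots = [None] * NUM_SLOTS
    shown = []
    for f in frames:
        slot = f.get("show_existing")
        if slot is not None:
            if slots[slot] is None:
                raise ValueError(f"slot {slot} is empty")
            shown.append(slots[slot])
            continue
        for i in f.get("refresh", ()):
            slots[i] = f["label"]
        if f["show"]:
            shown.append(f["label"])
    return shown


decode_order = [
    {"label": "K0", "show": True, "refresh": range(NUM_SLOTS)},
    {"label": "A4", "show": False, "refresh": [2]},  # hidden future frame
    {"label": "P1", "show": True, "refresh": [0]},
    {"label": "P2", "show": True, "refresh": [0]},
    {"label": "P3", "show": True, "refresh": [0]},
    {"label": "show A4", "show_existing": 2},        # output the stored frame
]
output = display_order(decode_order)
```

A4 is decoded second but output last, so P1 to P3 could predict from it as a "future" reference even though the decoder never reorders its output.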
2nd May 2014, 18:40   #9
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
Originally Posted by pieter3d View Post
With the flexible 8-entry reference pool, you can set up B-pyramids for trick mode / fast forwarding. It all depends on your structure. And with hidden frames you can emulate the reordering behavior pretty closely. Compound prediction, aka bipred, is enabled at the frame level using a header flag. One of the requirements is that one or two of the 3 reference frames (last, golden, altref) are marked as future frames, indicating that their content is in the future.
That sounds promising. Is there some sort of frame header declaration about this stuff that a decoder could use to easily determine what frames are skippable?

Quote:
Note that in H.264 and HEVC, frames are not really marked P or B; the type can change per slice. Also, B slices can still code blocks that are P-like, and there are no constraints on where the two reference frames are in time for bipred blocks.
I'm assuming each frame is a single slice and no tiling for H.264 and HEVC for typical VOD use cases (random access doesn't matter so much in live). Would not having mixed slice types in a single frame have much of a potential impact on compression efficiency? Certainly with H.264 we got out of the habit of using slicing outside of Blu-ray and very low latency real-time encoding due to the compression efficiency hit.

WPP in HEVC is a far superior replacement to slices for multithreaded decoding, but I haven't really thought through how slices might be useful for other scenarios in HEVC.
2nd May 2014, 19:07   #10
pieter3d
Registered User
 
Join Date: Jan 2013
Location: Santa Clara CA
Posts: 114
Regarding hierarchical structures: I think you just have to buffer a handful of compressed frames, decode their uncompressed headers (very quick) and infer the reference structure. Then you know which frames can be dropped. As of today there isn't any kind of metadata to indicate this (e.g. temporal layer id), but something may yet be added to the container.
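
The header-scanning idea can be sketched roughly as follows in Python. The `refresh` and `refs` sets are simplified stand-ins for what a real parser would read from each uncompressed header; the function name and the example window are assumptions. Frames still sitting in a slot at the end of the window are treated as not droppable, since frames that have not been buffered yet might reference them.

```python
def droppable_frames(headers):
    """Return indices of buffered frames that nothing can depend on.

    headers is a list in decode order; each entry is a dict with
      refresh - set of slot indices this frame is stored into
      refs    - set of slot indices this frame predicts from
    """
    owner = {}      # slot index -> index of the frame currently stored there
    needed = set()  # frames referenced by some later frame in the window
    for idx, h in enumerate(headers):
        for slot in h["refs"]:
            if slot in owner:
                needed.add(owner[slot])
        for slot in h["refresh"]:
            owner[slot] = idx
    resident = set(owner.values())  # may be used by frames not yet seen
    return [i for i in range(len(headers))
            if i not in needed and i not in resident]


# Hypothetical 5-frame window.
window = [
    {"refresh": {0, 1, 2}, "refs": set()},  # 0: keyframe
    {"refresh": {1}, "refs": {0}},          # 1: stored in slot 1
    {"refresh": set(), "refs": {0, 1}},     # 2: stored nowhere
    {"refresh": {0}, "refs": {0, 1}},       # 3: stored in slot 0
    {"refresh": set(), "refs": {0}},        # 4: stored nowhere
]
skippable = droppable_frames(window)
```

Here frames 2 and 4 come out as skippable because they are never stored into the pool.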

Slices, and especially dependent slices + WPP, are very useful for low-latency video transmission (conferencing). This way you can transmit a picture row by row, using full slice encapsulation (the NAL unit).
2nd May 2014, 20:54   #11
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
Originally Posted by pieter3d View Post
Regarding hierarchical structures: I think you just have to buffer a handful of compressed frames, decode their uncompressed headers (very quick) and infer the reference structure. Then you know which frames can be dropped. As of today there isn't any kind of metadata to indicate this (e.g. temporal layer id), but something may yet be added to the container.
That could work, although it'd be a pain for streaming scenarios where you'd like to know what you could do with frames that haven't downloaded yet. A real moof would be nice.

Quote:
Slices, and especially dependent slices + WPP, are very useful for low-latency video transmission (conferencing). This way you can transmit a picture row by row, using full slice encapsulation (the NAL unit).
Yes, absolutely. But that's not a random access scenario anyway. The VPx codecs have always been less disadvantaged for videoconferencing type scenarios.

I feel blessed that I can mainly focus on non-realtime file-to-file encoding and VOD delivery these days. So many other problems don't apply so I can focus more on the (somehow still infinite) number of problems still remaining...