Old 4th April 2009, 01:04   #1  |  Link
xxthink
Registered User
 
Join Date: Jan 2009
Posts: 25
How is H.264 quantization designed?

Could someone explain the principle behind the H.264 quantization design?
I can't understand why these particular quantization step sizes were chosen in H.264.
Old 4th April 2009, 02:10   #2  |  Link
xxthink
Registered User
 
Join Date: Jan 2009
Posts: 25
Also, can anyone explain why A(Q)*B(Q)*(G^2) = 2^(N+L), as shown on page 17 of the file at this URL? http://csie.ntut.edu.tw/labvsp/Chine....264%20AVC.ppt
Old 4th April 2009, 02:19   #3  |  Link
Manao
Registered User
 
Join Date: Jan 2002
Location: France
Posts: 2,856
Have a look at that PDF
Old 4th April 2009, 02:30   #4  |  Link
xxthink
Registered User
 
Join Date: Jan 2009
Posts: 25
Quote:
Originally Posted by Manao View Post
Have a look at that PDF
I have read this PDF file, and I have some questions about it.
First, why is the scale constant for quantization 2^15 (equation 1), and why is the inverse quantization scale factor 2^6 (equation 2)?
Second, why is the ratio between successive quantization step sizes 1.2246...?
Old 4th April 2009, 07:26   #5  |  Link
akupenguin
x264 developer
 
Join Date: Sep 2004
Posts: 2,392
Quote:
Originally Posted by xxthink View Post
First, why is the scale constant for quantization 2^15 (equation 1), and why is the inverse quantization scale factor 2^6 (equation 2)?
There is no "equation 1" or "equation 2" in that pdf.
Quote:
Second, why is the ratio between successive quantization step sizes 1.2246...?
The ratio is 1.12246 = 2^(1/6).
There's nothing special about 6, but the reason to pick a small integer root of 2 is so that a codec only needs to contain a small table of scaling factors, and can compute the rest of the quantizers with bitshifts.
(I would have picked 2^(1/8) to simplify the modulus operation too.)
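To make the remark about the modulus concrete: with a ratio of 2^(1/6), six QP steps double the step size, so a codec splits QP into a table row (QP % 6) and a shift (QP / 6). The sketch below contrasts that with a hypothetical 2^(1/8) scheme, where the same split needs only a mask and a shift. The function names are made up for illustration and are not taken from any codec.

```c
#include <assert.h>

/* Step size doubles every 6 QPs, as in H.264:
   row selects the table entry, shift is the number of doublings. */
void split_qp_mod6(int qp, int *row, int *shift)
{
    assert(qp >= 0);
    *row   = qp % 6;   /* needs an integer division (or a lookup table) */
    *shift = qp / 6;
}

/* Hypothetical alternative: step size doubles every 8 QPs (ratio 2^(1/8)).
   For non-negative qp, the modulus and division reduce to bit operations. */
void split_qp_mod8(int qp, int *row, int *shift)
{
    assert(qp >= 0);
    *row   = qp & 7;   /* same as qp % 8 */
    *shift = qp >> 3;  /* same as qp / 8 */
}
```

For QP 13, the mod-6 split gives row 1 and shift 2, while the hypothetical mod-8 split gives row 5 and shift 1.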
Old 4th April 2009, 07:57   #6  |  Link
xxthink
Registered User
 
Join Date: Jan 2009
Posts: 25
Quote:
Originally Posted by akupenguin View Post
There is no "equation 1" or "equation 2" in that pdf.
Here is the PDF file: http://www.vcodex.com/files/H264_4x4...aper_Apr09.pdf

Quote:
Originally Posted by akupenguin View Post
The ratio is 1.12246 = 2^(1/6).
There's nothing special about 6, but the reason to pick a small integer root of 2 is so that a codec only needs to contain a small table of scaling factors, and can compute the rest of the quantizers with bitshifts.
(I would have picked 2^(1/8) to simplify the modulus operation too.)
I think the quantization step size should be designed according to the characteristics (the probability density function) of the DCT coefficients. But I don't know the reasoning behind H.264's choice.
Old 4th April 2009, 08:46   #7  |  Link
akupenguin
x264 developer
 
Join Date: Sep 2004
Posts: 2,392
Quote:
Originally Posted by xxthink View Post
http://www.vcodex.com/files/H264_4x4_transform_whitepaper_Apr09.pdf
2^15 was chosen as the largest scale such that Mf fits in 16 bits. Since Mf is used only in the encoder, an implementation that can handle 32-bit math could use a larger scale for slightly more precision in quantization. But any such improvement would be tiny compared to the other things you can do to improve quantization, such as trellis.
2^6 was chosen as the largest scale such that all the intermediate values during the computation of the IDCT fit in 16 bits.
Quote:
I think the quantization step size should be designed according to the characteristics (the probability density function) of the DCT coefficients. But I don't know the reasoning behind H.264's choice.
The distribution of coefficient values affects the optimal assignment of dequantized levels to quantized levels, for any given QP. It does not affect the optimal ratio between two QPs. H.264 chose not to do anything fancy with the distribution of dequantized levels, because equally spaced levels can be decoded with a single integer multiply, whereas anything fancy would be slower.
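The two scales above tie back to the identity asked about in post #2, A(Q)*B(Q)*G^2 = 2^(N+L), if one reads N = 15, L = 6, A as the encoder multiplier Mf and B as the dequantization scale V. The sketch below is only an illustration under those assumptions: the V values are the ones listed in the table in post #9 of this thread, the G^2 values (16, 20, 25) are an assumption derived from the norms of the 4x4 transform basis and are not stated in the thread, and the names are made up.

```c
#include <assert.h>
#include <stdint.h>

enum { QUANT_BITS = 15, DEQUANT_BITS = 6 };

/* Dequantization scales V for QP 0..5 (three distinct values per QP). */
const int dequant_v[6][3] = {
    {10, 13, 16}, {11, 14, 18}, {13, 16, 20},
    {14, 18, 23}, {16, 20, 25}, {18, 23, 29},
};

/* Assumed G^2 for each of the three columns above (not from the thread). */
const int norm_g2[3] = {16, 20, 25};

/* Encoder multiplier chosen so that Mf * V * G^2 is as close as possible
   to 2^(15+6): Mf = round(2^21 / (V * G^2)). */
uint16_t quant_mf(int qp_rem, int col)
{
    assert(qp_rem >= 0 && qp_rem < 6 && col >= 0 && col < 3);
    const int32_t total = INT32_C(1) << (QUANT_BITS + DEQUANT_BITS);
    const int32_t denom = dequant_v[qp_rem][col] * norm_g2[col];
    const int32_t mf = (total + denom / 2) / denom;  /* round to nearest */
    assert(mf <= UINT16_MAX);                        /* fits in 16 bits */
    return (uint16_t)mf;
}
```

Under these assumptions, quant_mf(0, 0) evaluates to 13107 and quant_mf(4, 0) to 8192, and every entry for QP 0..5 stays well inside a 16-bit integer.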
Old 4th April 2009, 09:54   #8  |  Link
xxthink
Registered User
 
Join Date: Jan 2009
Posts: 25
Quote:
Originally Posted by akupenguin View Post
There is no "equation 1" or "equation 2" in that pdf.

The ratio is 1.12246 = 2^(1/6).
There's nothing special about 6, but the reason to pick a small integer root of 2 is so that a codec only needs to contain a small table of scaling factors, and can compute the rest of the quantizers with bitshifts.
(I would have picked 2^(1/8) to simplify the modulus operation too.)
Why would 6 make the table of scaling factors small? The scaling table is determined by the total number of quantization parameters and the norm of the transform.

Last edited by xxthink; 5th April 2009 at 01:41.
Old 4th April 2009, 10:09   #9  |  Link
akupenguin
x264 developer
 
Join Date: Sep 2004
Posts: 2,392
Quote:
Originally Posted by xxthink View Post
Why would 6 make the table of scaling factors small? The scaling table is determined by the total number of quantization parameters and the norm of the transform.
The dequantization scaling table for 4x4 transform with no CQM (keeping only the 3 different values per QP, with the understanding that they'll be rearranged into 16 for actual use) is:
Code:
QP0:  10, 13, 16
QP1:  11, 14, 18
QP2:  13, 16, 20
QP3:  14, 18, 23
QP4:  16, 20, 25
QP5:  18, 23, 29 
QP6:  20, 26, 32
QP7:  22, 28, 36
QP8:  26, 32, 40
QP9:  28, 36, 46
QP10: 32, 40, 50
QP11: 36, 46, 58 
QP12: 40, 52, 64
... up to QP51
Note that QP6 is QP0*2, and so on. So the codec can store just the first 6 rows of that table, and compute the rest as table[QP%6]<<(QP/6). (integer math, C syntax).
Now that does involve some more arithmetic, so a programmer could choose to keep the whole table, i.e. spend a little memory to save a few CPU cycles. But if the dequant function didn't involve integer powers of 2, then you wouldn't even have a choice.
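The table[QP%6]<<(QP/6) expression can be written out as a small C function. This is a minimal sketch using only the six base rows from the table above; the names dequant_base and dequant_scale are hypothetical, and the column index stands for the three distinct values per QP before they are rearranged into the 16 positions of a 4x4 block.

```c
#include <assert.h>

/* First six rows of the dequantization table (QP 0..5). */
const int dequant_base[6][3] = {
    {10, 13, 16}, {11, 14, 18}, {13, 16, 20},
    {14, 18, 23}, {16, 20, 25}, {18, 23, 29},
};

/* Scale for any QP in 0..51: pick the row by QP % 6, then double it
   once for every full group of 6 QPs. */
int dequant_scale(int qp, int col)
{
    assert(qp >= 0 && qp <= 51);
    assert(col >= 0 && col < 3);
    return dequant_base[qp % 6][col] << (qp / 6);
}
```

For example, dequant_scale(6, 1) returns 26 and dequant_scale(12, 2) returns 64, matching the QP6 and QP12 rows listed above.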

Last edited by akupenguin; 4th April 2009 at 10:14.
Old 4th April 2009, 10:25   #10  |  Link
xxthink
Registered User
 
Join Date: Jan 2009
Posts: 25
I agree with your explanation. But I would like to know why the JVT selected 2^(1/6). Why not 2^(1/8) or some other value?
In the early stages of H.264, QP ranged from 0 to 31. Now QP ranges from 0 to 51. I don't know why.
Old 3rd April 2015, 16:33   #11  |  Link
Saturnist
Registered User
 
Join Date: Dec 2014
Posts: 2
Chroma values in DCT

Hi, I have been looking into DCT coefficients and I am wondering: why do we set the same values for the intra/inter chroma coefficients as for the intra/inter luma coefficients in the H.264 codec?

For example, I have modified the flat matrix to this:

Quote:
INTRA4X4_LUMA =
16,16,15,14,
15,16,12,16,
16,16,16,16,
15,14,16,15

INTER4X4_LUMA =
16,16,16,16,
16,15,16,16,
16,15,16,16,
16,16,16,16
And why do we give the same values to the chroma coefficients:

Quote:
INTRA4X4_CHROMAU =
16,16,15,14,
15,16,12,16,
16,16,16,16,
15,14,16,15

INTRA4X4_CHROMAV =
16,16,15,14,
15,16,12,16,
16,16,16,16,
15,14,16,15

INTER4X4_CHROMAU =
16,16,16,16,
16,15,16,16,
16,15,16,16,
16,16,16,16

INTER4X4_CHROMAV =
16,16,16,16,
16,15,16,16,
16,15,16,16,
16,16,16,16
Any ideas?
Old 4th April 2015, 00:47   #12  |  Link
Asmodian
Registered User
 
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,407
H.264 can change the quantizer independently for luma and chroma, so why would you not use the same matrix for both?
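As a rough illustration of what the matrix entries in the question do, the sketch below assumes the usual H.264 convention that a weight of 16 is neutral (the flat matrix) and that each weight multiplies the base dequantization scale at its position, with the extra factor of 16 shifted out later by the decoder. The function names are hypothetical. The overall luma/chroma balance is a separate control (the chroma QP offset), which is why the same per-frequency weights can reasonably be shared.

```c
#include <assert.h>

/* Combined scale for one coefficient position: base dequant scale
   times the scaling-matrix weight. A weight of 16 is the flat default. */
int level_scale(int base_scale, int weight)
{
    assert(weight >= 1 && weight <= 255);
    return base_scale * weight;
}

/* Step size at this position relative to the flat matrix, in percent
   (integer division, so the result is truncated). */
int relative_step_percent(int weight)
{
    assert(weight >= 1 && weight <= 255);
    return weight * 100 / 16;
}
```

With the modified matrix above, the entry 12 gives relative_step_percent(12) == 75, i.e. that frequency is quantized with about three quarters of the flat step size, while entries left at 16 are unchanged.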