How is the H.264 quantization designed?

xxthink · 4th April 2009, 01:04

Would someone explain the principle behind the H.264 quantization design?
I can't understand why these quantization step sizes were chosen in H.264.

xxthink · 4th April 2009, 02:10

And can anyone explain why A(Q)*B(Q)*(G^2) = 2^(N+L), as shown on page 17 of the file at this URL? http://csie.ntut.edu.tw/labvsp/Chine....264%20AVC.ppt

Manao · 4th April 2009, 02:19

Have a look at that PDF

xxthink · 4th April 2009, 02:30

Quote:

Originally Posted by Manao

Have a look at that PDF

I have read this PDF file. I have some questions about it.
First, why is the scale constant for the quantization 2^15 (equation 1), and why is the inverse quantization scale factor 2^6 (equation 2)?
Second, why is the ratio between successive quantization step sizes 1.2246...?

akupenguin · 4th April 2009, 07:26

Quote:

Originally Posted by xxthink

First, why is the scale constant for the quantization 2^15 (equation 1), and why is the inverse quantization scale factor 2^6 (equation 2)?

There is no "equation 1" or "equation 2" in that pdf.

Quote:

Second, why is the ratio between successive quantization step sizes 1.2246...?

The ratio is 1.12246 = 2^(1/6).
There's nothing special about 6, but the reason to pick a small integer root of 2 is so that a codec only needs to contain a small table of scaling factors, and can compute the rest of the quantizers with bitshifts.
(I would have picked 2^(1/8) to simplify the modulus operation too.)
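
To make the point concrete, here is a minimal sketch in C. It assumes only that the step size grows by exactly 2^(1/6) per QP increment; the function names are made up for illustration, and the period-8 variant is the hypothetical alternative mentioned above, not anything H.264 does.

```c
/* 2^(1/6): assumed per-QP growth factor of the quantization step size. */
#define STEP_RATIO 1.122462048309373

/* Step size relative to QP 0, built up by repeated multiplication. */
static double relative_step(int qp)
{
    double step = 1.0;
    for (int i = 0; i < qp; i++)
        step *= STEP_RATIO;
    return step;
}

/* Period 6 (as in H.264): the table index and shift need a modulo and a divide. */
static void split_qp_period6(int qp, int *index, int *shift)
{
    *index = qp % 6;
    *shift = qp / 6;
}

/* Hypothetical period 8 (a 2^(1/8) ratio): both reduce to a mask and a shift. */
static void split_qp_period8(int qp, int *index, int *shift)
{
    *index = qp & 7;
    *shift = qp >> 3;
}
```

Calling relative_step(6) gives a value that is 2 up to floating-point rounding, which is why six base entries plus a shift are enough to cover every QP.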

xxthink · 4th April 2009, 07:57

Quote:

Originally Posted by akupenguin

There is no "equation 1" or "equation 2" in that pdf.

Here is the PDF file: http://www.vcodex.com/files/H264_4x4...aper_Apr09.pdf

Quote:

Originally Posted by akupenguin

The ratio is 1.12246 = 2^(1/6).
There's nothing special about 6, but the reason to pick a small integer root of 2 is so that a codec only needs to contain a small table of scaling factors, and can compute the rest of the quantizers with bitshifts.
(I would have picked 2^(1/8) to simplify the modulus operation too.)

I think the quantization step size should be designed according to the characteristics (the probability density function) of the DCT coefficients. But I don't know the reasoning behind the choice made in H.264.

akupenguin · 4th April 2009, 08:46

Quote:

Originally Posted by xxthink

http://www.vcodex.com/files/H264_4x4_transform_whitepaper_Apr09.pdf

2^15 was chosen as the largest scale such that Mf fits in 16 bits. Since Mf is used only in the encoder, an implementation that can handle 32-bit math could use a larger scale for slightly more precision in quantization. But any such improvement would be tiny compared to the other things you can do to improve quantization, such as trellis quantization.
2^6 was chosen as the largest scale such that all the intermediate values during the computation of the IDCT fit in 16 bits.
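
These two scales also tie back to the relation A(Q)*B(Q)*(G^2) = 2^(N+L) asked about earlier in the thread, with N = 15 and L = 6. The sketch below checks it numerically. The V table is the dequantization table quoted in this thread. The Mf values are recalled from the reference encoder tables, and the gains G^2 = 16, 20, 25 for the three coefficient classes are likewise an assumption not stated in this thread, so treat both as illustrative rather than authoritative.

```c
#include <stdint.h>
#include <stdlib.h>

/* Dequantization multipliers V for QP%6 = 0..5, three coefficient classes. */
static const int32_t V[6][3] = {
    {10, 13, 16}, {11, 14, 18}, {13, 16, 20},
    {14, 18, 23}, {16, 20, 25}, {18, 23, 29}
};

/* Encoder-side multipliers Mf at scale 2^15 (assumed values, see lead-in). */
static const int32_t MF[6][3] = {
    {13107, 8066, 5243}, {11916, 7490, 4660}, {10082, 6554, 4194},
    { 9362, 5825, 3647}, { 8192, 5243, 3355}, { 7282, 4559, 2893}
};

/* Assumed combined forward/inverse transform gain G^2 per coefficient class. */
static const int32_t G2[3] = {16, 20, 25};

/* Largest absolute deviation of Mf*V*G^2 from 2^(15+6) over all 18 entries. */
static int32_t max_deviation(void)
{
    const int32_t target = (int32_t)1 << 21;
    int32_t worst = 0;
    for (int r = 0; r < 6; r++) {
        for (int c = 0; c < 3; c++) {
            int32_t d = abs(MF[r][c] * V[r][c] * G2[c] - target);
            if (d > worst)
                worst = d;
        }
    }
    return worst;
}

/* Returns 1 if every Mf entry fits in a signed 16-bit integer. */
static int mf_fits_in_int16(void)
{
    for (int r = 0; r < 6; r++)
        for (int c = 0; c < 3; c++)
            if (MF[r][c] > INT16_MAX)
                return 0;
    return 1;
}
```

Under these assumptions every product lands within a small fraction of a percent of 2^21. The relation holds only approximately because both tables are rounded to integers.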

Quote:

I think the quantization step size should be designed according to the characteristics (the probability density function) of the DCT coefficients. But I don't know the reasoning behind the choice made in H.264.

The distribution of coefficient values affects the optimal assignment of dequantized levels to quantized levels, for any given QP. It does not affect the optimal ratio between two QPs. H.264 chose not to do anything fancy with the distribution of dequantized levels, because equally spaced levels can be decoded with a single integer multiply, whereas anything fancy would be slower.

xxthink · 4th April 2009, 09:54

Quote:

Originally Posted by akupenguin

There is no "equation 1" or "equation 2" in that pdf.

The ratio is 1.12246 = 2^(1/6).
There's nothing special about 6, but the reason to pick a small integer root of 2 is so that a codec only needs to contain a small table of scaling factors, and can compute the rest of the quantizers with bitshifts.
(I would have picked 2^(1/8) to simplify the modulus operation too.)

Why would choosing 6 give a small table of scaling factors? The scaling table is determined by the total number of quantization parameters and the norm of the transform.

akupenguin · 4th April 2009, 10:09

Quote:

Originally Posted by xxthink

Why would choosing 6 give a small table of scaling factors? The scaling table is determined by the total number of quantization parameters and the norm of the transform.

The dequantization scaling table for 4x4 transform with no CQM (keeping only the 3 different values per QP, with the understanding that they'll be rearranged into 16 for actual use) is:

Code:

QP0:  10, 13, 16
QP1:  11, 14, 18
QP2:  13, 16, 20
QP3:  14, 18, 23
QP4:  16, 20, 25
QP5:  18, 23, 29 
QP6:  20, 26, 32
QP7:  22, 28, 36
QP8:  26, 32, 40
QP9:  28, 36, 46
QP10: 32, 40, 50
QP11: 36, 46, 58 
QP12: 40, 52, 64
... up to QP51

Note that QP6 is QP0*2, and so on. So the codec can store just the first 6 rows of that table, and compute the rest as table[QP%6]<<(QP/6). (integer math, C syntax).
Now that does involve some more arithmetic, so a programmer could choose to keep the whole table, i.e. spend a little memory to save a few cpu cycles. But if the dequant function didn't involve integer powers of 2, then you wouldn't even have a choice.
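
A minimal sketch of that lookup in C, using the six base rows from the table above. The function and array names are made up for illustration, and the final rounding shift that follows the inverse transform in a real decoder is left out.

```c
#include <stdint.h>

/* First six rows of the dequantization table (QP 0..5), three classes each. */
static const uint8_t dequant_base[6][3] = {
    {10, 13, 16}, {11, 14, 18}, {13, 16, 20},
    {14, 18, 23}, {16, 20, 25}, {18, 23, 29}
};

/* Scale factor for a given QP (0..51) and coefficient class (0..2). */
static int dequant_scale(int qp, int cls)
{
    return dequant_base[qp % 6][cls] << (qp / 6);
}

/* Reconstruct a coefficient from its quantized level: one integer multiply. */
static int dequantize(int level, int qp, int cls)
{
    return level * dequant_scale(qp, cls);
}
```

For example, dequant_scale(12, 2) gives 16 << 2 = 64, which matches the QP12 row above. A programmer who prefers speed over memory could precompute all 52 rows once with the same function.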

xxthink · 4th April 2009, 10:25

Quote:

Originally Posted by akupenguin

The dequantization scaling table for 4x4 transform with no CQM (keeping only the 3 different values per QP, with the understanding that they'll be rearranged into 16 for actual use) is:

Code:

QP0:  10, 13, 16
QP1:  11, 14, 18
QP2:  13, 16, 20
QP3:  14, 18, 23
QP4:  16, 20, 25
QP5:  18, 23, 29 
QP6:  20, 26, 32
QP7:  22, 28, 36
QP8:  26, 32, 40
QP9:  28, 36, 46
QP10: 32, 40, 50
QP11: 36, 46, 58 
QP12: 40, 52, 64
... up to QP51

Note that QP6 is QP0*2, and so on. So the codec can store just the first 6 rows of that table, and compute the rest as table[QP%6]<<(QP/6). (integer math, C syntax).
Now that does involve some more arithmetic, so a programmer could choose to keep the whole table, i.e. spend a little memory to save a few cpu cycles. But if the dequant function didn't involve integer powers of 2, then you wouldn't even have a choice.

I agree with your explanation. But I would like to know why the JVT selected 2^(1/6). Why not 2^(1/8) or something else?
In the early stages of H.264, QP ranged from 0 to 31. Now QP ranges from 0 to 51. I don't know why.

Saturnist · 3rd April 2015, 16:33

Hi, I have been looking into DCT coefficients and I am wondering: why do we set the same values for the intra/inter chroma coefficients as for the intra/inter luma coefficients in the H.264 codec?

For example, I have modified the flat matrix to this:

Quote:

INTRA4X4_LUMA =
16,16,15,14,
15,16,12,16,
16,16,16,16,
15,14,16,15

INTER4X4_LUMA =
16,16,16,16,
16,15,16,16,
16,15,16,16,
16,16,16,16

And why do we give the same values to the chroma coefficients:

Quote:

INTRA4X4_CHROMAU =
16,16,15,14,
15,16,12,16,
16,16,16,16,
15,14,16,15

INTRA4X4_CHROMAV =
16,16,15,14,
15,16,12,16,
16,16,16,16,
15,14,16,15

INTER4X4_CHROMAU =
16,16,16,16,
16,15,16,16,
16,15,16,16,
16,16,16,16

INTER4X4_CHROMAV =
16,16,16,16,
16,15,16,16,
16,15,16,16,
16,16,16,16

Any ideas?

Asmodian · 4th April 2015, 00:47

H.264 can change the quantizer independently for luma and chroma, so why would you not use the same matrix for both?
