Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
15th May 2008, 07:47 | #21 | Link | ||
DivX Team
Join Date: Oct 2001
Location: San Diego, CA
Posts: 24
|
The decoder is part of a bigger source tree; there was no effort to "prune" dead code for this release. You could be seeing some assembly that is not used by the H.264 decoder. For example, the ASP encoder does use floating point. That should cover most instances of 'emms'. But you have a good point. |
||
15th May 2008, 07:52 | #22 | Link | ||
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Here's the code I found:
Code:
1008962e: 55                 push   %ebp
1008962f: 89 e5              mov    %esp,%ebp
10089631: 81 ec 08 00 00 00  sub    $0x8,%esp
10089637: 89 7d f8           mov    %edi,0xfffffff8(%ebp)
1008963a: 31 c9              xor    %ecx,%ecx
1008963c: 8b 45 08           mov    0x8(%ebp),%eax
1008963f: 8b 7d 0c           mov    0xc(%ebp),%edi
10089642: 01 f8              add    %edi,%eax
10089644: 0f 6f 00           movq   (%eax),%mm0
10089647: 0f 6f 0c 38        movq   (%eax,%edi,1),%mm1
1008964b: 0f f6 c1           psadbw %mm1,%mm0
1008964e: 0f 7e c2           movd   %mm0,%edx
10089651: 01 d1              add    %edx,%ecx
10089653: 8d 04 78           lea    (%eax,%edi,2),%eax
10089656: 0f 6f 00           movq   (%eax),%mm0
10089659: 0f f6 c8           psadbw %mm0,%mm1
1008965c: 0f 7e ca           movd   %mm1,%edx
1008965f: 01 d1              add    %edx,%ecx
10089661: 01 f8              add    %edi,%eax
10089663: 0f 6f 08           movq   (%eax),%mm1
10089666: 0f f6 c1           psadbw %mm1,%mm0
10089669: 0f 7e c2           movd   %mm0,%edx
1008966c: 01 d1              add    %edx,%ecx
1008966e: 0f 6f 04 38        movq   (%eax,%edi,1),%mm0
10089672: 0f 6f 0c 78        movq   (%eax,%edi,2),%mm1
10089676: 0f f6 c1           psadbw %mm1,%mm0
10089679: 0f 7e c2           movd   %mm0,%edx
1008967c: 01 d1              add    %edx,%ecx
1008967e: 8d 04 78           lea    (%eax,%edi,2),%eax
10089681: 0f 6f 04 38        movq   (%eax,%edi,1),%mm0
10089685: 0f f6 c8           psadbw %mm0,%mm1
10089688: 0f 7e ca           movd   %mm1,%edx
1008968b: 01 d1              add    %edx,%ecx
1008968d: 0f 6f 0c 78        movq   (%eax,%edi,2),%mm1
10089691: 0f f6 c1           psadbw %mm1,%mm0
10089694: 0f 7e c2           movd   %mm0,%edx
10089697: 01 d1              add    %edx,%ecx
10089699: 31 c0              xor    %eax,%eax
1008969b: 8b 55 10           mov    0x10(%ebp),%edx
1008969e: d1 e2              shl    %edx
100896a0: 39 d1              cmp    %edx,%ecx
100896a2: 0f 9e c0           setle  %al
100896a5: 0f 77              emms
100896a7: 8b 7d f8           mov    0xfffffff8(%ebp),%edi
100896aa: 89 ec              mov    %ebp,%esp
100896ac: 5d                 pop    %ebp
100896ad: c3                 ret
Code:
cglobal vsad, 2,3
    lea    r2, [r1*3]
    movq   mm0, [r0]
    movq   mm1, [r0+r1]
    movq   mm2, [r0+r1*2]
    movq   mm3, [r0+r2]
    lea    r0, [r0+r1*4]
    movq   mm4, [r0]
    movq   mm5, [r0+r1]
    movq   mm6, [r0+r1*2]
    psadbw mm0, mm1
    psadbw mm1, mm2
    psadbw mm2, mm3
    psadbw mm3, mm4
    psadbw mm4, mm5
    psadbw mm5, mm6
    paddd  mm0, mm1
    paddd  mm2, mm3
    paddd  mm4, mm5
    paddd  mm0, mm2
    mov    r2, r2m
    paddd  mm0, mm4
    shl    r2
    xor    eax, eax
    movd   r1, mm0
    cmp    r1, r2
    setle  al
    ret
(This would be a whole lot easier if we could get an unstripped debug build, but like that'll ever happen...)
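For anyone following along without reading MMX, here is a scalar C sketch of what the disassembled routine above appears to compute. It is only a reading of the listing, not DivX source: the names `row_sad8` and `vsad_flat`, the parameter names, and the assumption that the three stack arguments are a pixel pointer, a row stride, and a threshold are all hypothetical. As the listing reads, it sums the 8-pixel SAD of row pairs (1,2), (2,3), (3,4), (5,6), (6,7), (7,8), skips the pair (4,5), and returns whether that sum is at most twice the threshold.

```c
#include <stdint.h>
#include <stdlib.h>

/* SAD of two 8-byte rows: the scalar equivalent of one psadbw on MMX registers. */
int row_sad8(const uint8_t *a, const uint8_t *b)
{
    int sum = 0;
    for (int x = 0; x < 8; x++)
        sum += abs(a[x] - b[x]);
    return sum;
}

/* Hypothetical name and signature, inferred from the stack arguments
 * at 0x8(%ebp), 0xc(%ebp) and 0x10(%ebp) in the listing. */
int vsad_flat(const uint8_t *p, int stride, int thresh)
{
    /* Upper row of each pair that the listing compares; the pair (4,5)
     * never appears there, so it is left out here as well. */
    static const int upper_row[6] = { 1, 2, 3, 5, 6, 7 };
    int sum = 0;

    for (int i = 0; i < 6; i++) {
        const uint8_t *row = p + upper_row[i] * stride;
        sum += row_sad8(row, row + stride);
    }

    /* shl %edx ; cmp %edx,%ecx ; setle %al  =>  sum <= 2 * thresh */
    return sum <= 2 * thresh;
}
```

The second listing computes the same kind of sum over six consecutive row pairs starting at row 0, so the two are not row-for-row identical; the sketch follows the first listing only.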
__________________
Follow x264 development progress | akupenguin quotes | x264 git status ffmpeg and x264-related consulting/coding contracts | Doom10 Last edited by Dark Shikari; 15th May 2008 at 07:57. |
||
15th May 2008, 08:25 | #23 | Link | |
DivX Team
Join Date: Oct 2001
Location: San Diego, CA
Posts: 24
|
Okay. That code is very old and it is part of ASP deblocking.
BTW do you have any lossless files we could test? Last edited by sparky; 15th May 2008 at 08:32. |
|
15th May 2008, 08:45 | #25 | Link |
Registered User
Join Date: Nov 2002
Location: San Diego, CA
Posts: 936
|
Hey guys,
Although I don't like to do so, I feel I do need to point out that the license with this software does not allow reversing, decompiling, disassembling, and so forth. Dark Shikari: Your comments are welcome, but let's avoid a public disassembly of the binaries (I do realize we asked you to post a code snippet earlier). You can contact sparky by PM or e-mail for lower-level development issues if you like. Thanks for your understanding |
15th May 2008, 08:54 | #27 | Link | |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Of course, prohibiting "disassembly" in a license is not merely completely unenforceable (both legally and practically) but totally silly given how trivial disassembly is (objdump -d). And if you think that people actually obey rules about not disassembling programs, well, you might want to look at certain striking similarities between some code in CoreAVC, ffmpeg, and x264...
Good question. Support for FGM in DivX would pave the way for x264 support of FGM.
Last edited by Dark Shikari; 15th May 2008 at 09:03. |
|
15th May 2008, 09:33 | #28 | Link |
Registered User
Join Date: Nov 2002
Location: San Diego, CA
Posts: 936
|
Thanks Dark Shikari. Instead of discussing the intricacies of disassembly and software licenses, let me just say that this will save me hours of thoroughly entertaining chit chat with our legal team in the morning.
Gabriel: We don't have FGM yet. Perhaps a discussion of features the x264 team is interested in working on and the order that they want to accomplish them might make for an interesting (but separate) thread. It's important to keep in mind that we're in the middle of preparing the future generation of DivX right now and we often can't elaborate too much on our roadmap ("Hey! Here's a super-fast H.264 decoder. Surprise!"), but there are times where we can aim to align our work to support other projects. If we can understand your priorities we can build a better codec. Thanks for your work on LAME btw.
__________________
DivX Plus Web Player 2.0 (MKV & AVI) (Embed generator) DivX H.264 Decoder with DXVA support Developer portal Last edited by DigitAl56K; 15th May 2008 at 09:52. |
15th May 2008, 15:17 | #29 | Link | |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Heh, after more glancing around the disassembly, my overall comment would be that if DivX has already managed to get faster than CoreAVC with this, let's just say they have a very wide margin in which to improve it. I suspect most of the efficiency here must be from overall code optimization rather than particularly ingenious ASM, which demonstrates yet again the sheer complexity of H.264 and how much good coding practices matter for performance. An oprofile of ffmpeg is a great way to see this; a massive amount of time is spent in many pure C functions (fill_caches, etc). Whatever design improvements DivX made in order to avoid a lot of this overhead: good job, the result is quite impressive.
The most important thing here is that there's finally a competitor for CoreAVC... perhaps this will force the Core guys to get working.
Edit: Perhaps FFDshow might have been used a bit too much as a model during DivX development...
Last edited by Dark Shikari; 15th May 2008 at 15:21. |
|
15th May 2008, 15:24 | #30 | Link |
CoreCodec Founder
Join Date: Oct 2001
Location: San Francisco
Posts: 1,421
|
Done.... CoreAVC 2.0 on the devel deck. But as I said in the other thread, it's good to see DivX with something 'new', and it's great to have competition; it only makes for better products.
__________________
Dan "BetaBoy" Marlin Ubiquitous Multimedia Technologies and Developer Tools http://corecodec.com |
15th May 2008, 16:11 | #31 | Link |
Turkey Machine
Join Date: Jan 2005
Location: Lowestoft, UK (but visit lots of places with bribes [beer])
Posts: 1,953
|
Wow! Nice going guys. Finally, an alternative to CoreAVC. How's interlacing handled? I know it's only been a day, but if it can handle all the streams bob0r keeps throwing at CoreAVC and failing, then you lot are clear winners.
__________________
On Discworld it is clearly recognized that million-to-one chances happen 9 times out of 10. If the hero did not overcome huge odds, what would be the point? Terry Pratchett - The Science Of Discworld |
15th May 2008, 17:04 | #32 | Link |
Registered User
Join Date: Jan 2004
Posts: 567
|
Got some PAFF samples from DVB broadcasts working through GraphStudio (with Haali Media Splitter). DVBViewer with its demuxer crashes for the time being...
__________________
Bye Last edited by CiNcH; 15th May 2008 at 17:32. |
15th May 2008, 17:15 | #33 | Link | ||
DivX Team
Join Date: Oct 2001
Location: San Diego, CA
Posts: 24
|
Last edited by sparky; 15th May 2008 at 17:44. |
||
15th May 2008, 19:27 | #34 | Link |
Turkey Machine
Join Date: Jan 2005
Location: Lowestoft, UK (but visit lots of places with bribes [beer])
Posts: 1,953
|
So it's just sheer luck that the decoder gives the same errors that ffdshow does? If so, would be good to kick errors on both sides.
15th May 2008, 19:58 | #36 | Link | |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
I'll probably go through it with oprofile in a few days to find what's actually used and see if it's really as bad as the initial unused functions demonstrated, or if those are just outliers. |
|
15th May 2008, 20:43 | #37 | Link |
Registered User
Join Date: Nov 2002
Location: San Diego, CA
Posts: 936
|
Once again, this is beta 1, i.e. not yet perfect! We will clean up the project sources as we move closer to a release.
I would look at it this way: despite the flaws you think it may have, if the decoder is already extremely fast and you believe it still has room for improvement, then that is a good thing. If you think the decoder could be more efficient, but it is already outperforming the other decoders, then they could be more efficient still. Maybe we can try to be less negative unless it's really warranted.
BTW - I've just gone through my PMs and a whole bunch of people should now have access to the download.
@BetaBoy: Agreed. The choice of two powerful H.264 decoders is better than one.
On DVBViewer: Seems to be a common problem there, we'll take a look at it. We have a few DVBViewer users in the Rémoulade group now. |
15th May 2008, 20:44 | #38 | Link | |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
I was surprised that it managed to be significantly faster than Core given the lack of optimization on many levels, which obviously means that if those missing optimizations are added, it'll be even better. And everyone benefits from that. |
|
Tags |
coreavc, divx, h264 decoder, remoulade |
|
|