Old 6th November 2019, 11:43   #1  |  Link
bassquake
Registered User
 
Join Date: Jan 2007
Posts: 39
How do I move inline assembly to external file?

I'm currently recompiling VirtualDub plugins and need to move the inline assembly code to an external asm file, but I don't know how to call it from the cpp.

As an example, here's a very simple bit of inline code:

Code:
void LUT_iSSE (Pixel32 *dst,int *LUT,int psize)
{
	__asm {
		mov     edi, [dst]
		mov     esi, [LUT]
		mov     ecx, [psize]

		align   16
	GLoop:
		mov     eax, [edi]
		xor     ebx, ebx
		mov     edx, eax
		mov     bl, ah
		and     edx, 0xff0000
		and     eax, 0xff
		shr     edx, 16
		movd    mm0, [esi + eax * 4 + (512 * 4)]
		prefetchnta [edi + 512]
		por     mm0, [esi + ebx * 4 + (256 * 4)]
		por     mm0, [esi + edx * 4]
		movd    [edi], mm0
		add     edi, 4
		dec     ecx
		jnz     GLoop
		emms
	}
}
Can someone post what I would put there instead and what the asm file should contain, or point me in the right direction online? I'm guessing extern "C" is involved somewhere!

Greatly appreciated!
Old 6th November 2019, 13:37   #2  |  Link
StainlessS
HeartlessS Usurer
 
Join Date: Dec 2009
Location: Over the rainbow
Posts: 7,266
I am very rusty on this stuff and not an Intel assembler guy, but maybe something like this:

myheader.h
Code:
#ifndef MYHEADER_H              // Avoid multiple inclusions (names with double underscores are reserved)
    #define MYHEADER_H

    #include <windows.h>          // and whatever else
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <math.h>             // etc

    // Other stuff ...

    extern "C" {
        void __stdcall LUT_iSSE (Pixel32 *dst,int *LUT,int psize);
    }

#endif // MYHEADER_H
asm.c
Code:
#include "myheader.h"

void  __stdcall  LUT_iSSE (Pixel32 *dst,int *LUT,int psize)
{
    __asm {
        mov     edi, [dst]
        mov     esi, [LUT]
        mov     ecx, [psize]

        align     16
        GLoop:      mov     eax, [edi]
                xor ebx, ebx
                mov     edx, eax
                mov     bl, ah
                and     edx, 0xff0000
                and eax, 0xff
                shr     edx, 16
                movd    mm0, [esi + eax * 4 + (512 * 4)]
                prefetchnta [edi + 512]
                por     mm0, [esi + ebx * 4 + (256 * 4)]
                por     mm0, [esi + edx * 4]
                movd    [edi], mm0
                add     edi, 4
                dec     ecx
                jnz     GLoop
                emms
    }
}
I think that the asm file should be using __stdcall, but I'm not sure.
Perhaps others will give better advice.

EDIT: One of the headers should define what Pixel32 is.
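A minimal sketch of how the declaration side could look, offered as an illustration rather than the plugin's actual header. Assumptions: Pixel32 is a 32-bit unsigned value (verify against your VirtualDub headers), and LUT_CALL, lut_routine and the pass-through body are hypothetical names made up for this example. extern "C" is C++-only syntax, so it is wrapped in a __cplusplus check to keep the header usable from a .c file as well. The calling convention is named in one macro because on 32-bit MSVC __cdecl and __stdcall produce different symbol names (_name versus _name@12 for three 4-byte arguments), and the declaration has to agree with whatever the asm side exports.

```cpp
#include <stdint.h>   // valid in both C and C++

// Assumption: a pixel is a 32-bit unsigned value; check your VirtualDub headers.
typedef uint32_t Pixel32;

// Hypothetical macro: name the calling convention in exactly one place.
// Only 32-bit MSVC distinguishes __cdecl from __stdcall; elsewhere it expands to nothing.
#if defined(_MSC_VER) && defined(_M_IX86)
#  define LUT_CALL __cdecl
#else
#  define LUT_CALL
#endif

#ifdef __cplusplus
extern "C" {          // no C++ name mangling: the linker looks for the plain C symbol
#endif

void LUT_CALL lut_routine(Pixel32 *dst, int *LUT, int psize);

#ifdef __cplusplus
}
#endif

// Stand-in definition so this sketch links on its own. In the real project this
// body would not exist in C/C++ at all: the symbol would come from the assembled file.
void LUT_CALL lut_routine(Pixel32 *dst, int *LUT, int psize)
{
    (void)dst; (void)LUT; (void)psize;   // placeholder: leaves the pixels untouched
}
```

If the asm side is changed to another convention, only the LUT_CALL line needs to follow it.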
__________________
I sometimes post sober.
StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace

"Some infinities are bigger than other infinities", but how many of them are infinitely bigger ???

Last edited by StainlessS; 6th November 2019 at 13:42.
Old 6th November 2019, 13:45   #3  |  Link
Groucho2004
Cantankerous Fossil
 
Join Date: Mar 2006
Location: A wretched hive of scum and villainy
Posts: 4,470
He wants to move it to an asm file, and there is no inline __asm block in an asm file. I recommend googling; there's tons of info on the subject. I think there's even a section on avisynth.nl.

Edit1: Here is the page on the wiki.
Edit2: Also, look at the code of plugins with external ASM modules.
__________________
Groucho's Avisynth Stuff

Last edited by Groucho2004; 6th November 2019 at 13:48.
Old 6th November 2019, 13:51   #4  |  Link
Groucho2004
Cantankerous Fossil
 
Join Date: Mar 2006
Location: A wretched hive of scum and villainy
Posts: 4,470
Quote:
Originally Posted by StainlessS View Post
I am very rusty on this stuff
In my case, MASM 5.1 was the last Assembler I worked with, very early 90's.
Old 6th November 2019, 13:51   #5  |  Link
StainlessS
HeartlessS Usurer
 
Join Date: Dec 2009
Location: Over the rainbow
Posts: 7,266
Thanks G2K4, here on Wiki, Separate assembly modules :- http://avisynth.nl/index.php/Filter_...ler_optimizing

EDIT:
Quote:
Originally Posted by Groucho2004 View Post
In my case, MASM 5.1 was the last Assembler I worked with, very early 90's.
Well, I have not done any Intel assembly at all. I am aware that there are different calling conventions (I think Pascal is one of them, apart from __stdcall),
but I have little knowledge beyond that.

Perhaps I might one day get into the intrinsics thing, but am put off by the whole menagerie of different CPU instruction requirements.

EDIT: Well, I did do a teensy-weensy bit of 8080 back in 1981 [Zilog Z80A was better].

Last edited by StainlessS; 7th November 2019 at 14:05.
Old 6th November 2019, 14:02   #6  |  Link
StainlessS
HeartlessS Usurer
 
Join Date: Dec 2009
Location: Over the rainbow
Posts: 7,266
G2K4, is the posted header part usable as is, though?
[of interest to both me and bassquake]

EDIT: I've never had to use extern "C", being only a C programmer; I think that it's a C++ thing.

Quote:
Also, look at the code of plugins with external ASM modules.
I nearly suggested the same: look at the code of VirtualDub plugins with external ASM modules; there are bound to be quite a few.

Last edited by StainlessS; 6th November 2019 at 14:06.
Old 6th November 2019, 14:06   #7  |  Link
Groucho2004
Cantankerous Fossil
 
Join Date: Mar 2006
Location: A wretched hive of scum and villainy
Posts: 4,470
Quote:
Originally Posted by StainlessS View Post
Perhaps I might one day get into the intrinsics thing, but am put off by the whole menagerie of different CPU instruction requirements.
Intrinsics seem to be the way to go for time-critical applications, but for what I'm writing nowadays plain C/C++ is sufficient.
Old 6th November 2019, 14:10   #8  |  Link
Groucho2004
Cantankerous Fossil
 
Join Date: Mar 2006
Location: A wretched hive of scum and villainy
Posts: 4,470
Quote:
Originally Posted by StainlessS View Post
G2K4, is the posted header part usable as is, though?
[of interest to both me and bassquake]
Dunno.
Old 6th November 2019, 14:14   #9  |  Link
StainlessS
HeartlessS Usurer
 
Join Date: Dec 2009
Location: Over the rainbow
Posts: 7,266
Good, we can Dunno together then

Bassquake, post how you get on please.
Old 6th November 2019, 14:15   #10  |  Link
Groucho2004
Cantankerous Fossil
 
Join Date: Mar 2006
Location: A wretched hive of scum and villainy
Posts: 4,470
Quote:
Originally Posted by StainlessS View Post
Good, we can Dunno together then
Yep, as usual.
Old 6th November 2019, 16:01   #11  |  Link
StainlessS
HeartlessS Usurer
 
Join Date: Dec 2009
Location: Over the rainbow
Posts: 7,266
VirtualDub2 on SourceForge: the hqdn3d plugin has asm, no idea if it's of use :- https://sourceforge.net/projects/vdf...files/plugins/

EDIT: It uses YASM assembler (which I think most/all VD stuff uses).

Last edited by StainlessS; 6th November 2019 at 16:04.
Old 6th November 2019, 17:40   #12  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 1,764
My plugins use external asm files, built with the assembler that ships with Visual Studio (MASM), so there is no need to install anything else; VS alone is enough.
You can check my GitHub for example code, for both x86 and x64 versions.
__________________
My github.
Old 7th November 2019, 11:52   #13  |  Link
bassquake
Registered User
 
Join Date: Jan 2007
Posts: 39
Quote:
Originally Posted by jpsdr View Post
My plugins use external asm files, built with the assembler that ships with Visual Studio (MASM), so there is no need to install anything else; VS alone is enough.
You can check my GitHub for example code, for both x86 and x64 versions.
Thanks for the pointer. I tried working from your code. I think I'm close, but I get these 4 errors in the asm.

The code in the cpp is:

Code:
extern "C" void lut_isse(Pixel32 *dst,int *LUT,int psize);
and the asm file is:

Code:
.586    
.mmx   
.xmm
.model flat, c                                           
.code      

lut_isse proc dst:dword,LUT:dword,psize:dword

public lut_isse

            mov     edi,[dst]
            mov     esi,[LUT]
            mov     ecx,[psize]

            align     16
GLoop:      mov     eax,[edi]
            xor     ebx,ebx
            mov     edx,eax
            mov     bl,ah
            and     edx,0xff0000	;A2206 missing operator in expression
            and     eax,0xff	;A2206 missing operator in expression
            shr     edx,16
            movd    mm0,[esi+eax*4+(512*4)]	;A2070 invalid instruction operands
            prefetchnta [edi+512]
            por     mm0,[esi+ebx*4+(256*4)]
            por     mm0,[esi+edx*4        ]
            movd    [edi],mm0	;A2070 invalid instruction operands
            add     edi,4
            dec     ecx
            jnz     GLoop
            emms

            ret
lut_isse endp

END
I've marked where and what the 4 errors are.

I don't know assembly and was hoping wouldn't have to rewrite any of it.

Last edited by bassquake; 7th November 2019 at 12:00.
Old 7th November 2019, 14:54   #14  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 1,764
You're not in C: MASM writes hex constants with an h suffix, not a 0x prefix.
Code:
and edx,0ff0000h
and eax,0ffh
Also, I think a hex constant always needs to begin with a digit. If I remember properly, this doesn't work (ffh is read as a symbol name):
Code:
and eax,ffh
but this will:
Code:
and eax,0ffh

For the two A2070 errors, try this:
Code:
movd    mm0,dword ptr[esi+eax*4+(512*4)]
....
movd   dword ptr[edi],mm0
I think the default transfer size comes from the register operand, so leaving out the ptr size may have the same effect as writing this:
Code:
movd   qword ptr[edi],mm0
Which is inconsistent of course: movd with a qword...

Also, I would write the beginning like this (but maybe it will produce the same result).
Code:
mov     edi,dst
mov     esi,LUT
mov     ecx,psize
Your code is 32-bit only, so I will just state the 32-bit rules.

The only registers you can alter without saving them are: eax, ecx, edx and all the mm* and xmm* registers.
If you change esi, edi, ebx, ebp or esp, you need to back them up and restore them.

Note that if you change ebp, you will no longer be able to do things like this afterwards:
Code:
mov     edi,dst
I think the assembler adds implicit save/restore code for ebp at the beginning and end of the function, and then uses ebp whenever you access the stack parameters.
This is what the .model flat, c and lut_isse proc ... lines are for.

So, the start of your function should be:
Code:
public lut_isse

push esi
push edi
push ebx

mov     edi,dst
...
and the end:
Code:
....
emms

pop ebx
pop edi
pop esi

ret
Also:
Code:
dec ecx
jnz GLoop
can be shortened to:
Code:
loop GLoop
(loop is shorter to write, though on many CPUs it runs slower than dec/jnz.)
I personally avoid "int", as its size can vary too much and there is no real "spec" for it.
I always use types whose size I'm sure of: uint32_t for a fixed size in both 32-bit and 64-bit builds, size_t for an unsigned type that is 32 bits in 32-bit builds and 64 bits in 64-bit builds, and ptrdiff_t for pointer offsets, as it adapts to 32/64 bits in the same way but is signed.
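To make the point about type sizes concrete, here is a small sketch (not taken from jpsdr's plugins; pixel_distance is a hypothetical helper invented for the example). The static_asserts state what the advice above relies on. The pointer-width checks hold on the usual flat-memory Windows and Linux targets but are not guaranteed by the C++ standard, so treat them as assumptions about the build target.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// uint32_t is exactly 32 bits wherever it exists, in 32-bit and 64-bit builds alike.
static_assert(sizeof(std::uint32_t) * CHAR_BIT == 32, "uint32_t must be 32 bits");

// Assumption about the target, not a language guarantee: size_t and ptrdiff_t
// are as wide as a pointer (true for common 32/64-bit Windows and Linux builds).
static_assert(sizeof(std::size_t) == sizeof(void *), "size_t is not pointer-sized here");
static_assert(sizeof(std::ptrdiff_t) == sizeof(void *), "ptrdiff_t is not pointer-sized here");

// size_t is unsigned, ptrdiff_t is signed.
static_assert(std::is_unsigned<std::size_t>::value, "size_t should be unsigned");
static_assert(std::is_signed<std::ptrdiff_t>::value, "ptrdiff_t should be signed");

// Hypothetical helper: distance in pixels between two positions in the same buffer.
// The result is negative when 'to' lies before 'from', which is why it is ptrdiff_t.
inline std::ptrdiff_t pixel_distance(const std::uint32_t *from, const std::uint32_t *to)
{
    return to - from;
}
```

For comparison, int and long are both 32 bits in 32-bit and 64-bit MSVC builds, while pointers grow to 64 bits, so an int cannot hold a pointer or a large buffer offset in an x64 build.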

Last edited by jpsdr; 7th November 2019 at 15:06.
Old 7th November 2019, 17:29   #15  |  Link
bassquake
Registered User
 
Join Date: Jan 2007
Posts: 39
Cool thanks. I got it working with the following:

Code:
.586    
.mmx   
.xmm
.model flat, c                                           
.code      

lut_isse proc dst:dword,LUT:dword,psize:dword

public lut_isse

            push    esi
            push    edi
            push    ebx

            mov     edi,dst
            mov     esi,LUT
            mov     ecx,psize

            align   16
GLoop:      mov     eax,[edi]
            xor     ebx,ebx
            mov     edx,eax
            mov     bl,ah
            and     edx,0ff0000h
            and     eax,0ffh
            shr     edx,16
            movd    mm0,dword ptr[esi+eax*4+(512*4)]
            prefetchnta [edi+512]
            por     mm0,[esi+ebx*4+(256*4)]
            por     mm0,[esi+edx*4]
            movd    dword ptr[edi],mm0
            add     edi,4
            loop    GLoop
            emms

            pop     ebx
            pop     edi
            pop     esi

            ret
lut_isse endp

END
Now I need to see if I can convert it to 64-bit! A project for another time.
Old 7th November 2019, 19:17   #16  |  Link
shekh
Registered User
 
Join Date: Mar 2015
Posts: 708
It's probably better to convert this to plain C++ (by the way, maybe you already have a C++ version); this asm is not doing anything special.
What is the plugin?
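For reference, a sketch of what a plain C++ version of that loop could look like, derived by reading the assembly posted earlier in the thread rather than taken from the plugin's source. Assumptions: Pixel32 is a 32-bit unsigned value, the table holds 768 entries (three banks of 256, as the offsets 0, 256*4 and 512*4 in the asm imply), and lut_plain is a made-up name. One behavioural difference: the asm loop tests its counter at the bottom, so it always runs at least once, while this version does nothing for a count of zero.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: a pixel is a 32-bit unsigned value; check your VirtualDub headers.
typedef std::uint32_t Pixel32;

// Table layout as read by the assembly:
//   LUT[0   .. 255] is indexed by bits 16-23 of the pixel
//   LUT[256 .. 511] is indexed by bits 8-15
//   LUT[512 .. 767] is indexed by bits 0-7
// The three looked-up values are OR-ed together and written back in place.
void lut_plain(Pixel32 *dst, const int *LUT, int psize)
{
    assert(psize >= 0);
    for (int i = 0; i < psize; ++i) {
        const Pixel32 p = dst[i];
        const unsigned lo  = p & 0xffu;           // eax after "and eax,0ffh"
        const unsigned mid = (p >> 8) & 0xffu;    // bl, loaded from ah
        const unsigned hi  = (p >> 16) & 0xffu;   // edx after the and/shr pair
        dst[i] = static_cast<Pixel32>(LUT[hi])
               | static_cast<Pixel32>(LUT[256 + mid])
               | static_cast<Pixel32>(LUT[512 + lo]);
    }
}
```

Bits 24-31 of the input never reach the output unless the table entries set them, which matches the asm.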
__________________
VirtualDub2
Old 8th November 2019, 10:10   #17  |  Link
bassquake
Registered User
 
Join Date: Jan 2007
Posts: 39
It's the Color Balance 1.1 plugin from here:

https://emiliano.deepabyss.org/
Old 8th November 2019, 13:23   #18  |  Link
shekh
Registered User
 
Join Date: Mar 2015
Posts: 708
You can get rid of asm by removing the lines
Code:
  if ((CPUF_SUPPORTS_INTEGER_SSE & ff->getCPUFlags()))
     LUT_iSSE (dst,mfd->Lut,psize);
  else
I wonder if there is any performance difference.
Notes for x64:
SetWindowLong -> SetWindowLongPtr
GetWindowLong -> GetWindowLongPtr
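What removing the CPU-flag check quoted above amounts to, as a hedged sketch: the filter stops choosing between two code paths at run time and always takes the portable one. Everything named below (kCpuHasIntegerSse and its value, lut_fast, lut_portable, pick_lut_routine) is hypothetical and stands in for the plugin's real symbols; the bodies are placeholders that only record which path ran.

```cpp
#include <cstdint>

typedef std::uint32_t Pixel32;   // assumption: 32-bit pixel
typedef void (*LutRoutine)(Pixel32 *dst, const int *LUT, int psize);

// Hypothetical flag bit; the real value comes from the VirtualDub headers.
const unsigned kCpuHasIntegerSse = 1u << 0;

// Placeholders standing in for the asm routine and the C++ fallback.
// Each writes a marker into the first pixel so a caller can see which one ran.
void lut_fast(Pixel32 *dst, const int *, int psize)     { if (psize > 0) dst[0] = 1u; }
void lut_portable(Pixel32 *dst, const int *, int psize) { if (psize > 0) dst[0] = 2u; }

// Run-time dispatch: use the optimized routine only when the CPU reports support.
// Deleting the check is equivalent to always returning lut_portable.
LutRoutine pick_lut_routine(unsigned cpu_flags)
{
    return (cpu_flags & kCpuHasIntegerSse) ? lut_fast : lut_portable;
}
```

Timing both routines on the same frame data would be the direct way to answer the performance question.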