trinity-devel@lists.pearsoncomputing.net

Month: March 2012

Re: [trinity-devel] mlt - assembler error have_mmx.S: Assembler messages: Error: operand type mismatch for `push'

From: /dev/ammo42 <mickeytintincolle@...>
Date: Fri, 30 Mar 2012 21:22:52 +0200
On Fri, 30 Mar 2012 15:36:34 +0100
Nix <nix@...> wrote:

> On 29 Mar 2012, dev stated:
> 
> > On Sat, 24 Mar 2012 10:24:53 -0500
> > "David C. Rankin" <drankinatty@...> wrote:
> >
> >> On 03/24/2012 02:35 AM, /dev/ammo42 wrote:
> >> >> Does this mean I'm missing a library that it needs? I can't
> >> >> read the assembler code for nothing, so I hope so. What's the
> >> >> trick?
> >> > From http://www.x86-64.org/documentation/assembly.html :
> >> > push/pop with 32-bit registers are illegal in amd64. You need to
> >> > disable compilation of this file as it is x86 only assembly.
> >> 
> >> You are good. Thank you. So the source needs a rewrite for x86_64
> >> to eliminate the push/pop. That will be set in the lower priority
> >> column.
> > Actually a --disable-mmx option is documented on configure.
> > And for the rewrite, replacing e?x by r?x in assembler files won't
> > be enough, some other code also has to be changed to accommodate the
> > different calling conventions (function arguments on linux-x86 are
> > on the stack, but the first ones on linux-amd64 are in the 64-bit
> > registers). Perhaps using inline assembly would allow having only
> > one version of the assembly code.
> 
> More generally, if you're rewriting it for x86-64 anyway, keeping it
> MMX seems pretty much pointless. Just go all the way to SSE, or
> better than that, SSE2: you'll cut off almost no hardware and be much more
> useful in future. (AVX is probably a step too far: a lot of
> toolchains don't support it yet, and I don't know anyone with AVX
> hardware.)
The amd64 specification includes MMX and SSE2 (with 16 SSE registers
instead of the 8 available on x86), so every amd64 processor can do
SSE2. Also, composite_line_yuv_mmx.S already uses an SSE opcode
(pextrw).
But you are right: if the operations can be easily parallelized, we had
better use SSE2, which allows using the MMX integer opcodes on the SSE
registers, which are twice as wide as the MMX ones. We could also allow
the use of SSE2 on x86 if cpuid detects its presence.
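
To make that last point concrete, here is a rough sketch in C of what I
mean. It is not MLT code: the names cpu_has_sse2(), add_sat_u8() and
friends are made up for illustration, and it assumes GCC or Clang (for
<cpuid.h> and the target attribute). It checks CPUID leaf 1, EDX bit 26
(SSE2), and, if the bit is set, runs a saturating byte add (paddusb, an
MMX opcode) on 16 bytes per step in an SSE register instead of 8 bytes
in an MMX one. A plain C loop is the fallback.

```c
/* Sketch only: hypothetical helper names, not taken from MLT. */
#include <stddef.h>
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>      /* GCC/Clang wrapper around the cpuid instruction */
#include <emmintrin.h>  /* SSE2 intrinsics */
#define HAVE_X86 1
#endif

/* Returns 1 if CPUID leaf 1 reports SSE2 (EDX bit 26), 0 otherwise. */
static int cpu_has_sse2(void)
{
#ifdef HAVE_X86
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                 /* leaf 1 not available */
    return (int)((edx >> 26) & 1u);
#else
    return 0;                     /* not an x86 build */
#endif
}

/* Portable fallback: dst[i] = min(a[i] + b[i], 255). */
static void add_sat_u8_scalar(uint8_t *dst, const uint8_t *a,
                              const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned int s = (unsigned int)a[i] + (unsigned int)b[i];
        dst[i] = (uint8_t)(s > 255u ? 255u : s);
    }
}

#ifdef HAVE_X86
/* SSE2 path: paddusb on 128-bit registers, 16 bytes per iteration. */
__attribute__((target("sse2")))
static void add_sat_u8_sse2(uint8_t *dst, const uint8_t *a,
                            const uint8_t *b, size_t n)
{
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_adds_epu8(va, vb));
    }
    /* Leftover bytes that do not fill a whole register. */
    add_sat_u8_scalar(dst + i, a + i, b + i, n - i);
}
#endif

/* Picks the SSE2 path at run time when cpuid reports it. */
void add_sat_u8(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
#ifdef HAVE_X86
    if (cpu_has_sse2()) {
        add_sat_u8_sse2(dst, a, b, n);
        return;
    }
#endif
    add_sat_u8_scalar(dst, a, b, n);
}
```

On amd64 the run-time check is redundant, since SSE2 is always there; it
only matters for 32-bit x86 builds. Real code would also cache the cpuid
result rather than query it on every call.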