Linus Torvalds writes: (Summary)
That said, looking at 2fbd7af5af86, I do note that the code generation
is horribly stupid.
It's due to two different issues:
(a) the x86 asm constraints for that inline asm are nasty, and require a register for 'size', even though an immediate works just fine.
(b) the "cmp" is inside the asm, so gcc can't combine it with the *other* cmp in the C code.
Fixing (a) is easy:
+++ b/arch/x86/include/asm/barrier.h
@@ -43 +43 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
-			:"r"(size),"r" (index)
+			:"ir"(size),"r" (index)

but fixing (b) looks fundamentally hard.
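To make the two issues concrete, here is a minimal user-space sketch, not the kernel source. It assumes the asm body is the usual cmp/sbb pair, which the post implies but does not quote. The names mask_nospec_sketch, lookup_sketch, table_sketch and TABLE_SIZE are hypothetical, and the non-x86 branch is only a portable stand-in so the file builds elsewhere.

```c
#include <assert.h>

#define TABLE_SIZE 16UL	/* hypothetical bound, standing in for a syscall count */

/* Returns ~0UL when index < size and 0 otherwise. */
static inline unsigned long mask_nospec_sketch(unsigned long index,
					       unsigned long size)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned long mask;

	/*
	 * cmp sets the carry flag when index < size (unsigned compare);
	 * sbb of a register with itself then yields 0 - carry.
	 * "ir" lets the compiler pass 'size' as an immediate when it is a
	 * compile-time constant, instead of forcing it into a register.
	 */
	__asm__ volatile ("cmp %1,%2; sbb %0,%0"
			  : "=r" (mask)
			  : "ir" (size), "r" (index)
			  : "cc");
	return mask;
#else
	/* Assumption: plain C fallback, with no speculation guarantee. */
	return 0UL - (unsigned long)(index < size);
#endif
}

static const int table_sketch[TABLE_SIZE] = {
	0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150
};

/* Issue (b): the bounds check below and the asm above each do their own cmp. */
static int lookup_sketch(unsigned long idx)
{
	if (idx >= TABLE_SIZE)				/* first cmp, visible to the compiler */
		return -1;
	idx &= mask_nospec_sketch(idx, TABLE_SIZE);	/* second cmp, hidden in the asm */
	return table_sketch[idx];
}

/* Small usage check for the sketch. */
static void sketch_selfcheck(void)
{
	assert(mask_nospec_sketch(3, TABLE_SIZE) == ~0UL);
	assert(mask_nospec_sketch(TABLE_SIZE, TABLE_SIZE) == 0UL);
	assert(lookup_sketch(5) == 50);
	assert(lookup_sketch(20) == -1);
}
```

With optimization enabled, the "ir" constraint typically lets the compiler emit the constant directly in the cmp, which saves the register load that "r" forces. The compiler still cannot merge the cmp inside the asm with the one generated for the C bounds check, which is why (b) is the harder problem.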