将零、负和正映射到 0、1、2 的无分支代码

Branchless code that maps zero, negative, and positive to 0, 1, 2(将零、负和正映射到 0、1、2 的无分支代码)
本文介绍了将零、负和正映射到 0、1、2 的无分支代码的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着跟版网的小编来一起学习吧!

问题描述

限时送ChatGPT账号..

编写一个无分支函数,如果两个有符号整数之间的差为零、负数或正数,则返回 0、1 或 2.

Write a branchless function that returns 0, 1, or 2 if the difference between two signed integers is zero, negative, or positive.

这是一个带有分支的版本:

Here's a version with branching:

int Compare(int x, int y)
{
    int diff = x - y;
    if (diff == 0)
        return 0;
    else if (diff < 0)
        return 1;
    else
        return 2;
}

这是一个可能更快的版本,具体取决于编译器和处理器:

Here's a version that may be faster depending on compiler and processor:

int Compare(int x, int y)
{
    int diff = x - y;
    return diff == 0 ? 0 : (diff < 0 ? 1 : 2);
}

你能想出一个没有分支的更快的吗?

Can you come up with a faster one without branches?

总结

我进行基准测试的 10 个解决方案具有相似的性能.实际数字和获胜者因编译器 (icc/gcc)、编译器选项(例如,-O3、-march=nocona、-fast、-xHost)和机器而异.佳能的解决方案在许多基准测试中表现良好,但性能优势同样微乎其微.我很惊讶在某些情况下,某些解决方案比带有分支的幼稚解决方案慢.

The 10 solutions I benchmarked had similar performance. The actual numbers and winner varied depending on compiler (icc/gcc), compiler options (e.g., -O3, -march=nocona, -fast, -xHost), and machine. Canon's solution performed well in many benchmark runs, but again the performance advantage was slight. I was surprised that in some cases some solutions were slower than the naive solution with branches.

推荐答案

int Compare(int x, int y) {
     return (x < y) + (y < x) << 1;
}

仅按位?猜猜并且 > 不计算,然后?

Bitwise only? Guess < and > don't count, then?

int Compare(int x, int y) {
    int diff = x - y;
    return (!!diff) | (!!(diff & 0x80000000) << 1);
}

但是有那个讨厌的-.

换个方向.

嗯,再试一次:

int Compare(int x, int y) {
    int diff = y - x;
    return (!!diff) << ((diff >> 31) & 1);
}

但我猜测 !! 没有标准的 ASM 指令.此外,<< 可以替换为 +,具体取决于哪个更快...

But I'm guessing there's no standard ASM instruction for !!. Also, the << can be replaced with +, depending on which is faster...

点点滴滴很有趣!

嗯,我刚刚了解了 setnz.

我还没有检查汇编器的输出(但这次我确实测试了一下),如果幸运的话,它可以节省一整条指令!:

I haven't checked the assembler output (but I did test it a bit this time), and with a bit of luck it could save a whole instruction!:

subl  %edi, %esi
setnz %eax
sarl  $31, %esi
andl  $1, %esi
sarl  %eax, %esi
mov   %esi, %eax
ret

漫步很有趣.

我需要睡觉.

这篇关于将零、负和正映射到 0、1、2 的无分支代码的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持跟版网!

本站部分内容来源互联网,如果有图片或者内容侵犯了您的权益,请联系我们,我们会在确认后第一时间进行删除!

相关文档推荐

OpenGL transforming objects with multiple rotations of Different axis(OpenGL 变换不同轴多次旋转的对象)
GLFW first responder error(GLFW 第一响应者错误)
SOIL not linking correctly(SOIL 连接不正确)
Core profile vs version string? Only getting GLSL 1.3/OGL 3.0 in mesa 10.0.1(核心配置文件与版本字符串?在 mesa 10.0.1 中只获得 GLSL 1.3/OGL 3.0)
What is the range of OpenGL texture ID?(OpenGL 纹理 ID 的范围是多少?)
How taxing are OpenGL glDrawElements() calls compared to basic logic code?(与基本逻辑代码相比,OpenGL glDrawElements() 调用的繁重程度如何?)