问题描述
鉴于零比雷埃夫斯对另一个问题的回答,我们知道了
x = tuple(set([1, "a", "b", "c", "z", "f"]))
y = tuple(set(["a", "b", "c", "z", "f", 1]))
print(x == y)
使用 Trueyou-care-about-security/" rel="nofollow noreferrer">哈希随机化已启用.为什么是 85%?
Prints True
about 85% of the time with hash randomization enabled. Why 85%?
推荐答案
我假设这个问题的任何读者都读过:
I'm going to assume any readers of this question to have read both:
零比雷埃夫斯的回答和
我对 CPython 字典的解释.
首先要注意的是哈希随机化是在解释器启动时决定的.
两个集合中每个字母的哈希值都是相同的,所以唯一重要的是是否存在冲突(会影响顺序).
The hash of each letter will be the same for both sets, so the only thing that can matter is if there is a collision (where order will be affected).
通过第二个链接的推导,我们知道这些集合的后备数组从长度 8 开始:
By the deductions of that second link we know the backing array for these sets starts at length 8:
_ _ _ _ _ _ _ _
在第一种情况下,我们插入 1
:
In the first case, we insert 1
:
_ 1 _ _ _ _ _ _
然后插入其余部分:
α 1 ? ? ? ? ? ?
然后将其重新散列到大小为 32:
Then it is rehashed to size 32:
1 can't collide with α as α is an even hash
↓ so 1 is inserted at slot 1 first
? 1 ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
在第二种情况下,我们插入其余部分:
In the second case, we insert the rest:
? β ? ? ? ? ? ?
然后尝试插入1:
Try to insert 1 here, but will
↓ be rehashed if β exists
? β ? ? ? ? ? ?
然后会重新散列:
Try to insert 1 here, but will
be rehashed if β exists and has
↓ not rehashed somewhere else
? β ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
所以迭代顺序是否不同,完全取决于β是否存在.
So whether the iteration orders are different depends solely on whether β exists.
β 的概率是 5 个字母中的任何一个将散列到 1 模 8 和散列到 1 模 32 的机会.
The chance of a β is the chance that any of the 5 letters will hash to 1 modulo 8 and hash to 1 modulo 32.
因为任何散列到 1 模 32 也散列到 1 模 8,我们想要找到 32 个插槽中,五个插槽之一位于插槽 1 的机会:
Since anything that hashes to 1 modulo 32 also hashes to 1 modulo 8, we want to find the chance that of the 32 slots, one of the five is in slot 1:
5 (number of letters) / 32 (number of slots)
5/32 是 0.15625,因此两个集合结构之间的顺序有 15.625% 的可能性不同.
5/32 is 0.15625, so there is a 15.625% chance of the orders being different between the two set constructions.
一点也不奇怪,这正是零比雷埃夫斯所测量的.
Not very strangely at all, this is exactly what Zero Piraeus measured.
从技术上讲,即使这并不明显.由于重新散列,我们可以假装 5 个散列中的每一个都是唯一的,但是由于线性探测,实际上更有可能发生捆绑"结构……但是因为我们只查看单个插槽是否被占用,这并没有实际上并没有影响到我们.
这篇关于为什么 tuple(set([1, “a", “b", “c", “z", “f"])) == tuple(set([“a",";b"、“c"、“z"、“f"、1])) 85% 的时间启用哈希随机化?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持跟版网!