

        Is there any instruction reordering done by the Hotspot JIT compiler that can be reproduced?




                  Question


                  As we know, some JIT compilers are allowed to reorder object initialization. For example,

                  someRef = new SomeObject();
                  

                  can be decomposed into the following steps:

                  objRef = allocate space for SomeObject; //step1
                  call constructor of SomeObject;         //step2
                  someRef = objRef;                    //step3
                  

                  JIT compiler may reorder it as below:

                  objRef = allocate space for SomeObject; //step1
                  someRef = objRef;                    //step3
                  call constructor of SomeObject;         //step2
                  

                  Namely, step2 and step3 can be reordered by the JIT compiler. Even though this reordering is theoretically valid, I was unable to reproduce it with HotSpot (JDK 1.7) on an x86 platform.

                  So, is there any instruction reordering done by the HotSpot JIT compiler that can be reproduced?
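
For illustration, a hand-rolled attempt to catch the reordering might look like the sketch below. Everything in it (the class name `NaivePublicationProbe`, the four-field `SomeObject`, the value `1`, the iteration count) is a hypothetical stand-in and not taken from the question. It is a naive probe, not a reliable test: it may well report zero even when the compiler does reorder the stores.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class NaivePublicationProbe {

    // Hypothetical stand-in for SomeObject: four plain int fields set in the constructor.
    static final class SomeObject {
        int x00, x01, x02, x03;
        SomeObject(int v) { x00 = v; x01 = v; x02 = v; x03 = v; }
    }

    // Deliberately NOT volatile: this is the racy publication being probed.
    static SomeObject someRef;

    /** Publishes objects in a loop while a reader polls; returns how many partial objects the reader saw. */
    public static long run(int iterations) {
        AtomicBoolean done = new AtomicBoolean(false);
        long[] partial = new long[1];

        Thread reader = new Thread(() -> {
            long seen = 0;
            while (!done.get()) {
                SomeObject o = someRef;          // racy read of the reference
                if (o != null) {
                    int sum = o.x00 + o.x01 + o.x02 + o.x03;
                    if (sum != 4) {
                        seen++;                  // fewer than four fields were visible
                    }
                }
            }
            partial[0] = seen;
        });
        reader.start();

        for (int i = 0; i < iterations; i++) {
            someRef = new SomeObject(1);         // step1..step3 from the question
        }
        done.set(true);

        try {
            reader.join();                       // join makes partial[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return partial[0];
    }

    public static void main(String[] args) {
        System.out.println("partially constructed objects observed: " + run(50_000_000));
    }
}
```

A result of zero from such a probe says little: the race window is tiny and the JIT may compile the loop differently from one run to the next.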


                  Update: I ran the test on my machine (Linux x86_64, JDK 1.8.0_40, i5-3210M) using the command below:

                  java -XX:-UseCompressedOops -XX:+UnlockDiagnosticVMOptions -XX:CompileCommand="print org.openjdk.jcstress.tests.unsafe.UnsafePublication::publish" -XX:CompileCommand="inline, org.openjdk.jcstress.tests.unsafe.UnsafePublication::publish" -XX:PrintAssemblyOptions=intel -jar tests-custom/target/jcstress.jar -f -1 -t .*UnsafePublication.* -v > log.txt 
                  

                  and the tool reported something like:

                  [1] 5 ACCEPTABLE The object is published, at least 1 field is visible.

                  That meant an observer thread saw an instance of MyObject that was not fully initialized.

                  However, I did NOT see assembly code like @Ivan's. What I got was:

                  0x00007f71d4a15e34: mov r11d,DWORD PTR [rbp+0x10] ;getfield x 
                  0x00007f71d4a15e38: mov DWORD PTR [rax+0x10],r11d ;putfield x00 
                  0x00007f71d4a15e3c: mov DWORD PTR [rax+0x14],r11d ;putfield x01 
                  0x00007f71d4a15e40: mov DWORD PTR [rax+0x18],r11d ;putfield x02 
                  0x00007f71d4a15e44: mov DWORD PTR [rax+0x1c],r11d ;putfield x03 
                  0x00007f71d4a15e48: mov QWORD PTR [rbp+0x18],rax ;putfield o
                  

                  There seems to be no compiler reordering here.


                  Update2: @Ivan corrected me. I had used the wrong JIT command to capture the assembly code. After fixing this error, I can grab the assembly code below:

                  0x00007f76012b18d5: mov    DWORD PTR [rax+0x10],ebp  ;*putfield x00
                  0x00007f76012b18d8: mov    QWORD PTR [r8+0x18],rax  ;*putfield o
                                                                  ; - org.openjdk.jcstress.tests.unsafe.generated.UnsafePublication_jcstress$Runner_publish::call@94 (line 156)
                  0x00007f76012b18dc: mov    DWORD PTR [rax+0x1c],ebp  ;*putfield x03
                  

                  Apparently, the compiler did reorder the stores (putfield o is emitted before putfield x03), which caused an unsafe publication.
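
One way to read such a listing is to count how many field stores are emitted before the store that publishes the reference. The sketch below does only that: it is a toy model over store names, with a hypothetical class `StoreOrderModel`, and it uses just the three stores visible in the listing above. It assumes that on x86 stores become visible to other cores in the order they are emitted, so the emission order is what matters here.

```java
import java.util.Arrays;
import java.util.List;

public class StoreOrderModel {

    /**
     * Returns how many field stores precede the publishing store "o" in the
     * given emission order, i.e. how many fields have certainly been written
     * at the moment the reference becomes reachable by other threads.
     */
    public static int fieldsWrittenBeforePublish(List<String> emittedStores) {
        int publishIndex = emittedStores.indexOf("o");
        if (publishIndex < 0) {
            throw new IllegalArgumentException("no publishing store 'o' in " + emittedStores);
        }
        return publishIndex;
    }

    public static void main(String[] args) {
        // Source order: all four field stores, then the publication.
        List<String> programOrder = Arrays.asList("x00", "x01", "x02", "x03", "o");
        // Only the stores visible in the captured listing, in the order shown there.
        List<String> capturedOrder = Arrays.asList("x00", "o", "x03");

        System.out.println("program order:  " + fieldsWrittenBeforePublish(programOrder));  // prints 4
        System.out.println("captured order: " + fieldsWrittenBeforePublish(capturedOrder)); // prints 1
    }
}
```

In the captured order only one field store precedes the publication, which seems consistent with the reported outcome "at least 1 field is visible".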

                  Solution

                  You can reproduce any compiler reordering. The right question is which tool to use for this. To see compiler reordering, you have to go down to the assembly level with JITWatch (which uses HotSpot's assembly log output) or JMH with LinuxPerfAsmProfiler.

                  Let's consider the following JMH-based benchmark:

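                  import org.openjdk.jmh.annotations.Benchmark;
                  import org.openjdk.jmh.annotations.Scope;
                  import org.openjdk.jmh.annotations.State;

                  // JMH requires a @State annotation on a benchmark class that declares fields.
                  @State(Scope.Thread)
                  public class ReorderingBench {

                      public int[] array = new int[] {1, -1, 1, -1};
                      public int sum = 0;

                      // Accumulates into the field sum; elements are read in the order 1, 0, 3, 2.
                      @Benchmark
                      public void reorderGlobal() {
                          int[] a = array;
                          sum += a[1];
                          sum += a[0];
                          sum += a[3];
                          sum += a[2];
                      }

                      // Same access order, but accumulates into a local variable.
                      @Benchmark
                      public int reorderLocal() {
                          int[] a = array;
                          int sum = 0;
                          sum += a[1];
                          sum += a[0];
                          sum += a[3];
                          sum += a[2];
                          return sum;
                      }
                  }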
                  

                  Note that the array elements are accessed out of index order (1, 0, 3, 2). On my machine, the assembly output for the method that uses the field sum is:

                  mov    0xc(%rcx),%r8d         ;*getfield sum
                  ...
                  add    0x14(%r12,%r10,8),%r8d ;add a[1]
                  add    0x10(%r12,%r10,8),%r8d ;add a[0]
                  add    0x1c(%r12,%r10,8),%r8d ;add a[3]
                  add    0x18(%r12,%r10,8),%r8d ;add a[2]
                  

                  but for the method with the local variable sum, the access pattern was changed:

                  mov    0x10(%r12,%r10,8),%edx ;add a[0] <-- 0(0x10) first
                  add    0x14(%r12,%r10,8),%edx ;add a[1] <-- 1(0x14) second
                  add    0x1c(%r12,%r10,8),%edx ;add a[3]
                  add    0x18(%r12,%r10,8),%edx ;add a[2]
                  

                  You can also experiment with the C1 compiler optimizations, e.g. c1_RangeCheckElimination.
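
If JMH is not at hand, a rough alternative is to warm the method up in a plain loop and ask HotSpot to print the generated code. The sketch below assumes a hypothetical class `ReorderingWarmup` and an arbitrary iteration count. The access pattern copies reorderLocal from the benchmark above. This is a simplification: the JIT may inline the method into the loop or transform it in other ways, which is one of the pitfalls JMH is designed to avoid.

```java
public class ReorderingWarmup {

    static final int[] ARRAY = {1, -1, 1, -1};

    // Same access pattern as reorderLocal in the benchmark: indices 1, 0, 3, 2.
    static int reorderLocal(int[] a) {
        int sum = 0;
        sum += a[1];
        sum += a[0];
        sum += a[3];
        sum += a[2];
        return sum;
    }

    /** Calls reorderLocal often enough that the JIT is likely to compile it; returns a checksum. */
    public static long run(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            checksum += reorderLocal(ARRAY);
        }
        return checksum;
    }

    public static void main(String[] args) {
        // The array sums to zero, so the checksum is 0 regardless of the iteration count.
        System.out.println("checksum = " + run(1_000_000)); // prints "checksum = 0"
    }
}
```

Running it as `java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly ReorderingWarmup` prints the compiled code, provided the hsdis disassembler plugin is installed for the JVM in use.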

                  Update:

                  It is extremely hard to observe pure compiler reorderings from the user's point of view, because you have to run billions of samples to catch the racy behavior. It is also important to separate compiler effects from hardware effects; for instance, weakly-ordered hardware like POWER can change the behavior.

                  Let's start with the right tool: jcstress, an experimental harness and a suite of tests to aid research into the correctness of concurrency support in the JVM, class libraries, and hardware. There is a reproducer in which the instruction scheduler may decide to emit a few field stores, then publish the reference, then emit the rest of the field stores (it is also worth reading about safe publication and instruction scheduling). In some cases on my machine (Linux x86_64, JDK 1.8.0_60, i5-4300M) the compiler generates the following code:

                  mov    %edx,0x10(%rax)    ;*putfield x00                    
                  mov    %edx,0x14(%rax)    ;*putfield x01
                  mov    %edx,0x18(%rax)    ;*putfield x02
                  mov    %edx,0x1c(%rax)    ;*putfield x03
                  ...
                  movb   $0x0,0x0(%r13,%rdx,1)  ;*putfield o
                  

                  but sometimes:

                  mov    %ebp,0x10(%rax)    ;*putfield x00
                  ...
                  mov    %rax,0x18(%r10)    ;*putfield o  <--- publish here
                  mov    %ebp,0x1c(%rax)    ;*putfield x03
                  mov    %ebp,0x18(%rax)    ;*putfield x02
                  mov    %ebp,0x14(%rax)    ;*putfield x01
                  

                  Update 2:

                  Regarding the question about performance benefits: in our case, this optimization (reordering) does not bring a meaningful performance benefit; it is just a side effect of the compiler's implementation. HotSpot uses a sea-of-nodes graph to model data and control flow (a graph-based intermediate representation). The IR graph for our example can be obtained with the -XX:+PrintIdeal -XX:PrintIdealGraphLevel=1 -XX:PrintIdealGraphFile=graph.xml options plus the Ideal Graph Visualizer. In that graph, the inputs to a node are the inputs to the node's operation. Each node defines a value based on its inputs and operation, and that value is available on all output edges. Evidently the compiler does not see any difference between pointer and integer store nodes, so the only thing that limits it is a memory barrier. As a result, in order to reduce register pressure, target code size, or something else, the compiler decides to schedule instructions within the basic block in this strange (from the user's point of view) order. You can experiment with instruction scheduling in HotSpot by using the following options (available in fastdebug builds): -XX:+StressLCM and -XX:+StressGCM.
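
Since a memory barrier is what constrains the scheduler, one standard way to rule out the reordering is to publish through a volatile field. Under the Java Memory Model, all writes made before a volatile store are visible to a thread that reads the reference through that volatile field. The sketch below illustrates this with hypothetical names (`VolatilePublicationProbe`, a four-field `MyObject`, the value `1`). It is not code from the jcstress test.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class VolatilePublicationProbe {

    static final class MyObject {
        int x00, x01, x02, x03;
        MyObject(int v) { x00 = v; x01 = v; x02 = v; x03 = v; }
    }

    // volatile: the field stores in the constructor may not be moved after this store.
    static volatile MyObject o;

    /** Publishes objects through the volatile field; returns how many partial objects the consumer saw. */
    public static long run(int iterations) {
        AtomicBoolean done = new AtomicBoolean(false);
        long[] violations = new long[1];

        Thread consumer = new Thread(() -> {
            long bad = 0;
            while (!done.get()) {
                MyObject lo = o;                              // volatile read
                if (lo != null && lo.x00 + lo.x01 + lo.x02 + lo.x03 != 4) {
                    bad++;                                    // must never happen under the JMM
                }
            }
            violations[0] = bad;
        });
        consumer.start();

        for (int i = 0; i < iterations; i++) {
            o = new MyObject(1);                              // volatile write publishes the object
        }
        done.set(true);

        try {
            consumer.join();                                  // join makes violations[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return violations[0];
    }

    public static void main(String[] args) {
        System.out.println("violations: " + run(10_000_000)); // prints "violations: 0"
    }
}
```

Declaring the fields of the published object `final` is another option with a similar effect for those fields. Which one fits depends on whether the object must stay mutable.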
