OpenMP Optimization Remarks
The OpenMP-aware optimization pass can generate compiler remarks for
performed and missed optimizations. To emit them, pass these options to the
Clang invocation: -Rpass=openmp-opt -Rpass-analysis=openmp-opt
-Rpass-missed=openmp-opt. For more information on the features of the remark
system, consult the Clang documentation.
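As a rough illustration of how these options might be used, the sketch below shows a small offloading function together with one possible command line. The file name saxpy.c, the function saxpy, and the offload target in the comment are assumptions made for this example; which remarks appear, if any, depends on the compiler version and the target.

```c
/*
 * saxpy.c: hypothetical file name, used only for this illustration.
 * One possible invocation (the offload target is an assumption; adjust it
 * to your system):
 *
 *   clang -O2 -fopenmp -fopenmp-targets=nvptx64 -c saxpy.c
 *         -Rpass=openmp-opt -Rpass-analysis=openmp-opt
 *         -Rpass-missed=openmp-opt
 */

/* Computes y[i] = a * x[i] + y[i] inside an OpenMP target region.
 * Without -fopenmp the pragma is ignored and the loop runs on the host. */
void saxpy(int n, float a, const float *x, float *y) {
#pragma omp target teams distribute parallel for map(to: x[0:n]) map(tofrom: y[0:n])
  for (int i = 0; i < n; ++i)
    y[i] = a * x[i] + y[i];
}
```

Each of the three options selects one class of remark (performed, analysis, missed), so any of them can be dropped to reduce the output.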
OpenMP Remarks
| Diagnostics Number | Diagnostics Kind | Diagnostics Description |
|---|---|---|
| | Analysis | Potentially unknown OpenMP target region caller. |
| | Analysis | Parallel region is used in unknown / unexpected ways. Will not attempt to rewrite the state machine. |
| | Analysis | Parallel region is not called from a unique kernel. Will not attempt to rewrite the state machine. |
| | Optimization | Moving globalized variable to the stack. |
| | Optimization | Replaced globalized variable with X bytes of shared memory. |
| | Missed | Found thread data sharing on the GPU. Expect degraded performance due to data globalization. |
| | Missed | Could not move globalized variable to the stack. Variable is potentially captured in call. Mark parameter as __attribute__((noescape)) to override. |
| | Optimization | Transformed generic-mode kernel to SPMD-mode. |
| | Analysis | Value has potential side effects preventing SPMD-mode execution. Add __attribute__((assume("ompx_spmd_amenable"))) to the called function to override. |
| | Optimization | Removing unused state machine from generic-mode kernel. |
| | Optimization | Rewriting generic-mode kernel with a customized state machine. |
| | Analysis | Generic-mode kernel is executed with a customized state machine that requires a fallback. |
| | Analysis | Call may contain unknown parallel regions. Use __attribute__((assume("omp_no_parallelism"))) to override. |
| | Analysis | Could not internalize function. Some optimizations may not be possible. |
| | Optimization | Parallel region merged with parallel region at <location>. |
| | Optimization | Removing parallel region with no side-effects. |
| | Optimization | OpenMP runtime call <call> deduplicated. |
| | Optimization | Replacing OpenMP runtime call <call> with <value>. |
| | Optimization | Redundant barrier eliminated. (device only) |
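
The remark about a globalized variable being potentially captured in a call suggests marking the parameter with __attribute__((noescape)). The following is a minimal sketch of where that attribute goes, not a reproduction of the remark itself. The macro NOESCAPE and the functions sum3 and offload_sum are hypothetical names introduced for this example, and whether any remark is emitted for such code depends on the compiler version and the target.

```c
/* NOESCAPE is a hypothetical convenience macro: it expands to the Clang
 * attribute when the compiler supports it and to nothing otherwise. */
#if defined(__has_attribute)
#  if __has_attribute(noescape)
#    define NOESCAPE __attribute__((noescape))
#  endif
#endif
#ifndef NOESCAPE
#  define NOESCAPE
#endif

/* Hypothetical helper: reads through the pointer but never stores it,
 * so the promise made by noescape actually holds. */
int sum3(NOESCAPE const int *p) { return p[0] + p[1] + p[2]; }

/* Passes the address of a local variable to a call from inside a target
 * region, the kind of use the remark refers to. */
int offload_sum(void) {
  int result = 0;
#pragma omp target map(tofrom: result)
  {
    int local[3] = {1, 2, 3};
    result = sum3(local);
  }
  return result;
}
```

The attribute is a promise to the compiler, so it should only be added when the callee really does not retain the pointer beyond the call.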