| author | ReinUsesLisp <reinuseslisp@airmail.cc> | 2021-07-26 04:24:26 -0300 | 
|---|---|---|
| committer | ReinUsesLisp <reinuseslisp@airmail.cc> | 2021-07-26 04:58:02 -0300 | 
| commit | 66a0cedba39cabba30c756626d7b58bd0e519d8e | |
| tree | 909321c59c4d184391eea2031c0dd8aee61e6b5c /CMakeModules/GenerateSCMRev.cmake | |
| parent | 09fb41dc63eeda3a82580f119704e691ead9e76a | |
shader: Fold integer FMA from Nvidia's pattern
Fold shaders doing "a * b + c" on integers from the pattern generated by
Nvidia's GL compiler.
On a somewhat complex compute shader it reduces the code size by 16
instructions from 2 matches on Turing GPUs.
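As an illustration of why such a fold is safe, here is a minimal sketch, not the code from this commit. It assumes the expanded form builds the 32-bit product from 16-bit partial products, which may differ from the exact instruction sequence Nvidia's GL compiler emits. The names `ExpandedMulAdd`, `FoldedMulAdd` and `FoldIsEquivalent` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical expanded form: "a * b + c" built from 16-bit halves.
// The a_hi * b_hi term is dropped because it only affects bits 32 and up.
constexpr std::uint32_t ExpandedMulAdd(std::uint32_t a, std::uint32_t b, std::uint32_t c) {
    const std::uint32_t a_lo = a & 0xffffu;
    const std::uint32_t a_hi = a >> 16;
    const std::uint32_t b_lo = b & 0xffffu;
    const std::uint32_t b_hi = b >> 16;
    const std::uint32_t low = a_lo * b_lo;
    // Unsigned wraparound here is harmless: only the low 16 bits survive the shift.
    const std::uint32_t cross = a_lo * b_hi + a_hi * b_lo;
    return low + (cross << 16) + c;
}

// Folded form: a single multiply followed by a single add, modulo 2^32.
constexpr std::uint32_t FoldedMulAdd(std::uint32_t a, std::uint32_t b, std::uint32_t c) {
    return a * b + c;
}

// Returns true when both forms agree for the given operands.
constexpr bool FoldIsEquivalent(std::uint32_t a, std::uint32_t b, std::uint32_t c) {
    return ExpandedMulAdd(a, b, c) == FoldedMulAdd(a, b, c);
}

// Compile-time spot checks, including a case where the cross term overflows.
static_assert(FoldIsEquivalent(3u, 5u, 7u));
static_assert(FoldIsEquivalent(0x12345678u, 0x9abcdef0u, 7u));
static_assert(FoldIsEquivalent(0xffffffffu, 0xffffffffu, 0xffffffffu));
```

Because both forms agree modulo 2^32, a pass that recognizes the expanded sequence can replace it with one multiply and one add without changing the result.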
On Intel, as reported by KHR_pipeline_executable_properties:
Before the optimization:
```
Instruction Count: 2057
Basic Block Count: 45
Scratch Memory Size: 14752
Spill Count: 232
Fill Count: 261
SEND Count: 610
Cycle Count: 11325
```
After the optimization:
```
Instruction Count: 2046
Basic Block Count: 44
Scratch Memory Size: 13728
Spill Count: 219
Fill Count: 268
SEND Count: 604
Cycle Count: 11367
```
