| Age | Commit message | Author | 
|---|---|---|
|  | Operations done before the main half float operation (like HAdd) were
managing a packed value instead of the unpacked one. Adding an unpacked
operation allows us to drop the per-operand MetaHalfArithmetic entry,
simplifying the code overall. | 
|  | GLSL decompilation for HMergeH0 was wrong. This addresses that issue. | 
|  | Using static here might be faster at runtime, but it adds a heap
allocation that runs before main. | 
|  | ldr: Minor amendments to IPC-related parameters | 
|  | Set Pixel Format to Z32 if it's R32F and depth compare is enabled, and implement format ZF32_X24S8 | 
|  | Add a toggle to force 30FPS mode | 
|  | fsp_srv: Minor cleanup related changes | 
|  | gl_shader_manager: Move code to source file and minor clean up | 
|  | Frontend: Migrate to QOpenGLWindow and support shared contexts | 
|  | ui_settings: Rename game directory variables | 
|  | common/scope_exit: Replace std::move with std::forward in ScopeExit() | 
|  | common/swap: Minor cleanup and improvements to byte swapping functions | 
|  | Uses arithmetic that compilers can more easily recognize and
optimize. For example, rather than shifting the halves of the value and
then swapping and combining them, we can swap them in place.
For the original swap32 code on x86-64, clang 8.0 would generate:
    mov     ecx, edi
    rol     cx, 8
    shl     ecx, 16
    shr     edi, 16
    rol     di, 8
    movzx   eax, di
    or      eax, ecx
    ret
while GCC 8.3 would generate the ideal:
    mov     eax, edi
    bswap   eax
    ret
Now both generate the same optimal output.
MSVC used to generate the following with the old code:
    mov     eax, ecx
    rol     cx, 8
    shr     eax, 16
    rol     ax, 8
    movzx   ecx, cx
    movzx   eax, ax
    shl     ecx, 16
    or      eax, ecx
    ret     0
Now MSVC also generates a similar result that is just as optimal as that of clang/GCC:
    bswap   ecx
    mov     eax, ecx
    ret     0
====
In the swap64 case, for the original code, clang 8.0 would generate:
    mov     eax, edi
    bswap   eax
    shl     rax, 32
    shr     rdi, 32
    bswap   edi
    or      rax, rdi
    ret
(almost there, but still missing the mark)
while, again, GCC 8.3 would generate the more ideal:
    mov     rax, rdi
    bswap   rax
    ret
Now clang also generates the optimal sequence for this fallback.
This is a case where MSVC unfortunately falls short: despite the new
code, it still generates a doozy of an output:
    mov     r8, rcx
    mov     r9, rcx
    mov     rax, 71776119061217280
    mov     rdx, r8
    and     r9, rax
    and     edx, 65280
    mov     rax, rcx
    shr     rax, 16
    or      r9, rax
    mov     rax, rcx
    shr     r9, 16
    mov     rcx, 280375465082880
    and     rax, rcx
    mov     rcx, 1095216660480
    or      r9, rax
    mov     rax, r8
    and     rax, rcx
    shr     r9, 16
    or      r9, rax
    mov     rcx, r8
    mov     rax, r8
    shr     r9, 8
    shl     rax, 16
    and     ecx, 16711680
    or      rdx, rax
    mov     eax, -16777216
    and     rax, r8
    shl     rdx, 16
    or      rdx, rcx
    shl     rdx, 16
    or      rax, rdx
    shl     rax, 8
    or      rax, r9
    ret     0
which is pretty unfortunate. | 
|  | vk_shader_decompiler: Implement a SPIR-V decompiler | 
|  | kernel/svc: Deglobalize the supervisor call handlers | 
|  | kernel: Make handle type declarations constexpr | 
|  | Allows the compiler to warn when the result of a swap function is
being ignored (which is 100% a bug in all usage scenarios). We also mark
them noexcept so that functions using them can be marked noexcept as well
and play nicely with things that potentially inspect
"nothrowability". | 
|  | Including every OS's own built-in byte swapping functions is kind of
undesirable, since it adds yet another build path to ensure compilation
succeeds on.
Given we only support clang, GCC, and MSVC for the time being, we can
utilize their built-in functions directly instead of going through the
OS's API functions.
This shrinks the overall code down to just
if (msvc)
  use msvc's functions
else if (clang or gcc)
  use clang/gcc's builtins
else
  use the slow path | 
|  | We don't plan to support host 32-bit ARM execution environments, so this
is essentially dead code. | 
|  | The template type here is actually a forwarding reference, not an rvalue
reference in this case, so it's more appropriate to use std::forward to
preserve the value category of the argument being forwarded. | 
|  | Some objects declare their handle type as const, while others declare it
as constexpr. This makes the const ones constexpr for consistency, and
prevents unexpected compilation errors if they are ever used within a
constexpr context. | 
|  | FastLayeredCopySurface | 
|  | video_core: Implement API agnostic view based texture cache | 
|  | Correct Fermi Copy on Linear Textures. | 
|  | This doesn't modify instance state, so it can be made const. | 
|  | The initial two words indicate a process ID. Also UnloadNro only
specifies one address, not two. | 
|  | sirit is a runtime assembler for SPIR-V | 
|  | IDirectory's Read() function doesn't take any input parameters. It only
uses the output parameters that we already provide. | 
|  | These indicate options that alter how a read/write is performed.
Currently we don't need to handle these: the only one that seems to be
used is for writes, and all the custom options ever seem to do is force
immediate flushing, which we already do by default. | 
|  | gl_rasterizer: Use ARB_multi_bind to update buffers with a single call per drawcall | 
|  | kernel/server_session: Remove obsolete TODOs | 
|  | These are holdovers from Citra. | 
|  | Remove unnecessary bounding in LD_C |
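
The common/swap entries above describe a three-way dispatch (MSVC intrinsics, clang/GCC builtins, slow path) with the functions marked `[[nodiscard]]` and `noexcept`. A minimal sketch of that shape might look like the following. The function names (`Swap32`, `Swap64`, `SwapFallback32`, `SwapFallback64`) and the exact mask-and-shift fallback are assumptions for illustration; the log does not show the real signatures or bodies.

```cpp
#include <cstdint>
#include <cstdlib>  // declares _byteswap_ulong / _byteswap_uint64 on MSVC

// Hypothetical slow path: move each byte to its mirrored position with
// masks and shifts. Assumed form, not taken from the actual commit.
[[nodiscard]] constexpr std::uint32_t SwapFallback32(std::uint32_t v) noexcept {
    return ((v & 0xFF000000u) >> 24) | ((v & 0x00FF0000u) >> 8) |
           ((v & 0x0000FF00u) << 8)  | ((v & 0x000000FFu) << 24);
}

[[nodiscard]] constexpr std::uint64_t SwapFallback64(std::uint64_t v) noexcept {
    return ((v & 0xFF00000000000000ull) >> 56) |
           ((v & 0x00FF000000000000ull) >> 40) |
           ((v & 0x0000FF0000000000ull) >> 24) |
           ((v & 0x000000FF00000000ull) >> 8)  |
           ((v & 0x00000000FF000000ull) << 8)  |
           ((v & 0x0000000000FF0000ull) << 24) |
           ((v & 0x000000000000FF00ull) << 40) |
           ((v & 0x00000000000000FFull) << 56);
}

// Dispatch on the compiler rather than on OS-provided headers.
[[nodiscard]] inline std::uint32_t Swap32(std::uint32_t v) noexcept {
#if defined(_MSC_VER)
    return _byteswap_ulong(v);       // unsigned long is 32 bits on MSVC
#elif defined(__clang__) || defined(__GNUC__)
    return __builtin_bswap32(v);
#else
    return SwapFallback32(v);
#endif
}

[[nodiscard]] inline std::uint64_t Swap64(std::uint64_t v) noexcept {
#if defined(_MSC_VER)
    return _byteswap_uint64(v);
#elif defined(__clang__) || defined(__GNUC__)
    return __builtin_bswap64(v);
#else
    return SwapFallback64(v);
#endif
}
```

With `[[nodiscard]]` in place, a statement such as `Swap32(x);` that discards the result should draw a compiler warning, which is the bug class the commit message calls out.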