Age | Commit message (Collapse) | Author |
|
Uses arithmetic that compilers can more readily recognize and
optimize. For example, rather than shifting the halves of the value and
then swapping and combining them, we can swap the bytes in place.
For the original swap32 code on x86-64, clang 8.0 would generate:
mov ecx, edi
rol cx, 8
shl ecx, 16
shr edi, 16
rol di, 8
movzx eax, di
or eax, ecx
ret
while GCC 8.3 would generate the ideal:
mov eax, edi
bswap eax
ret
Now both generate the same optimal output.
MSVC used to generate the following with the old code:
mov eax, ecx
rol cx, 8
shr eax, 16
rol ax, 8
movzx ecx, cx
movzx eax, ax
shl ecx, 16
or eax, ecx
ret 0
Now MSVC generates a slightly different, but equally optimal, result
compared to clang/GCC:
bswap ecx
mov eax, ecx
ret 0
====
In the swap64 case, for the original code, clang 8.0 would generate:
mov eax, edi
bswap eax
shl rax, 32
shr rdi, 32
bswap edi
or rax, rdi
ret
(almost there, but still missing the mark)
while, again, GCC 8.3 would generate the ideal:
mov rax, rdi
bswap rax
ret
Now clang also generates the optimal sequence for this fallback.
This is a case where MSVC unfortunately falls short. Even with the new
code, it still generates a doozy of an output:
mov r8, rcx
mov r9, rcx
mov rax, 71776119061217280
mov rdx, r8
and r9, rax
and edx, 65280
mov rax, rcx
shr rax, 16
or r9, rax
mov rax, rcx
shr r9, 16
mov rcx, 280375465082880
and rax, rcx
mov rcx, 1095216660480
or r9, rax
mov rax, r8
and rax, rcx
shr r9, 16
or r9, rax
mov rcx, r8
mov rax, r8
shr r9, 8
shl rax, 16
and ecx, 16711680
or rdx, rax
mov eax, -16777216
and rax, r8
shl rdx, 16
or rdx, rcx
shl rdx, 16
or rax, rdx
shl rax, 8
or rax, r9
ret 0
which is pretty unfortunate.
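
As a rough illustration of the in-place approach described above, a portable fallback can mask each byte and shift it straight to its mirrored position. This is a minimal sketch under that assumption, not the code from the commit itself; the names `swap32_fallback` and `swap64_fallback` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical fallback: each byte is masked out and moved directly to
// its final position, which compilers tend to recognize as a byte swap.
constexpr std::uint32_t swap32_fallback(std::uint32_t v) noexcept {
    return ((v & 0xFF000000u) >> 24) | ((v & 0x00FF0000u) >> 8) |
           ((v & 0x0000FF00u) << 8) | ((v & 0x000000FFu) << 24);
}

constexpr std::uint64_t swap64_fallback(std::uint64_t v) noexcept {
    return ((v & 0xFF00000000000000ull) >> 56) |
           ((v & 0x00FF000000000000ull) >> 40) |
           ((v & 0x0000FF0000000000ull) >> 24) |
           ((v & 0x000000FF00000000ull) >> 8) |
           ((v & 0x00000000FF000000ull) << 8) |
           ((v & 0x0000000000FF0000ull) << 24) |
           ((v & 0x000000000000FF00ull) << 40) |
           ((v & 0x00000000000000FFull) << 56);
}

// Compile-time sanity checks (C++17).
static_assert(swap32_fallback(0x11223344u) == 0x44332211u);
static_assert(swap64_fallback(0x1122334455667788ull) == 0x8877665544332211ull);
```

Whether a given compiler collapses this into a single bswap depends on the compiler and version, as the listings above show.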
|
|
Allows the compiler to warn when the result of a swap function is
ignored (which is a bug in every usage scenario). We also mark them
noexcept so that functions using them can themselves be marked noexcept
and play nicely with anything that inspects "nothrowability".
|
|
Including every OS's own built-in byte swapping functions is
undesirable, since each one adds yet another build path that has to be
kept compiling.
Given we only support clang, GCC, and MSVC for the time being, we can
use their built-in functions directly instead of going through the
OS's API functions.
This shrinks the overall code down to just:
    if (msvc)
        use msvc's functions
    else if (clang or gcc)
        use clang/gcc's builtins
    else
        use the slow path
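
The selection above could look roughly like the following for the 32-bit case. This is a sketch rather than the commit's actual code. `swap32_dispatch` is a hypothetical name, while `_byteswap_ulong` and `__builtin_bswap32` are the existing MSVC and clang/GCC intrinsics.

```cpp
#include <cstdint>
#if defined(_MSC_VER)
#include <cstdlib>  // declares _byteswap_ulong on MSVC
#endif

[[nodiscard]] inline std::uint32_t swap32_dispatch(std::uint32_t v) noexcept {
#if defined(_MSC_VER)
    // MSVC's intrinsic
    return _byteswap_ulong(v);
#elif defined(__clang__) || defined(__GNUC__)
    // clang/GCC builtin
    return __builtin_bswap32(v);
#else
    // Slow path: plain shifts and masks
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
#endif
}
```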
|
|
We don't plan to support host 32-bit ARM execution environments, so this
is essentially dead code.
|
|
video_core/texures/texture: Remove unnecessary includes
|
|
file_sys: Provide generic interface for accessing game data
|
|
Correct XMAD mode, psl and high_b on different encodings.
|
|
Port citra-emu/citra#4437: "citra-qt: Make hotkeys configurable via the GUI (Attempt 2)"
|
|
yuzu/loading_screen: Resolve runtime Qt string formatting warnings
|
|
gl_backend: Align Pixel Storage
|
|
Correct LOP_IMM encoding
|
|
We need to ensure dynarmic gets a valid pointer if the page table is
resized, since a resize invalidates the relevant pointers. The page
table can be resized depending on what kind of address space is
specified within the NPDM metadata (if it's present).
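
The hazard can be sketched with a plain `std::vector`: resizing may reallocate, so any raw pointer handed out earlier has to be refreshed. `ExamplePageTable` and `ExampleJitConfig` below are hypothetical stand-ins, not the emulator's or dynarmic's real types.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical page table: one host pointer per guest page.
struct ExamplePageTable {
    std::vector<std::uint8_t*> pointers;

    void Resize(std::size_t address_width_bits, std::size_t page_bits) {
        // May reallocate, invalidating previous results of pointers.data().
        pointers.assign(std::size_t{1} << (address_width_bits - page_bits), nullptr);
    }
};

// Hypothetical JIT configuration that holds a raw pointer into the table.
struct ExampleJitConfig {
    std::uint8_t** page_table = nullptr;
};

// Resize first, then hand the JIT a fresh pointer in the same step.
inline void ResizeAndRebind(ExamplePageTable& table, ExampleJitConfig& config,
                            std::size_t address_width_bits) {
    constexpr std::size_t page_bits = 12;  // assumed 4 KiB pages
    table.Resize(address_width_bits, page_bits);
    config.page_table = table.pointers.data();
}
```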
|
|
In our error console, when loading a game, the strings:
QString::arg: Argument missing: "Loading...", 0
QString::arg: Argument missing: "Launching...", 0
would occasionally pop up while the loading screen was running. This
happened because all of the strings were assumed to contain format
specifiers, but only two of the four actually do.
This change only applies the arguments to the strings that contain
format specifiers, which avoids these warnings.
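
The idea can be sketched without Qt by tracking, per string, whether it takes arguments at all. Everything here is illustrative: `ReplaceMarker` is a simplified stand-in for `QString::arg`, and `ExampleStage` and `FormatStage` are hypothetical names.

```cpp
#include <string>

// Simplified stand-in for QString::arg: replace the first occurrence of a marker.
inline std::string ReplaceMarker(std::string text, const std::string& marker,
                                 const std::string& value) {
    const auto pos = text.find(marker);
    if (pos != std::string::npos) {
        text.replace(pos, marker.size(), value);
    }
    return text;
}

// Hypothetical stage description: the text plus whether it has format specifiers.
struct ExampleStage {
    std::string text;
    bool has_progress;
};

// Only apply arguments to strings that actually contain specifiers.
inline std::string FormatStage(const ExampleStage& stage, int value, int total) {
    if (!stage.has_progress) {
        return stage.text;
    }
    auto out = ReplaceMarker(stage.text, "%1", std::to_string(value));
    return ReplaceMarker(out, "%2", std::to_string(total));
}
```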
|
|
This commit makes sure GL reads use the correct pack size for the
respective texture buffer.
|
|
shader_cache: Permit a Null Shader in case of a bad host_ptr.
|
|
maxwell_3d: Reduce severity of ProcessSyncPoint
|
|
shader_ir: Implement AOFFI for TEX and TLD4
|
|
core/memory: Minor simplifications to page table management
|
|
gl_state: Rework to enable individual applies
|
|
gl_shader_disk_cache: Use Zstandard for compression
|
|
kernel/{server_port, server_session}: Return pairs instead of tuples from pair creation functions
|
|
core/memory: Remove unused enum constants
|
|
memory_manager: Improved implementation of read/write/copy block.
|
|
These are holdovers from Citra and can be removed.
|
|
Now that nothing actually touches the internal page table aside from the
memory subsystem itself, we can remove the accessor to it.
|
|
Given the page table will always be guaranteed to be that of whatever
the current process is, we no longer need to keep this around.
|
|
Centralizes the page table switching to one spot, rather than making
calling code deal with it everywhere.
|
|
Keeps the return type consistent with the function name. While we're at
it, we can also reduce the amount of boilerplate involved with handling
these by using structured bindings.
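
A minimal sketch of the pattern, assuming C++17: a function whose name says "pair" returns a `std::pair`, and the caller unpacks it with structured bindings. The types and functions below are hypothetical placeholders, not the kernel's real classes.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical placeholder types.
struct ExampleServerPort { std::string name; };
struct ExampleClientPort { std::string name; };

using ExamplePortPair =
    std::pair<std::shared_ptr<ExampleServerPort>, std::shared_ptr<ExampleClientPort>>;

// Returns a pair, matching what the function name describes.
inline ExamplePortPair CreateExamplePortPair(const std::string& name) {
    auto server = std::make_shared<ExampleServerPort>(ExampleServerPort{name + "_Server"});
    auto client = std::make_shared<ExampleClientPort>(ExampleClientPort{name + "_Client"});
    return {std::move(server), std::move(client)};
}

// Structured bindings replace std::get<N>/std::tie boilerplate at the call site.
inline std::string DescribeExamplePorts(const std::string& name) {
    auto [server, client] = CreateExamplePortPair(name);
    return server->name + "/" + client->name;
}
```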
|
|
Returns the same type that the function name describes.
|
|
Avoids dragging in a direct dependency in a header.
|
|
Nothing in this header relies on common_funcs or the memory manager.
This gets rid of reliance on indirect inclusions in the OpenGL caches.
|
|
Implement SyncPoint Register in the GPU.
|
|
kernel/server_session: Provide a GetName() override
|
|
common/multi_level_queue: Silence truncation warnings
|
|
Port citra-emu/citra#4651: "gdbstub: Fix some bugs in IsMemoryBreak() and ServeBreak. Add workaround to let watchpoints break into GDB."
|
|
video_core/engines: Remove unnecessary inclusions where applicable
|
|
- Fixes graphical issues with Chocobo's Mystery Dungeon EVERY BUDDY!
- Fixes a crash with Mario Tennis Aces
|
|
video_core/memory_manager: Mark a few member functions with the const qualifier
|
|
file_sys/fsmitm_romfsbuild: Utilize a string_view in romfs_calc_path_hash
|
|
core: Add missing override specifiers where applicable
|
|
video_core/gpu_thread: Silence truncation warning in ThreadManager's constructor
|
|
file_sys/nca_metadata: Remove unnecessary comparison operators for TitleType
|
|
service/fsp_srv: Update SaveDataInfo and SaveDataDescriptor structs
|
|
gl_shader_decompiler: Return early when an operation is invalid
|
|
file_sys/program_metadata: Remove obsolete TODOs
|
|
gl_shader_decompiler: Rename GenerateTemporal() to GenerateTemporary()
|