Age | Commit message | Author |
|
Due to BindBufferRangeNV limitations and poor-quality code emission on
our side, assembly shaders are currently slower than GLSL. Their build
time and feature advantages are still relevant, but they are outweighed
by their worse runtime performance.
|
|
Remove unused interop code from the OpenGL backend.
|
|
Sacrifice runtime performance to avoid generating kernel exceptions on
Windows due to our abusive aliasing of interop buffer objects.
|
|
Detect when a memory region has been joined several times and increase
the size of the created buffer in those instances. The buffer is assumed
to be a "stream buffer"; increasing its size should stop us from
constantly recreating it and fragmenting memory.
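
The heuristic described above could look roughly like the following. This is a minimal sketch, not the actual implementation: the threshold, the padding amount, and the names `JoinedRegion` and `AllocationSize` are all hypothetical, since the commit message does not state them.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical tuning values; the commit message does not give the real ones.
constexpr std::uint32_t kStreamJoinThreshold = 4;
constexpr std::size_t kStreamPadding = 256 * 1024;

// Hypothetical description of a region produced by merging overlapping buffers.
struct JoinedRegion {
    std::size_t size_bytes;   // bytes needed to cover every merged region
    std::uint32_t join_count; // how many times this region has been merged
};

// Returns the size to allocate for the buffer backing `region`.
// Regions that keep getting joined are treated as stream buffers and
// over-allocated so that the next join is less likely to recreate them.
constexpr std::size_t AllocationSize(const JoinedRegion& region) {
    if (region.join_count < kStreamJoinThreshold) {
        return region.size_bytes;
    }
    return region.size_bytes + kStreamPadding;
}
```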
|
|
Allow adding functionality to each function without making CreateBuffer
more complex.
|
|
Ports from OpenGL the optimization to skip small 3D uniform buffer
uploads. This will take advantage of the previously introduced stream
buffer.
Fixes instances where the staging buffer offset was being ignored.
|
|
This uses a ring buffer similar to OpenGL's stream buffer for small
uploads. This stops us from allocating several small buffers, reducing
memory fragmentation and improving cache locality.
It uses dedicated allocations when possible.
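
As an illustration of the ring buffer idea, here is a minimal sketch of an offset allocator for small uploads. The class name `StreamRing` and its interface are assumptions made for this example. It leaves out the GPU synchronization that a real stream buffer needs before reusing a wrapped region.

```cpp
#include <cstddef>
#include <optional>

// Hypothetical ring allocator: hands out offsets inside one large buffer
// instead of creating a separate small buffer per upload.
class StreamRing {
public:
    explicit StreamRing(std::size_t capacity) : capacity_{capacity} {}

    // Reserves `size` bytes aligned to `alignment` (must be a power of two).
    // Returns the offset of the reservation, or std::nullopt if the request
    // can never fit in the ring.
    std::optional<std::size_t> Reserve(std::size_t size, std::size_t alignment) {
        if (size > capacity_) {
            return std::nullopt;
        }
        std::size_t offset = (cursor_ + alignment - 1) & ~(alignment - 1);
        if (offset + size > capacity_) {
            // Wrap around. A real implementation would first wait until the
            // GPU has finished reading the region being reused.
            offset = 0;
        }
        cursor_ = offset + size;
        return offset;
    }

private:
    std::size_t capacity_;
    std::size_t cursor_ = 0;
};
```

Small uploads are copied at the returned offset and bound with that offset, so consecutive uploads end up next to each other in the same allocation.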
|
|
Fix regression on Pascal on Animal Crossing: New Horizons, fixing a
validation error.
|
|
Reimplement the buffer cache using cached bindings and page level
granularity for modification tracking. This also drops the usage of
shared pointers and virtual functions from the cache.
- Bindings are cached, allowing work to be skipped when the game changes
few bits between draws.
- OpenGL Assembly shaders no longer copy when a region has been modified
from the GPU to emulate constant buffers; instead, GL_EXT_memory_object
is used to alias sub-buffers within the same allocation.
- OpenGL Assembly shaders stream constant buffer data using
glProgramBufferParametersIuivNV, from NV_parameter_buffer_object. In
theory this should save one hash table resolve inside the driver
compared to glBufferSubData.
- A new OpenGL stream buffer is implemented based on fences for drivers
other than Nvidia's proprietary one, due to their low performance on
partial glBufferSubData calls synchronized with 3D rendering (which
some games use a lot).
- Most optimizations are shared between APIs now, allowing Vulkan to
cache more bindings than before, skipping unnecessary work.
This commit adds the necessary infrastructure to use Vulkan objects from
OpenGL. Overall, it improves performance and fixes some bugs present in
the old cache. There are still some edge cases hit by some games that
harm performance on some vendors; these are planned to be fixed in later
commits.
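
To illustrate what page-level granularity for modification tracking means, here is a minimal sketch. The 4 KiB page size and the `PageTracker` class are assumptions made for this example. The actual cache tracks more state than a single flag per page.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed page size of 4 KiB; the commit only says "page level granularity".
constexpr std::uint64_t kPageBits = 12;
constexpr std::uint64_t kPageSize = std::uint64_t{1} << kPageBits;

// Hypothetical tracker keeping one "modified" flag per page of a buffer.
class PageTracker {
public:
    explicit PageTracker(std::uint64_t size_bytes)
        : modified_((size_bytes + kPageSize - 1) >> kPageBits, false) {}

    // Marks every page touched by [offset, offset + size) as modified.
    void MarkModified(std::uint64_t offset, std::uint64_t size) {
        SetRange(offset, size, true);
    }

    // Clears the flag, e.g. after the range has been uploaded.
    void ClearModified(std::uint64_t offset, std::uint64_t size) {
        SetRange(offset, size, false);
    }

    // Returns true if any page touched by the range is modified.
    bool IsModified(std::uint64_t offset, std::uint64_t size) const {
        if (size == 0) {
            return false;
        }
        const std::uint64_t last = (offset + size - 1) >> kPageBits;
        for (std::uint64_t page = offset >> kPageBits;
             page <= last && page < modified_.size(); ++page) {
            if (modified_[page]) {
                return true;
            }
        }
        return false;
    }

private:
    void SetRange(std::uint64_t offset, std::uint64_t size, bool value) {
        if (size == 0) {
            return;
        }
        const std::uint64_t last = (offset + size - 1) >> kPageBits;
        for (std::uint64_t page = offset >> kPageBits;
             page <= last && page < modified_.size(); ++page) {
            modified_[page] = value;
        }
    }

    std::vector<bool> modified_;
};
```

With this layout, a write that straddles a page boundary marks both pages, and an upload only needs to visit the pages whose flag is set.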
|
|
Work around an issue on Nvidia where creating a Vulkan instance from an
active OpenGL thread disables threaded optimization on the driver.
This optimization is important to have good performance on Nvidia
OpenGL.
|
|
Instead of using a two step initialization to report errors, initialize
the GPU renderer and rasterizer on the constructor and report errors
through std::runtime_error.
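
The pattern described above can be sketched as follows. `DeviceInfo`, `Renderer`, and `TryCreateRenderer` are hypothetical names used only for this example and do not mirror the real classes.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for whatever the renderer needs to start up.
struct DeviceInfo {
    bool supported;
};

class Renderer {
public:
    // All initialization happens here. There is no separate Init() step
    // that callers could forget to call or whose result they could ignore.
    explicit Renderer(const DeviceInfo& device) {
        if (!device.supported) {
            throw std::runtime_error{"Device does not support the renderer"};
        }
    }
};

// The caller decides how to surface the failure.
std::unique_ptr<Renderer> TryCreateRenderer(const DeviceInfo& device, std::string& error) {
    try {
        return std::make_unique<Renderer>(device);
    } catch (const std::runtime_error& exception) {
        error = exception.what();
        return nullptr;
    }
}
```

A fully constructed `Renderer` is always usable, so a half-initialized object never escapes to the rest of the program.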
|
|
Ensure the behavior of the previous commit in tests.
|
|
Some games frequently write to memory pages currently used by the GPU,
causing rendering issues (e.g. flashing geometry and shadows in Link's
Awakening). To work around this issue, guest CPU writes are delayed until
the command buffer finishes processing, but the pages are updated
immediately.
The overall behavior is:
- CPU writes are cached until they are flushed. They update the page
state but don't change the modification state. Cached writes stop
pages from being flushed, in case games have meaningful data in them.
- Command processing writes (e.g. push constants) update the page state
and are marked to the command processor as dirty. They don't remove
the state of cached writes.
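
One possible reading of the cached-write behavior is sketched below. It is a simplified sketch under stated assumptions: `CachedWriteTracker`, its per-page flags, and the method names are hypothetical, and the command processor path is left out.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-page tracker for delayed guest CPU writes.
class CachedWriteTracker {
public:
    explicit CachedWriteTracker(std::size_t num_pages) : pages_(num_pages) {}

    // A guest CPU write arrives while the GPU may still be using the page:
    // remember it, but leave the modification state untouched for now.
    void CachedCpuWrite(std::size_t page) {
        pages_[page].cached_write = true;
    }

    // Called once the command buffer has finished processing: cached
    // writes are promoted to real modifications.
    void FlushCachedWrites() {
        for (Page& page : pages_) {
            if (page.cached_write) {
                page.cpu_modified = true;
                page.cached_write = false;
            }
        }
    }

    bool IsCpuModified(std::size_t page) const {
        return pages_[page].cpu_modified;
    }

    // Pages with a pending cached write are not flushed back to guest
    // memory, since the game may have put meaningful data there.
    bool CanFlushToGuest(std::size_t page) const {
        return !pages_[page].cached_write;
    }

private:
    struct Page {
        bool cpu_modified = false;
        bool cached_write = false;
    };
    std::vector<Page> pages_;
};
```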
|
|
kernel: More accurately utilize resource_limit
|
|
This implements KScopedReservation, allowing resource limit reservations to be more HW accurate and to be released upon failure without requiring too many conditionals.
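
The general idea of a scoped reservation can be sketched with RAII as below. `ResourceLimit` and `ScopedReservation` are simplified, hypothetical stand-ins. The real KScopedReservation interface may differ.

```cpp
#include <cstdint>

// Hypothetical, single-resource stand-in for a kernel resource limit.
class ResourceLimit {
public:
    explicit ResourceLimit(std::int64_t limit) : limit_{limit} {}

    bool Reserve(std::int64_t amount) {
        if (used_ + amount > limit_) {
            return false;
        }
        used_ += amount;
        return true;
    }

    void Release(std::int64_t amount) {
        used_ -= amount;
    }

    std::int64_t Used() const {
        return used_;
    }

private:
    std::int64_t limit_;
    std::int64_t used_ = 0;
};

// Reserves on construction and releases on destruction unless committed,
// so every early-return error path gives the reservation back automatically.
class ScopedReservation {
public:
    ScopedReservation(ResourceLimit& limit, std::int64_t amount)
        : limit_(limit), amount_(amount), succeeded_(limit.Reserve(amount)) {}

    ~ScopedReservation() {
        if (succeeded_ && !committed_) {
            limit_.Release(amount_);
        }
    }

    ScopedReservation(const ScopedReservation&) = delete;
    ScopedReservation& operator=(const ScopedReservation&) = delete;

    bool Succeeded() const {
        return succeeded_;
    }

    // Keeps the reservation alive after the scope ends.
    void Commit() {
        committed_ = true;
    }

private:
    ResourceLimit& limit_;
    std::int64_t amount_;
    bool succeeded_;
    bool committed_ = false;
};
```

A function that creates a kernel object would construct the reservation first, return early on any failure, and call `Commit()` only once the object exists.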
|
|
* kernel: Unify result codes
Drop the usage of the ERR_NAME convention in the kernel in favor of ResultName. Removed the separation between svc_results.h & errors.h, as we mostly include both anyway.
* oops
* rename errors to svc_results
|
|
core: Silence various warnings on Clang 12
|
|
input_common: Add mouse panning
|
|
software_keyboard: Implement Finalize request command
|
|
configure_input_player_widget: Minor cleanup
|
|
common: Add -fsized-deallocation as a Clang flag
|
|
core: Add -fsized-deallocation as a Clang flag
|
|
configure_input_player_widget: Silence unused variable warnings
|
|
Prevents clang 11 from throwing an error since these variables are
unused.
|
|
Prevents an operator delete error when compiling with Clang 11.
|
|
Prevents an operator delete error when compiling with Clang 11.
|
|
udp: Silence warnings on Clang 12
|
|
video_core: Remove unused functions and variables
|
|
Clang 12 currently falls over in the face of this.
|
|
Prevents warnings on clang 12. This path is reachable on other
variations of the build that disable the unreachable macro.
|
|
We were previously using the name of the object being initialized within
its own initializer, which results in uninitialized data being read.
|
|
Simply mark them as unused for now.
|
|
Prevents compilation errors on clang 12 due to incomplete types within a
unique_ptr member.
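
The commit message does not say how the error was resolved. One common way to avoid this class of error is to declare the destructor in the class and define it only where the pointed-to type is complete. The sketch below uses hypothetical names (`Widget`, `Impl`) and puts everything in one file for brevity.

```cpp
#include <memory>

class Widget {
public:
    Widget();
    ~Widget(); // declared here, defined below where Impl is complete
    int Value() const;

private:
    struct Impl; // incomplete type at this point
    std::unique_ptr<Impl> impl_;
};

// The following would normally live in the .cpp file.
struct Widget::Impl {
    int value = 42;
};

Widget::Widget() : impl_{std::make_unique<Impl>()} {}

// Defaulting the destructor here is fine because Impl is now complete,
// so std::unique_ptr can instantiate its deleter.
Widget::~Widget() = default;

int Widget::Value() const {
    return impl_->value;
}
```

If the destructor were implicitly generated in the header instead, the compiler would have to destroy `std::unique_ptr<Impl>` while `Impl` is still incomplete, which is ill-formed.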
|
|
Resolves warnings on clang 12
|
|
Silences a few warnings on clang 12.
|
|
applicable
Reduces the amount of code to read in expressions a little bit by
separating constituents out a little.
|
|
Previously a function was copying an array of 20 std::string instances
by value.
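
For illustration, passing such an array by const reference avoids the copy. The commit message does not say whether the copy happened in a parameter or a return value. This sketch assumes a parameter, and `NameTable` and `TotalLength` are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <string>

using NameTable = std::array<std::string, 20>;

// Taking the table by const reference avoids copying 20 std::string
// objects (and their heap allocations) on every call.
std::size_t TotalLength(const NameTable& names) {
    std::size_t total = 0;
    for (const std::string& name : names) {
        total += name.size();
    }
    return total;
}
```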
|
|
* Add some depth to ProJoysticks
* address comments
* clang
* address nits
* fix wrong inner_offset when offset.x was 0
|
|
cmake: Revert FFmpeg 4.3.1 update for Windows builds
|
|
The new 4.3.1 externals build seems to not be compatible with yuzu. This also fixes an oversight when renaming CMake variables.
|