Commit Graph

720 Commits

Author SHA1 Message Date
dd92cb1536 Implement support for (de)serialising VkPipelineCaches to/from storage
Significantly improves launch times in games with many shader combinations, giving a 5x speedup in some cases.
2023-02-04 23:10:45 +00:00
8b9d6f79ab Add option to enable/disable shader cache 2023-01-28 11:57:19 +00:00
e2463b7619 Adjust gpfifo WFI to only do a pipeline barrier 2023-01-20 21:07:59 +00:00
2b282ece1a Add more fine-grained buffer recreation locking 2023-01-20 21:07:59 +00:00
a8b32c3cef Cleanup helper pipeline cache code 2023-01-20 21:07:59 +00:00
1f99d63a80 Incr transition cache size 2023-01-20 21:07:59 +00:00
0a608fb4b2 Update to latest hades 2023-01-20 21:07:59 +00:00
44f6aada18 Always set blend state for all colour attachments 2023-01-20 21:07:59 +00:00
177925be93 Avoid OOB memory accesses when trying to read OOB TICs
Some games pass in invalid texture handles (0xffff) when they don't need the texture, so return the null texture in this case.
2023-01-20 21:07:59 +00:00
d8a4a2b08d Use a spinlock for GPU waiter thread 2023-01-20 21:07:59 +00:00
f1aed86177 Add a workaround for split-mapping shaders
Some games split shaders across multiple mappings and *also* miss the end header, so read a suitably large amount and hope that's enough for now.
2023-01-20 21:07:59 +00:00
704660bbeb Store render nodes in a linearly allocated linked list
This is much faster in reldebug builds than boost::stable_vector while still providing iterator stability
2023-01-20 21:07:59 +00:00
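The idea behind 704660bbeb can be sketched as follows. This is only an illustration under stated assumptions: `LinearArena` and `NodeList` are hypothetical names, not the project's types, and the sketch is limited to trivially destructible values with ordinary alignment. Nodes are carved out of fixed-size chunks and never moved, so pointers to earlier nodes stay valid as the list grows.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical bump allocator: hands out memory from fixed-size chunks and
// never moves or frees individual allocations.
class LinearArena {
  public:
    explicit LinearArena(std::size_t chunkSize = 4096) : chunkSize(chunkSize) {}

    // Assumes align is a power of two no larger than the default new alignment.
    void *Allocate(std::size_t size, std::size_t align) {
        if (size > chunkSize)
            throw std::bad_alloc{};
        std::size_t aligned = (offset + align - 1) & ~(align - 1);
        if (chunks.empty() || aligned + size > chunkSize) {
            // Start a new chunk; existing chunks (and nodes in them) stay put
            chunks.push_back(std::make_unique<std::byte[]>(chunkSize));
            aligned = 0;
        }
        offset = aligned + size;
        return chunks.back().get() + aligned;
    }

  private:
    std::size_t chunkSize;
    std::size_t offset{};
    std::vector<std::unique_ptr<std::byte[]>> chunks;
};

// Hypothetical singly linked list whose nodes live in a LinearArena.
template <typename T>
class NodeList {
    static_assert(std::is_trivially_destructible_v<T>, "this sketch never runs destructors");

  public:
    struct Node {
        T value;
        Node *next;
    };

    explicit NodeList(LinearArena &arena) : arena(arena) {}

    // Appends a value and returns a pointer that remains valid for the arena's lifetime.
    Node *Append(T value) {
        void *memory = arena.Allocate(sizeof(Node), alignof(Node));
        Node *node = new (memory) Node{std::move(value), nullptr};
        if (tail)
            tail->next = node;
        else
            head = node;
        tail = node;
        return node;
    }

    Node *Head() const { return head; }

  private:
    LinearArena &arena;
    Node *head{};
    Node *tail{};
};
```

Because nothing is freed individually, the whole list is released at once when the arena is destroyed, which suits per-submission data such as render nodes.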
326c05a5de Add guest shader replacement and dumping support 2023-01-20 21:07:59 +00:00
ea0217de47 Add TIC format: 0x78D24952 2023-01-13 18:05:22 +00:00
950438bf58 Enable VK_KHR_image_format_list during device init
`VK_KHR_image_format_list` is a requirement for `VK_KHR_imageless_framebuffer`, which we use.
2023-01-11 23:38:57 +05:30
a92c26531e Keep holes in descriptors for unsupported bindings 2023-01-08 19:30:52 +00:00
3e5992e366 Update hades 2023-01-08 19:30:52 +00:00
45bbf3bb2a Fix indirect draws with direct buffers
We need to wait on the GPFIFO manually, as with direct buffers we won't hit the traps that we usually would when accessing the indirect params.
2023-01-08 19:30:52 +00:00
68ad052cb1 Add geometry passthrough shader support for vertex layer writes 2023-01-08 19:30:52 +00:00
ec519a7d52 Return null texture on encountering unmapped textures 2023-01-08 19:30:52 +00:00
97e127153b Make shader trap mutex recursive
There are cases where we hit a shader trap within the GPU; by making the mutex recursive we avoid deadlocking on reads within the GPU.
2023-01-08 19:30:52 +00:00
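A minimal sketch of why a recursive mutex helps in 97e127153b. `ShaderTrapState` and `OnTrap` are hypothetical stand-ins, not the project's code. The point is only that a thread already holding the lock can take it again when trap handling re-enters itself.

```cpp
#include <mutex>

// Hypothetical stand-in for the shader trap state.
struct ShaderTrapState {
    std::recursive_mutex mutex;
    int trapsHandled{};

    // Invoked when a trapped region is accessed; depth simulates nested traps.
    int OnTrap(int depth) {
        std::scoped_lock lock{mutex}; // Can be re-locked by the thread that already owns it
        trapsHandled++;
        // Handling the trap may itself touch trapped memory on the same thread;
        // with a plain std::mutex this nested lock would deadlock.
        if (depth > 0)
            return OnTrap(depth - 1);
        return trapsHandled;
    }
};
```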
1a6165f74d Fix GetReadOnlyBackingSpan for non-direct buffers
This was missed in the initial implementation
2023-01-08 19:30:52 +00:00
35a46acbb1 Determine storage buffer alignment dynamically 2023-01-08 19:30:52 +00:00
28b2a7a8a1 Dynamically apply GPU turbo clocks only when GPU submissions are queued
Allows for the GPU to clock down in cases where it's idle for most of the time, while still forcing maximum clocks when we care.
2023-01-08 19:30:52 +00:00
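The behaviour described in 28b2a7a8a1 could look roughly like this. `TurboGate` and its `setTurbo` callback are hypothetical; the callback stands in for whatever driver call actually toggles the clocks.

```cpp
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical gate that enables turbo clocks only while submissions are pending.
class TurboGate {
  public:
    explicit TurboGate(std::function<void(bool)> setTurbo) : setTurbo(std::move(setTurbo)) {}

    // Called when a GPU submission is queued.
    void OnSubmit() {
        std::scoped_lock lock{mutex};
        if (pending++ == 0)
            setTurbo(true); // First outstanding submission: force maximum clocks
    }

    // Called when a GPU submission completes.
    void OnComplete() {
        std::scoped_lock lock{mutex};
        if (--pending == 0)
            setTurbo(false); // Queue drained: let the GPU clock down again
    }

  private:
    std::function<void(bool)> setTurbo;
    std::mutex mutex;
    int pending{};
};
```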
81f3ff348c Transition memory handling from memfd to anonymous shared mappings
Memfd mappings are incompatible with KGSL user memory importing on older kernels; transition to shared anon mappings to avoid this.
2023-01-08 19:30:52 +00:00
cc3c869b9f Attempt to signal the vsync event at present time if possible
Some games rely on the vsync event to schedule frames; by matching its timing with presentation we can reduce needless waiting, as the game will immediately be able to queue the next frame after presentation.
2023-01-08 19:30:52 +00:00
afef6c5123 Always populate all colour attachments
This better follows the Vulkan spec, which doesn't mention anything about writes to OOB attachments, only those marked as unused.
2023-01-08 19:30:52 +00:00
3571737392 Reset maxwell3d quick bind state before adding subpasses to executor
If a submission happens during the call to addsubpass we could end up with invalid quick bind state; move the reset to before that call to prevent this.
2023-01-08 19:30:52 +00:00
3d31ade35f Implement an alternative buffer path using direct memory importing
By importing guest memory directly onto the host GPU we can avoid many of the complexities that occur with memory tracking, as well as the heavy performance overhead in some situations. Since it's still desired to support the traditional buffer method, as it's faster in some cases and more widely supported, most of the exposed buffer methods have been split into two variants with just a small amount of shared code. While in most cases the code is simpler, one area with more complexity is handling CPU accesses that need to be sequenced, since there is no place where we can easily apply writes on the GPFIFO thread that won't also impact the buffer on the GPU. To solve this, when the GPU is actively using a buffer's contents, an interval list keeps track of any GPFIFO-written regions on the CPU, and any CPU reads to them are instead directed to a shadow of the buffer with just those writes applied. Once the GPU has finished using the buffer contents, the shadow can be removed, as all writes will have been done by the GPU.

The main caveat of this is that it requires tying host sync to guest sync, which can reduce performance in games that double buffer command buffers, as it prevents us from fully saturating the CPU with the GPFIFO thread.
2023-01-08 19:30:52 +00:00
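The shadow mechanism described in 3d31ade35f can be illustrated with a small sketch. Everything here is an assumption made for illustration: `DirectBufferSketch` and its methods are hypothetical, the "interval list" is a plain vector without merging, and the guest memory is modelled as a `std::vector`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical model of a directly imported buffer with a CPU-side shadow.
class DirectBufferSketch {
  public:
    explicit DirectBufferSketch(const std::vector<std::uint8_t> &guest) : guest(guest) {}

    // GPFIFO-thread write while the GPU is using the buffer: the real write is
    // assumed to be queued for the GPU, so only the shadow is updated here.
    void SequencedWrite(std::size_t offset, const std::vector<std::uint8_t> &data) {
        if (offset > guest.size() || data.size() > guest.size() - offset)
            throw std::out_of_range{"write outside buffer"};
        shadow.resize(guest.size());
        std::copy(data.begin(), data.end(), shadow.begin() + static_cast<std::ptrdiff_t>(offset));
        written.push_back({offset, offset + data.size()});
    }

    // Sequenced CPU read: served from the shadow if the byte was GPFIFO-written,
    // otherwise straight from guest memory.
    std::uint8_t SequencedRead(std::size_t offset) const {
        if (offset >= guest.size())
            throw std::out_of_range{"read outside buffer"};
        for (const auto &interval : written)
            if (offset >= interval.start && offset < interval.end)
                return shadow[offset];
        return guest[offset];
    }

    // Once the GPU is done, its writes have landed in guest memory, so drop the shadow.
    void OnGpuFinished() {
        written.clear();
        shadow.clear();
    }

  private:
    struct Interval {
        std::size_t start, end; // Half-open range [start, end)
    };

    const std::vector<std::uint8_t> &guest;
    std::vector<std::uint8_t> shadow;
    std::vector<Interval> written;
};
```

A real implementation would likely merge overlapping intervals and avoid a full-size shadow; the sketch keeps both naive to stay short.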
b3f7e990cc Allow for tying guest GPU sync operations to host GPU sync
This is necessary for the upcoming direct buffer support, as in order to use guest buffers directly without trapping we need to recreate any guest GPU sync on the host GPU. This avoids the guest thinking work is done that isn't and overwriting in-use buffer contents.
2023-01-08 19:30:52 +00:00
89c6fab1cb Implement a way to check if the command record thread is idle
Useful for debugging and testing
2023-01-08 19:30:52 +00:00
c67f27e914 Add a setting to control the maximum number of accumulated GPU cmds
This helps to keep the GPU fed when processing large command buffers which don't have any syncpoints to force a flush in between.
2023-01-08 19:30:52 +00:00
3ecaedd71e Add adrenotools direct mapping support 2023-01-08 19:30:52 +00:00
bab659587f Use e1 sample count for blits 2022-12-22 18:05:45 +00:00
516ece6b04 Calculate renderarea from attachment min size 2022-12-22 18:05:45 +00:00
4a3cd69257 Populate graphics pipeline manager from cache at launch-time 2022-12-22 18:05:45 +00:00
e9bcdd06eb Introduce a pipeline cache manager for simple read/write cache accesses
All writes are done async into a staging file, which is then merged into the main pipeline cache file at the time of the next launch. Upon encountering file corruption the cache can be trimmed up to the last-known-good entry to avoid any excessive loss of data from just one error.
2022-12-22 18:05:45 +00:00
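The merge-and-trim behaviour described in e9bcdd06eb can be sketched over in-memory byte vectors. The entry layout used here (`[u32 size][u32 checksum][payload]`), the FNV-1a checksum, and all function names are assumptions for illustration only; the real on-disk format is documented in the project's code, and the asynchronous writing is omitted.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Assumed entry layout for this sketch only: [u32 size][u32 checksum][payload].
struct EntryHeader {
    std::uint32_t size;
    std::uint32_t checksum;
};

// FNV-1a, standing in for whatever integrity check is really used.
std::uint32_t Checksum(const std::uint8_t *data, std::size_t size) {
    std::uint32_t hash = 2166136261u;
    for (std::size_t i = 0; i < size; i++)
        hash = (hash ^ data[i]) * 16777619u;
    return hash;
}

// Appends one entry (header followed by payload) to a cache file image.
void AppendEntry(Bytes &file, const Bytes &payload) {
    EntryHeader header{static_cast<std::uint32_t>(payload.size()),
                       Checksum(payload.data(), payload.size())};
    const auto *raw = reinterpret_cast<const std::uint8_t *>(&header);
    file.insert(file.end(), raw, raw + sizeof(header));
    file.insert(file.end(), payload.begin(), payload.end());
}

// Returns the byte length of the longest prefix made up of intact entries.
std::size_t ValidPrefixLength(const Bytes &file) {
    std::size_t offset = 0;
    while (file.size() - offset >= sizeof(EntryHeader)) {
        EntryHeader header;
        std::memcpy(&header, file.data() + offset, sizeof(header));
        std::size_t payloadStart = offset + sizeof(header);
        if (header.size > file.size() - payloadStart)
            break; // Truncated entry
        if (Checksum(file.data() + payloadStart, header.size) != header.checksum)
            break; // Corrupted entry
        offset = payloadStart + header.size;
    }
    return offset;
}

// At launch: trim both files to their last-known-good entry, then fold staging into main.
void MergeStaging(Bytes &mainFile, Bytes &staging) {
    mainFile.resize(ValidPrefixLength(mainFile));
    staging.resize(ValidPrefixLength(staging));
    mainFile.insert(mainFile.end(), staging.begin(), staging.end());
    staging.clear();
}
```

Trimming at the first bad entry means a single corrupted write costs only the entries after it, not the whole cache.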
06bf1b38af Introduce a pipeline state accessor that reads from a bundle 2022-12-22 18:05:45 +00:00
7dd3a1db0f Avoid InterconnectContext use in graphics PipelineManager
We will soon move to a global pipeline manager instance, so it won't be possible to use InterconnectContext at pipeline-creation time anymore.
2022-12-22 18:05:45 +00:00
ffe7263848 Add quirk for 615 drivers with broken multithreaded compilation 2022-12-22 18:05:45 +00:00
755f7c75af Add pipeline (de)serialisation support to bundle
See comments in code for details on the on-disk format.
2022-12-22 18:05:45 +00:00
937eff392f Switch execution-numbers to be globally unique tags
This is required for making pipelines usable across channels without introducing caching bugs.
2022-12-22 18:05:45 +00:00
072b8193a1 Implement thread pool based async pipeline compilation with futures
By distributing the load of shader compiling onto multiple threads and then only waiting for completion when absolutely necessary, we can reduce compilation stutters significantly.
2022-12-22 18:05:45 +00:00
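The approach in 072b8193a1 can be sketched with the standard library alone. `CompilePool` and `Submit` are hypothetical names, and the jobs here are placeholders rather than real shader compilation. The caller gets a `std::future` back immediately and blocks on it only at the point where the result is actually required.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal thread pool; the project's own pool will differ.
class CompilePool {
  public:
    explicit CompilePool(unsigned threadCount) {
        for (unsigned i = 0; i < threadCount; i++)
            workers.emplace_back([this] { Run(); });
    }

    ~CompilePool() {
        {
            std::scoped_lock lock{mutex};
            stopping = true;
        }
        wake.notify_all();
        for (auto &worker : workers)
            worker.join(); // Workers drain any remaining jobs before exiting
    }

    // Queues a job and returns a future for its result.
    template <typename F>
    auto Submit(F job) -> std::future<decltype(job())> {
        using Result = decltype(job());
        auto task = std::make_shared<std::packaged_task<Result()>>(std::move(job));
        auto future = task->get_future();
        {
            std::scoped_lock lock{mutex};
            jobs.push([task] { (*task)(); });
        }
        wake.notify_one();
        return future;
    }

  private:
    void Run() {
        while (true) {
            std::function<void()> job;
            {
                std::unique_lock lock{mutex};
                wake.wait(lock, [this] { return stopping || !jobs.empty(); });
                if (jobs.empty())
                    return; // Stopping and nothing left to do
                job = std::move(jobs.front());
                jobs.pop();
            }
            job(); // Run outside the lock so other workers can pick up jobs
        }
    }

    std::mutex mutex;
    std::condition_variable wake;
    std::queue<std::function<void()>> jobs;
    bool stopping{};
    std::vector<std::thread> workers;
};
```

A caller would submit every pipeline it knows about up front, keep the futures alongside the pipeline objects, and call `get()` only when a draw actually needs the compiled result.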
186549748d Implement HelperShader-local pipeline cache and use dynamic state
Avoids the heavy overhead of the VK pipeline cache when we really only have a few bits of non-dynamic state
2022-12-22 18:05:45 +00:00
9115b8cae8 Properly hash dynamic states in pipeline cache 2022-12-22 18:05:45 +00:00
7c4b4765bf Reduce thresholds for slot increase and buffer/texture fast readback 2022-12-22 18:05:45 +00:00
ce428af2e6 Use attachment formats rather than views in VK pipeline cache 2022-12-22 18:05:45 +00:00
e849264028 Abstract out pipeline-compile-time GPU state accesses
Introduces the base abstractions that will be used for pipeline caching, with a 'PipelineStateBundle' that can be (de)serialised to/from disk and an abstract accessor class to allow switching between creating disk-cached pipelines and fresh ones.
2022-12-22 18:05:45 +00:00
2e96248fb6 Track RT format info in PackedPipelineState and move VK conv code there
When caching pipelines we can't cache whole images, only their formats, so refactor PackedPipelineState so that it can be used for pipeline creation, as opposed to passing in a list of attachments.
2022-12-22 18:05:45 +00:00
bc7e1eb380 Split-out hash from ShaderBinary struct
This isn't necessary for pipeline creation and creates some difficulty with pipeline caching.
2022-12-22 18:05:45 +00:00