Commit Graph

720 Commits

Author SHA1 Message Date
ece2785582 Introduce ShaderManager with Proxy Shader Compiler Logger/Settings
This class will be entirely responsible for any interop with the shader compiler, it is also responsible for caching and compilation of shaders in itself.
2022-04-14 14:14:52 +05:30
89e9a41a86 Implement VkPipelineViewportStateCreateInfo
"Viewport Transforms" and "Viewport Scissors"  were combined into one section to reflect their state in Vulkan correctly like all other sections.
2022-04-14 14:14:52 +05:30
38119e21d4 Implement Vulkan-Supported Maxwell3D Primitive Topologies
Any primitive topologies that are directly supported by Vulkan were implemented but the rest were not and will be implemented with conversions as they are used by applications, they are:
* LineLoop
* QuadList
* QuadStrip
* Polygon
2022-04-14 14:14:52 +05:30
138f884159 Implement Maxwell3D Vertex Attributes
Translates all Maxwell3D vertex attributes to Vulkan with the exception of `isConstant` which causes the vertex attribute to return a constant value `(0,0,0,X)` which was trivial in OpenGL with `glDisableVertexAttribArray` and `glVertexAttrib4(..., 0, 0, 0, 1)` but we don't have access to this in Vulkan and might need to depend on undefined behavior or manually emulate it in a shader. This'll be revisited in the future after checking host GPU behavior.
2022-04-14 14:14:52 +05:30
c2a6da6431 Implement Maxwell3D Vertex Buffer Limit
Sets the end of VBOs based on the `vertexArrayLimits` register array which provides an IOVA to the end of the VBO.
2022-04-14 14:14:52 +05:30
612f324e78 Implement Maxwell3D Vertex Buffer Instance Rate
Implements the `isVertexInputRatePerInstance` register array which controls if the vertex input rate is either per-vertex or per-instance. This works in conjunction with the vertex attribute divisor for per-instance attribute repetition of attributes.
2022-04-14 14:14:52 +05:30
841ee9fc15 Check for vertexAttributeInstanceRateZeroDivisor feature before usage
Check for `vertexAttributeInstanceRateZeroDivisor` in `VkPhysicalDeviceVertexAttributeDivisorFeaturesEXT` when the Maxwell3D register corresponding to the vertex attribute divisor is set to 0. If it isn't then it logs a warning and sets the value anyway which could result in UB since the only alternative is an exception that stops emulation which might not be optimal if the game mostly works fine without this, we will add a user-facing warning when we intentionally allow UB like this in the future.
2022-04-14 14:14:52 +05:30
c3895a8197 Support VkPhysicalDeviceFeatures2 Extensions
Implement the infrastructure to depend on `VkPhysicalDeviceFeatures2` extended feature structures which can be utilized to retrieve the specifics of features from extensions. It is implemented in the form of `vk::StructureChain` with `vk::PhysicalDeviceFeatures2` that can be extended with any extension feature structures.
2022-04-14 14:14:52 +05:30
ff5515d4d1 Implement Maxwell3D Vertex Buffer Bindings
This implements everything in Maxwell3D vertex buffer bindings including vertex attribute divisors which require the extension `VK_EXT_vertex_attribute_divisor` to emulate them correctly, this has been implemented in the form of of a quirk. It is dynamically enabled/disabled based on if the host GPU supports it and a warning is provided when it is used by the guest but the host GPU doesn't support it.
2022-04-14 14:14:52 +05:30
d163e4ffa6 Introduce IOVA union for flipped words of IOVAs in GraphicsContext
The Maxwell3D `Address` class follows the big-endian register ordering for addresses while on the host we consume them in little-endian, the `IOVA` class is the host equivalent to the `Address` class with implicitly flipped 32-bit register ordering. It shares implicit decomposition semantics from `Address` due to similar requirements with a minor difference of being returned by reference rather than value as we want to have value setting semantics with implicit decomposition while we don't for `Address`.
2022-04-14 14:14:52 +05:30
48d0b41f16 Implement Maxwell3D Common/Independent Color Write Mask
Maxwell3D supports both independent and common color write masks like color blending but for common color write masks rather than having register state specifically for it, the state from RT 0 is extended to all RTs. It should be noted that color write masks are included in blending state for Vulkan while being entirely independent from each other for Maxwell, it forces us to use the `independentBlend` feature even when we are doing common blending unless the color write mask is common as well but to simplify all this logic the feature was made required as it supported by effectively all targeted devices.
2022-04-14 14:14:52 +05:30
92809f8a78 Implement Maxwell3D Independent/Common Color Blending
Maxwell3D supports independent blending which has different blending per-RT and common blending which has the same blending for all RTs. There is a register determining which mode to utilize and we simply have two arrays of `VkPipelineColorBlendAttachmentState` for the RTs that we toggle between to make the transition between them extremely cheap.
2022-04-14 14:14:52 +05:30
cd737fbdd8 Add Required VkDevice Features
A prior commit added the ability to utilize features with quirks but this implements the ability to require a feature be present on the host or an exception will be thrown. It allows us to make useful assumptions that result in a better architecture in certain cases.
2022-04-14 14:14:52 +05:30
081d3277c1 Enable Quirk + Required VkDevice Extensions
Implements the infrastructure required to enable optional extensions set in `QuirkManager` alongside the required extensions in the `GPU` class. All extensions should be correctly resolved now and according to what the device supports.
2022-04-14 14:14:52 +05:30
51659e1329 Enable VkDevice Features Selectively
We selectively enable GPU features that we require as enabling all of them might result in extra driver overhead in certain circumstances. Setting them is handled by `QuirkManager` with the new `FEAT_SET` function that ties a quirk with a feature.
2022-04-14 14:14:52 +05:30
ec378814aa Stub Maxwell3D Alpha Testing
We stub alpha testing as it doesn't exist in Vulkan and few titles use it, it can be emulated in the future using a shader patch with manually discarding fragments failing the alpha test function but this'll be added in later as it isn't high priority at the moment and has associated overhead with it so other options might be explored at the time.
2022-04-14 14:14:52 +05:30
83ec99fa48 Print GPU Quirks At Startup
It is essential to know what quirks a certain GPU may have to debug an issue, these are now printed at startup into the log alongside all other GPU information. A new `QuirkManager::Summary` function was implemented to provide this functionality.
2022-04-14 14:14:52 +05:30
01eb16e59a Implement Maxwell3D Color Logic Operations
Implements a basic part of Vulkan blending state which are color logic operations applied on the framebuffer after running the fragment shader. It is an optional feature in Vulkan and not supported on any mobile GPU vendor aside from ImgTec/NVIDIA by default.
2022-04-14 14:14:52 +05:30
ea87b8878c Implement B8G8R8A8{Unorm/Srgb} RT Format
Needed by Maxwell3D Register Initialization
2022-04-14 14:14:52 +05:30
69e7d8b574 Remove Vulkan to Skyline Format Conversion Function
The function `GetFormat` was seemingly no longer required due to us never converting from a Vulkan format to a Skyline format, most conversions only went from Skyline to Vulkan and were generally lossy due to certain formats being missing in Vulkan and approximated using channel swizzles. As a result of this, it was pointless to maintain and has now been removed.
2022-04-14 14:14:52 +05:30
5ed26fef23 Implement Maxwell3D Rasterizer State
Maxwell3D registers relevant to the Vulkan Rasterizer state have been implemented aside from certain features such as per-face polygon modes that cannot be implemented due to Vulkan limitations. A quirk was utilized to dynamically support the provoking vertex being the last vertex as opposed to the first as well.
2022-04-14 14:14:52 +05:30
8ef225a37d Introduce QuirkManager for runtime GPU quirk tracking
We require a way to track certain host GPU features that are optional such as Vulkan extensions, this is what the `QuirkManager` class does as it checks for all quirks and holds them allowing other components to branch based off these quirks.
2022-04-14 14:14:52 +05:30
8803616673 Reorder GraphicsContext Members
All members are now placed at the start of sections they are relevant to rather than  at the start of the class.
2022-04-14 14:14:52 +05:30
26966287c7 Implement Maxwell3D Shader Program Registers
This commit added basic shader program registers, they simply track the address a shader is pointed to at the moment. No parsing of the shader program is done within them.
2022-04-14 14:14:52 +05:30
5cd1f01690 Refactor all logger calls 2021-11-11 16:13:24 +01:00
ef10d3d394 Use semantic wrapping where appropriate for class initialiser lists 2021-11-10 21:35:36 +05:30
79ceb2cf23 Improve Vulkan Texture Synchronization
The Vulkan Pipeline Barriers were unoptimal and incorrect to some degree prior as we purely synchronized images and not staging buffers. This has now been fixed and improved in general with more relevant synchronization.
2021-11-09 21:08:03 +05:30
9f5ab13858 Implement R16G16{Unorm/Sint/Uint} RT Formats
Utilized in "The Touryst"
2021-10-29 20:21:58 +05:30
afebf77544 Zero-Initialize GuestTexture Members
Members of `GuestTexture` were apparently not being initialized and this led to UB since they would be read as random values. Titles such as Super Mario Odyssey avoided setting `baseArrayLayer` which led to it being left at the default value which was completely random and this would lead to crashes. This commit fixes this by initializing said values correctly.
2021-10-27 22:49:45 +05:30
a0921f8261 Implement R16G16B16A16Snorm/R16G16B16A16Sint RT Formats
Utilized in "The Touryst"
2021-10-27 22:49:45 +05:30
dc3f7f1ab4 Fix Incorrect Scissor Extent
The decomposition from `texture::Dimension` to `vk::Rect2D` was somehow implicit and completely incorrect resulting in wrong conversion with undefined values. It's now been fixed by explicitly setting `vk::Rect2D::extent` to `scissor` specifically.
2021-10-26 20:08:53 +05:30
70d1b4994c Enable Wconversion and fix warnings produced 2021-10-26 11:41:24 +01:00
830a800d9e Consolidate AddAttachment Loops + Rename Renderpass -> RenderPass 2021-10-26 10:46:36 +05:30
ea2626bcc6 Address CR Comments 2021-10-26 10:46:36 +05:30
eff5711c49 Split monolithic common.h into smaller chunks
* Resolves dependency cycles in some components
* Allows for easier navigation of certain components like `span` which were especially large
* Some imports have been moved from `common.h` into their own files due to their infrequency
2021-10-26 10:46:36 +05:30
9b9bf8d300 Introduce ThreadLocal Class + Fix Several GPU Bugs
* Fix `AddClearColorSubpass` bug where it would not generate a `VkCmdNextSubpass` when an attachment clear was utilized
* Fix `AddSubpass` bug where the Depth Stencil texture would not be synced
* Respect `VkCommandPool` external synchronization requirements by making it thread-local with a custom RAII wrapper
* Fix linear RT width calculation as it's provided in terms of bytes rather than format units
* Fix `AllocateStagingBuffer` bug where it would not supply `eTransferDst` as a usage flag
* Fix `AllocateMappedImage` where `VkMemoryPropertyFlags` were not respected resulting in non-`eHostVisible` memory being utilized
* Change feature requirement in `AndroidManifest.xml` to Vulkan 1.1 from OGL 3.1 as this was incorrect
2021-10-16 12:13:30 +01:00
eb25f60033 Implement multichannel support for GPU
Allows the execution of multiple channels at the same time, with locking
being performed on the host GPU scheduler layer, address spaces can be
bound to one or more channels.
2021-10-16 12:13:30 +01:00
b762d1df23 Introduce Texture Always Sync + Wait on GPU Execution + More RT Formats
Infrastructure for always syncing textures has been introduced now, they will be synced prior to and after every execution. This does considerably reduce the performance alongside waiting on GPU execution to finish but it will be partially recouped once conditional syncing is performed.
2021-10-16 12:13:30 +01:00
f8acc1e131 Improve Shared Fonts + Fix AM PopLaunchParameter & Choreographer Bug
* Move Shared Font TTFs to AAsset storage + Support external shared font loading from `/data/data/skyline.emu/data/fonts`
* Fix bug in `IApplicationFunctions::PopLaunchParameter` caused by ignoring `LaunchParameterKind`
* Fix bug with Choreographer causing it to be awoken and exit prior to the destruction of `PresentationEngine`
* Fix bug with `IDirectory::Read` where it used `inputBuf` for the output buffer rather than `outputBuf`
* Improve `GetFunctionStackTrace` logs when `dli_sname` or `dli_fname` are missing
* Support more RT Formats
2021-10-16 12:13:30 +01:00
95a08627e5 Subpass Support + More RT Formats + Fix FenceCycle Cyclic Dependencies
Support for subpasses was added by reworking attachment reuse code to account for preserved attachments and subpass dependencies. A lot of RT formats were also added to allow SMO to boot up entirely, it should be noted that it doesn't render anything. 

`FenceCycle` had a cyclic dependency which broke clean exit, we now utilize `std::weak_ptr<FenceCycle>` inside the `Texture` object. A minor fix for broken stack traces was also made caused by supplying a `nullptr` C-string to libfmt when a symbol was unresolved which caused an `abort` due to invocation of `strlen` with it.
2021-10-16 12:13:30 +01:00
239d2625e2 Introduce CommandExecutor + Implement ClearBuffers + More RT Formats
This commit introduces the `CommandExecutor` which is responsible for creating and orchestrating a Vulkan command graph comprising of `CommandNode`s that construct all the objects required for rendering. As a result of the infrastructure provided by `CommandExecutor`, `ClearBuffers` could be implemented and be appropriately utilized.

A bug regarding scissors was also determined and fixed in the PR, the extent of them were previously inaccurate and this has now been fixed.

Note: We don't synchronize any textures from the guest for now as this would override the contents on the host, this'll be fixed with the appropriate write tracking but it also results in a black screen for anything that writes to FB
2021-10-05 01:13:22 +05:30
3879d573d5 Fix Command Buffer Allocation & FenceCycle
This commit fixes a major issue with command buffer allocation which would result in only being able to utilize a command buffer slot on the 2nd attempt to use it after it's freed, this would lead to a significantly larger amount of command buffers being created than necessary. It also fixes an issue with the command buffers not being reset after they were utilized which results in UB eventually.

Another issue was fixed with `FenceCycle` where all dependencies are only destroyed on destruction of the `FenceCycle` itself rather than the function where the `VkFence` was found to be signalled.
2021-10-05 01:13:22 +05:30
bee28aaf0d Validation Layer Filter + Fix Texture, GPU & PresentationEngine bugs
This commit implements a filter by type for any validation layer output, this allows filtering out any logs which may be unnecessary and additionally triggering a breakpoint as required. 

An issue concerning the `NDEBUG` flag never being set was fixed, it's now supplied as a release compiler flag. The issue can manifest itself by always relying on a validation layer even though it shouldn't on release, this is why the validation layer was mistakenly disabled entirely previously by using `#ifndef` rather than `#ifdef`.

An issue with the initial layout of a texture being supplied as neither `VK_IMAGE_LAYOUT_UNDEFINED` or `VK_IMAGE_LAYOUT_PREINITIALIZED` was fixed, these cases are now handled by transitioning to those layouts after creating the image rather than supplying it within `initialLayout`.

Another issue was fixed regarding not maintaining a transformation after a surface has been destroyed and recreated existed and manifested itself when the user would go out of the app and come back in, they would see the surface having an identity transformation rather than the desired one.
2021-10-05 01:13:22 +05:30
54908afc44 Texture GMMU Address Resolution + Refactor Maxwell3D::CallMethod
Fixes bugs with the Texture Manager lookup, fixes `RenderTarget` address extraction (`low`/`high` were flipped prior), refactors `Maxwell3D::CallMethod` to utilize a case for the register being modified + preventing redundant method calls when no new value is being written to the register, and fixes the behavior of shadow RAM which was broken previously and would lead to incorrect arguments being utilized for methods.
2021-10-05 01:13:22 +05:30
270f2db1d2 Initial Texture Manager Implementation + Maxwell3D Render Target
Implement the groundwork for the texture manager to be able to report basic overlaps and be extended to support more in the future. The Maxwell3D registers `RenderTargetControl`, `RenderTarget` and a stub for `ClearBuffers` were implemented. 

A lot of changes were also made to `GuestTexture`/`Texture` for supporting mipmapping and multiple array layers alongside significant architectural changes to `GuestTexture` effectively disconnecting it from `Texture` with it no longer being a parent rather an object that can be used to create a `Texture` object.

Note: Support for fragmented CPU mappings hasn't been added for texture synchronization yet
2021-10-05 01:13:22 +05:30
8cba1edf6d Introduce Boost as a submodule + Minor Fixes
Utilize Boost Container's `small_vector` for optimizing allocations, fix certain implicit casting issues and make `ILogger` not output an additional newline in the log when the application supplies one at the end of the log
2021-10-05 01:13:22 +05:30
190fde110f Introduce GraphicContext and Implement Viewport Transform + Scissors 2021-10-05 01:13:22 +05:30
ec1da1b3f5 Address CR Comments + Increase minSdkVersion to 29 (Android 10) 2021-07-12 21:27:49 +05:30
19be9f4b66 Support Sticky Transforms + Minor Formatting Fixes
Sticky transforms have been stubbed, as they are on HOS/Android. Certain titles like Xenoblade Chronicles end up setting the sticky transform even if it doesn't do anything, as a result of this we cannot throw an exception for it and stub it without an exception (Aside from the cases where the value isn't recognized).
2021-07-12 21:27:49 +05:30
6595670a5d Rework Performance Statistics
We used instantaneous values for FPS previously which led to a lot of variation in it and the inability to determine a proper FPS value due to constant fluctuations. All FPS values are now averaged to allow reading out a stable value and a deviation statistic has been added for the frame-time to judge judder and frame-pacing which allows for a significantly better measure of overall performance. The formatting for all the floating-point numbers is now fixed-point to prevent shifting of position due to decimal digits becoming 0.
2021-07-12 21:27:49 +05:30