Redesign buffer view infra to significantly reduce creation overhead

Buffer view creation was a significant pain point: reducing the number of creations required several layers of caching, which introduced a lot of complexity. By reworking delegates to be per-buffer rather than per-view and then linearly allocating delegates (without ever freeing them), views can be reduced to just {delegatePtr, offset, size}, avoiding the need for any allocations or set operations in GetView. The one difficulty with this is the need to support buffer recreation, which is achieved by allowing delegates to be chained: during recreation, all source buffers have their delegates modified to point to the newly created buffer's delegate. Upon accessing a view with such a chained delegate, the view is modified to point directly to the end delegate, with its offset updated accordingly, so future accesses skip traversing the chain.
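
The chaining scheme described above can be illustrated with a minimal, self-contained sketch. This is not the code from this commit: `DelegateArena`, `Redirect`, `Resolve`, and the `link` field are hypothetical names, a `std::deque` stands in for the linear allocator, and locking is ignored entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for a GPU buffer; it only identifies the backing storage.
struct Buffer {
    int id;
};

// One delegate per buffer (assumed layout, for illustration only).
struct BufferDelegate {
    Buffer *buffer;                // the buffer this delegate was created for
    BufferDelegate *link{nullptr}; // set when the buffer has been replaced by a recreated one
    std::size_t offset{0};         // where this buffer's contents start inside the linked buffer

    explicit BufferDelegate(Buffer *buffer) : buffer(buffer) {}
};

// Stand-in for the linear allocator: delegates are only ever appended, never freed,
// and std::deque keeps existing elements at a stable address on push_back.
struct DelegateArena {
    std::deque<BufferDelegate> storage;

    BufferDelegate *Allocate(Buffer *buffer) {
        storage.push_back(BufferDelegate{buffer});
        return &storage.back();
    }
};

// A view is just {delegatePtr, offset, size}, so creating one needs no allocation.
struct BufferView {
    BufferDelegate *delegate;
    std::size_t offset;
    std::size_t size;

    // Walk any chain to its end delegate and fold the offsets into the view,
    // so that later calls on this view find no chain left to traverse.
    Buffer *Resolve() {
        while (delegate->link) {
            offset += delegate->offset;
            delegate = delegate->link;
        }
        return delegate->buffer;
    }
};

// Called during recreation: chain a source buffer's delegate to the new buffer's delegate.
void Redirect(BufferDelegate *source, BufferDelegate *target, std::size_t offsetInTarget) {
    source->link = target;
    source->offset = offsetInTarget;
}
```

In this sketch, a view created at offset 16 on an old buffer that is later redirected to offset 256 of a new buffer ends up, after one `Resolve()` call, pointing at the new delegate with offset 272. The old delegate has to stay alive for this to work, which is why never freeing delegates fits the design.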
Billy Laws
2022-08-31 14:15:56 +01:00
parent 09f376e500
commit 5dca5cc10e
6 changed files with 207 additions and 249 deletions

@@ -3,6 +3,7 @@
#pragma once
#include <common/linear_allocator.h>
#include <common/segment_table.h>
#include "buffer.h"
@@ -15,6 +16,8 @@ namespace skyline::gpu {
GPU &gpu;
std::mutex mutex; //!< Synchronizes access to the buffer mappings
std::vector<std::shared_ptr<Buffer>> bufferMappings; //!< A sorted vector of all buffer mappings
LinearAllocatorState<> delegateAllocatorState; //!< Linear allocator used to allocate buffer delegates
size_t nextBufferId{}; //!< The next unique buffer id to be assigned
static constexpr size_t L2EntryGranularity{19}; //!< The amount of AS (in bytes) a single L2 PTE covers (512 KiB == 1 << 19)
SegmentTable<Buffer *, constant::AddressSpaceSize, constant::PageSizeBits, L2EntryGranularity> bufferTable; //!< A page table of all buffer mappings for O(1) lookups on full matches