[MMAP] Add MMapAllocator with file-backed memory mapping support#335
Open
rfsaliev wants to merge 3 commits into
Open
[MMAP] Add MMapAllocator with file-backed memory mapping support#335rfsaliev wants to merge 3 commits into
rfsaliev wants to merge 3 commits into
Conversation
b70af3a to
4ca9f64
Compare
Contributor
There was a problem hiding this comment.
Pull request overview
Adds a new file-backed memory-mapped allocator to the SVS public C++ API to support large, file-backed data/graph storage and provides tests validating it works as a drop-in allocator for existing containers. Also includes a small fix to graph equality and copyright year updates.
Changes:
- Introduces
svs::MMapAllocator<T>and a process-widedetail::MMapAllocationRegistryfor tracking/evicting active mmaps. - Adds Catch2 coverage for allocation behavior and integration with
SimpleData,BlockedData, andSimpleGraph. - Fixes
graphs_equalto usen_nodes()consistently with the graph concepts.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
include/svs/core/allocator.h |
Adds MMapAllocator, allocation registry, access-hint plumbing, and eviction support. |
tests/svs/core/allocator.cpp |
Adds integration/behavior tests for MMapAllocator across data and graph types. |
include/svs/concepts/graph.h |
Fixes graphs_equal to use n_nodes() instead of num_nodes(). |
Comments suppressed due to low confidence (5)
include/svs/core/allocator.h:918
MMapAllocatorstores the registry as a reference member (allocation_resource_), which makes the allocator type not CopyAssignable/MoveAssignable (reference members delete assignment). Standard-library containers and the configuredpropagate_on_container_* = truetraits can require allocator assignment, so this can break compilation/behavior. Consider storing a pointer orstd::reference_wrapperinstead (or drop the propagate traits) so the allocator satisfies the standard allocator requirements.
std::filesystem::path base_path_;
MMapAccessHint access_hint_ = MMapAccessHint::Normal;
detail::MMapAllocationRegistry& allocation_resource_;
public:
// C++ allocator type aliases
using value_type = T;
using propagate_on_container_copy_assignment = std::true_type;
using propagate_on_container_move_assignment = std::true_type;
using propagate_on_container_swap = std::true_type;
include/svs/core/allocator.h:973
- The file name generation is not guaranteed unique across allocator lifetimes or process runs (it uses
allocation_count()which can decrease after deallocation, and files are not removed). Combined withMemoryMapper::MayCreate, a collision will silently reuse an existing file (and potentially stale contents) instead of creating a fresh backing file as documented. Use a uniqueness mechanism that checks for existence (e.g., monotonic counter that never decrements, UUID/mkstemp-style), and/or switch mapping toMustCreatewith retry on collision.
[[nodiscard]] T* allocate(size_t n) {
size_t bytes = sizeof(T) * n;
auto file_path = generate_file_path(bytes);
void* ptr = allocation_resource_.allocate(bytes, file_path);
detail::apply_mmap_access_hint(ptr, bytes, access_hint_);
return static_cast<T*>(ptr);
include/svs/core/allocator.h:901
- This public API doc references
MMapFileViewAllocator, but no such type exists in the repository. Please either add the referenced allocator or update the documentation to point to the actual intended API (e.g., the existingMemoryMapper/registry helpers).
/// Each allocate() call creates a fresh temp file under @c base_path_ and
/// returns a writable mmap of that file. Intended for storing data that is
/// produced at runtime (e.g. an index's secondary, full-dimension dataset)
/// in file-backed pages instead of anonymous RAM.
///
/// For zero-copy loading from a pre-existing file, use
/// @ref MMapFileViewAllocator instead.
///
include/svs/core/allocator.h:728
MMapAllocationRegistry::allocateis documented as allocating into a "freshly created (or extended) file", but it usesMemoryMapper::MayCreate, which will refuse to resize an existing file (and will not extend it). Either change the policy/implementation to actually extend/truncate as intended, or update the documentation to reflect the real behavior.
///
/// @brief Allocate memory mapped to a freshly created (or extended) file.
///
/// @param bytes Number of bytes to allocate
/// @param file_path Path to the file for backing storage
/// @return Pointer to the allocated memory
///
[[nodiscard]] void* allocate(size_t bytes, const std::filesystem::path& file_path) {
MemoryMapper mapper{MemoryMapper::ReadWrite, MemoryMapper::MayCreate};
auto mmap_ptr = mapper.mmap(file_path, lib::Bytes(bytes));
include/svs/core/allocator.h:934
- The constructor exposes
detail::MMapAllocationRegistry&as a public parameter, which leaks an internaldetailtype into the public API surface. If this is only for testing or internal wiring, consider removing it from the public constructor (always use the singleton internally), or provide a separate internal/testing-only constructor to keep the public API cleaner.
explicit MMapAllocator(
std::filesystem::path base_path = std::filesystem::temp_directory_path(),
MMapAccessHint access_hint = MMapAccessHint::Normal,
detail::MMapAllocationRegistry& allocation_resource =
detail::MMapAllocationRegistry::instance()
)
a751513 to
5f4ac05
Compare
Co-authored-by: Ishwar Bhati <ishwar.s.bhati@intel.com>
5f4ac05 to
66b6817
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This pull request introduces a new file-backed, memory-mapped allocator (
MMapAllocator) to the codebase, along with a supporting allocation registry and extensive tests. The changes enable efficient, zero-copy data and graph storage using memory-mapped files, and provide mechanisms for cache control and access pattern hints. Additional updates include minor copyright year adjustments and fixes to graph comparing routine.The most important changes are:
New Memory-mapped Allocator and Registry:
MMapAllocatortemplate class tosvs/core/allocator.h, allowing allocation of memory backed by files, supporting efficient large dataset and graph storage. Includes support for access pattern hints and cache eviction.MMapAllocationRegistrysingleton for tracking and managing all active memory-mapped allocations, including thread-safe allocation, deallocation, and page eviction.MMapAccessHintenum and helper to set madvise hints for memory-mapped regions, improving performance for different access patterns.Testing and Validation:
MMapAllocatorintests/svs/core/allocator.cpp, covering allocation/deallocation, rebinding, equality, integration withSimpleData,BlockedData, andSimpleGraph, and file-backed correctness.Graph and Data Integration:
MMapAllocatorworks as a drop-in replacement for existing data and graph containers, demonstrating compatibility and correctness. [1] [2]Minor Fixes:
graphs_equalby replacingnum_nodes()withn_nodes()to match the correct API.