✨ RFC-HDFG-2026-001: String-based filter configuration API #6470
Open
brtnfld wants to merge 15 commits into
Review Checklist
This PR touches the following areas. Each needs a sign-off
hyoklee requested changes on Jun 19, 2026
hyoklee previously approved these changes on Jun 20, 2026
Adds a human-readable key=value parameter string API for HDF5 filters, alongside the existing integer cd_values arrays.

New C API:
- H5Pappend_filter(plist, filter_id, flags, params): appends a filter using either a key=value string or raw cd_values (H5Z_params_t)
- H5Pget_filter_params_by_idx(plist, idx, buf, buf_size, content_len): retrieves the parameter string for a filter by pipeline index
- H5Zconfig_get_int/double/bool/str: typed accessors for use inside filter set_config callbacks
- H5Z_filter_id_by_name(name): looks up a filter id by registered name
- H5Zget_filter_info2(id, info): extended filter info including v3 fields

New H5Z_class3_t fields: name, description, set_config, get_config, and reserved blob-callback placeholders (write_blob/read_blob/close_blob).

H5Z_pipeline gains dxpl_id, scaled[], and ndims arguments threaded through from all call sites so v3 filter callbacks have full context.

All six built-in filters (deflate, shuffle, fletcher32, nbit, szip, scaleoffset) implement set_config/get_config callbacks.

TOML subset parser: tomlc17 (MIT) vendored in src/tomlc17/ and compiled unconditionally into libhdf5. Hex-float literals are transparently rewritten to decimal before parsing. tomlc17 symbols are hidden via -fvisibility=hidden to prevent namespace collisions.

On-disk format: no new pipeline version. Parameter strings are converted to cd_values by set_config at H5Pappend_filter time and stored using the existing v2 pipeline message. On read, get_config reconstructs the string. Full backward read compatibility is preserved.

Fortran, C++, and Java bindings added.

Tests in test/tfilter2.c (~2300 lines) and testpar/t_filters_parallel.c (par-01–par-04).

h5dump displays filter parameter strings; h5repack accepts TOML-form UD= filter specs.
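The string-to-cd_values conversion described above can be sketched in isolation. The helper below is hypothetical and is not the set_config callback from this PR: it assumes a single key named `level` (the key name is an assumption) and relies only on the known fact that the deflate filter stores one cd_values entry, its compression level in the zlib range 0..9.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of HDF5: convert "level=N" into the single
 * cd_values entry that the deflate filter stores (its compression level).
 * Returns 0 on success, -1 on malformed input or an out-of-range level. */
int deflate_params_to_cd_values(const char *params, unsigned cd_values[1],
                                size_t *cd_nelmts)
{
    static const char key[] = "level="; /* assumed key name */
    const size_t      key_len = sizeof(key) - 1;
    char             *end = NULL;
    long              level;

    /* Output pointers are a programmer contract in this sketch. */
    assert(cd_values != NULL && cd_nelmts != NULL);

    if (params == NULL || strncmp(params, key, key_len) != 0)
        return -1; /* missing string or unknown key */

    /* Require a digit right after '=' so strtol cannot skip blanks or a sign. */
    if (params[key_len] < '0' || params[key_len] > '9')
        return -1;

    errno = 0;
    level = strtol(params + key_len, &end, 10);
    if (errno != 0 || *end != '\0')
        return -1; /* overflow or trailing characters */
    if (level > 9)
        return -1; /* zlib levels are 0..9 */

    cd_values[0] = (unsigned)level;
    *cd_nelmts   = 1;
    return 0;
}
```

Because the string is reduced to the same integer array that the existing pipeline message already stores, no new on-disk representation is needed for this filter.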
Code-review fixes included: tomlc17 visibility, H5Pget_filter_params_by_idx arg validation and true-length two-pass contract, flags re-validation after set_config, H5Z_register3 runtime plugin validation, Java two-pass protocol and h5libraryError() consistency, CHANGELOG corrections.

Fixes GitHub issue HDFGroup#6153
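A "true-length two-pass contract" usually means the caller first asks for the length, then allocates and calls again. The function below is a hypothetical stand-in, not H5Pget_filter_params_by_idx itself; the truncate-and-NUL-terminate behaviour on a short buffer is an assumption made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a string getter with a true-length contract.
 * Always reports the full length of src (excluding the NUL) in *content_len.
 * When buf is non-NULL and buf_size > 0, copies at most buf_size - 1 bytes
 * and NUL-terminates. Returns 0 on success, -1 on bad arguments. */
int get_params_string(const char *src, char *buf, size_t buf_size,
                      size_t *content_len)
{
    size_t len;

    if (src == NULL || content_len == NULL)
        return -1;

    len          = strlen(src);
    *content_len = len; /* true length, even if the copy below truncates */

    if (buf != NULL && buf_size > 0) {
        size_t n = (len < buf_size - 1) ? len : buf_size - 1;

        memcpy(buf, src, n);
        buf[n] = '\0';
        assert(n < buf_size); /* the terminator always fits */
    }
    return 0;
}
```

A caller would pass `NULL` and `0` on the first pass, allocate `content_len + 1` bytes, and pass that buffer on the second pass.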
…ib absent

Two bugs caused H5TEST-tfilter2 to fail on BSD and non-zlib CI builds:

1. H5Zconfig.c: The review changed strlen(params) > H5Z_CONFIG_STRING_MAX to >=, which incorrectly rejected strings of exactly H5Z_CONFIG_STRING_MAX bytes. The public contract (and test_config_string_max_boundary) accept strings up to and including H5Z_CONFIG_STRING_MAX characters. Reverted to >.
2. test/tfilter2.c: test_config_string_max_boundary called H5Pappend_filter with H5Z_FILTER_DEFLATE without first checking availability. On CI builds without zlib the filter lookup fails with "filter not found" before the length check runs. Added H5Zfilter_avail guard; the test is now SKIPPED on non-zlib builds.
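The off-by-one in the first bug is easy to see side by side. This is an illustrative sketch only: `DEMO_CONFIG_STRING_MAX` is a made-up small limit (the real value of H5Z_CONFIG_STRING_MAX is not stated here), and both function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Illustrative limit only; not the real H5Z_CONFIG_STRING_MAX. */
#define DEMO_CONFIG_STRING_MAX 8

/* Inclusive limit: strings of exactly DEMO_CONFIG_STRING_MAX bytes pass. */
int demo_length_ok(const char *params)
{
    assert(params != NULL);
    return !(strlen(params) > DEMO_CONFIG_STRING_MAX);
}

/* The regressed form: ">=" also rejects a string of exactly the maximum. */
int demo_length_ok_off_by_one(const char *params)
{
    assert(params != NULL);
    return !(strlen(params) >= DEMO_CONFIG_STRING_MAX);
}
```

With the limit at 8, the string `"12345678"` passes the first check and fails the second, which is the behaviour difference the boundary test exercises.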
-p was printing PARAMS_STRING and DESCRIPTION as flat siblings of
FILTERS{} rather than inside the entry of the filter they describe,
which is ambiguous (or outright misleading) once more than one filter
is applied to a dataset. Move them inside each filter's own block,
opening one for SHUFFLE/FLETCHER32/NBIT only when there's something to
attach.
Update the affected h5dump DDL fixtures and add DDLBNF220.dox
documenting the new <filter_extra> nesting, bumping the \ref DDLBNF200
cross-references to DDLBNF220.
Two test failures from the h5dump nesting fix (2604f5d):

1. tools/test/h5repack/expected/deflate_limit.h5repack_layout.h5.ddl and h5repack_layout.h5-plugin_test.ddl had PARAMS_STRING (and DESCRIPTION) as flat siblings of the filter block rather than nested inside it. The h5dump fixture files were updated in 2604f5d but these two h5repack expected files were missed.
2. java/src-jni/jni/h5zImp.c called CALL_CONSTRUCTOR with a 6-arg array and signature "(IILjava/lang/String;Ljava/lang/String;ZZ)V" but H5Z_class_info_t's constructor takes 7 args (adds has_blob_callbacks). GetMethodID failed at runtime with the wrong arity. Add args[6] = JNI_FALSE and update the descriptor to ZZZ)V.
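The arity mismatch in the second failure can be checked mechanically by counting parameters in the JNI method descriptor. The function below is a hypothetical checker written for illustration; it is not part of the HDF5 JNI sources and handles only the standard descriptor grammar (primitive codes, `L...;` object types, and `[` array prefixes).

```c
#include <assert.h>
#include <stddef.h>

/* Count the parameters in a JNI method descriptor such as "(IZ)V".
 * Returns -1 if the parameter list is malformed. */
int jni_descriptor_arg_count(const char *desc)
{
    int         count = 0;
    const char *p;

    if (desc == NULL || desc[0] != '(')
        return -1;

    p = desc + 1;
    while (*p != ')') {
        while (*p == '[') /* array dimensions prefix the element type */
            p++;
        switch (*p) {
            case 'B': case 'C': case 'D': case 'F':
            case 'I': case 'J': case 'S': case 'Z':
                p++; /* primitive type: one character */
                break;
            case 'L': /* object type: L<class name>; */
                while (*p != ';') {
                    if (*p == '\0')
                        return -1;
                    p++;
                }
                p++; /* skip ';' */
                break;
            default:
                return -1; /* unknown code or premature end of string */
        }
        count++;
    }
    assert(*p == ')');
    return count;
}
```

Applied to the two descriptors quoted above, the original string yields 6 parameters and the corrected one, ending in `ZZZ)V`, yields 7.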
- java/test/TestH5Z.java (FFM): fix wrong method names
- H5Pget_filter2 (private) -> H5Pget_filter (public), pass new int[1]
instead of null for filter_config
- H5Zconfig_get_int/double/bool/str -> H5Zconfig_get_param (overloaded)
- Update assertion messages to match corrected method names
- release_docs/CHANGELOG.md: replace em-dashes with hyphens per style
- tools/src/h5repack/h5repack_parse.c: restructure UD= legacy numeric
loop from for-loop to while-loop to avoid CodeQL "loop counter modified
in body" warning; add bounds guard after comma-skip u++
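The loop restructuring in the last item follows a common pattern: when the index must advance by a variable amount per iteration, a while-loop with explicit bounds checks is clearer than a for-loop whose counter is also modified in the body. The parser below is a hypothetical sketch of that pattern, not the h5repack_parse.c code; unsigned overflow on very long digit runs simply wraps and is not diagnosed here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parser: read a list such as "1,2,3" into out[].
 * Returns the number of values parsed, or -1 on malformed input
 * or when out[] (capacity cap) is too small. */
int parse_uint_list(const char *s, unsigned *out, size_t cap)
{
    size_t len, u = 0;
    int    n = 0;

    assert(s != NULL && out != NULL);
    len = strlen(s);

    while (u < len) {
        unsigned v = 0;

        if (!isdigit((unsigned char)s[u]))
            return -1; /* each field must start with a digit */
        while (u < len && isdigit((unsigned char)s[u])) {
            v = v * 10u + (unsigned)(s[u] - '0');
            u++;
        }
        if ((size_t)n >= cap)
            return -1;
        out[n++] = v;

        if (u < len) {
            if (s[u] != ',')
                return -1;
            u++;         /* skip the comma */
            if (u >= len) /* bounds guard: nothing follows the comma */
                return -1;
        }
    }
    return n;
}
```

The guard after the comma skip is what keeps a trailing comma from being read as an empty final field.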
…rocedures

The RFC renamed the PRIVATE generic dispatch procedures from h5zget_filter_info1_f/h5zget_filter_info2_f to h5zget_filter_info_flags_f/h5zget_filter_info_class_f to reflect their actual roles. Update hdf5_fortrandll.def.in to export the new mangled names so Intel ifx on Windows links successfully.
H5Pappend_filter is always available in this version and is the standard way to detect the string-based filter config API. The TOMLC17 macro is redundant.
…okup

SymbolLookup.loaderLookup() only finds symbols loaded via System.loadLibrary(); jextract loads libhdf5 via its own mechanism so loaderLookup() never finds the RFC symbols at runtime. Replace all SymbolLookup.loaderLookup()+MethodHandle patterns in the RFC methods with direct hdf5_h.* calls, consistent with every other method in H5.java:

- H5Pappend_filter (both overloads)
- H5Pget_filter_params_by_idx
- H5Zget_filter_info2
- H5Zconfig_has_key
- H5Zconfig_get_param (long[], double[], boolean[], String[])
H5Pappend_filter with CDVALUES is documented as identical to H5Pset_filter. Routing through H5Pset_filter avoids constructing an H5Z_params_t struct manually in FFM heap memory, which was silently producing cd_nelmts=0.
Three bugs in H5Pget_filter2:

1. cd_nelmts_segment was allocated as JAVA_INT (4 bytes) but size_t* needs 8 bytes on 64-bit, so the C write overflowed into cd_values_segment.
2. cd_nelmts[0] was read back from cd_values_segment (wrong) instead of cd_nelmts_segment.
3. cd_values and flags were never copied back from their native segments.

Fix: allocate cd_nelmts_segment as JAVA_LONG, seed it with the caller's capacity on input, and copy all three output arrays back correctly.
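The fix restores the in/out contract that the C side expects from its count argument: the caller passes a `size_t` holding the array capacity and reads back the number of values the filter defines. The function below is a hypothetical stand-in that mimics this contract for illustration; it is not H5Pget_filter2, and the `stored` arguments exist only to make the sketch self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical getter with an in/out count.
 * On input,  *nelmts is the capacity of values[].
 * On output, *nelmts is the number of values the filter defines.
 * At most "capacity" values are copied into values[]. */
void demo_get_cd_values(const unsigned *stored, size_t stored_n,
                        unsigned *values, size_t *nelmts)
{
    size_t capacity, i;

    assert(stored != NULL && values != NULL && nelmts != NULL);

    capacity = *nelmts; /* this is why the caller must seed the count */
    for (i = 0; i < stored_n && i < capacity; i++)
        values[i] = stored[i];
    *nelmts = stored_n;
}
```

A caller that leaves the count at zero gets nothing copied, and a caller that supplies storage narrower than `size_t` gives the callee room to write past it, which matches the first bug described above.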
…thods

The FFM TestH5Z.java gained 8 new test methods covering the new filter string-config API (H5Pappend_filter, H5Pget_filter_params_by_idx, H5Zconfig_get_param_*), but the expected-output reference file was left at 5 tests, causing the JUnit-TestH5Z CTest comparison to fail.
Summary
- Adds a human-readable key=value parameter string API for HDF5 filters, alongside the existing integer cd_values arrays: H5Pappend_filter, H5Pget_filter_params_by_idx, H5Zconfig_get_int/double/bool/str, H5Z_filter_id_by_name, H5Zget_filter_info2.
- Adds H5Z_class3_t with name, description, set_config/get_config callbacks (plus reserved blob-callback placeholders), and threads dxpl_id/scaled[]/ndims through H5Z_pipeline so v3 filter callbacks have full context. All six built-in filters (deflate, shuffle, fletcher32, nbit, szip, scaleoffset) implement set_config/get_config.
- Vendors the tomlc17 TOML parser in src/tomlc17/, built unconditionally into libhdf5 with hidden visibility to avoid symbol collisions.
- Parameter strings are converted to cd_values at H5Pappend_filter time and stored in the existing v2 pipeline message, so read compatibility with older files/libraries is preserved.
- Adds h5dump/h5repack support for the new parameter strings.

Fixes #6153.