docs: hidden attributes are platform-only; clarify users cannot declare them #162

Merged
MilagrosMarin merged 6 commits into main from docs/hidden-attributes-platform-framing
Apr 30, 2026

Conversation

Member

@dimitri-yatsenko dimitri-yatsenko commented Apr 29, 2026

Summary

Reworks reference/specs/table-declaration.md §3.4 to reflect the design decision in datajoint/datajoint-python#1441: user-defined hidden attributes are not supported, and the parser now returns a clear `DataJointError` (instead of a cryptic `pyparsing.ParseException`) when a definition uses a leading-underscore name.

What changed in §3.4

  • Lead paragraph states up-front that user-defined hidden attributes are not supported, and shows the exact error message users will see.
  • Platform-managed table (_job_start_time, _job_duration, _job_version, _singleton) is preserved — these are the actual hidden columns users encounter in fetch results, joins, and describe output.
  • Behavior matrix is preserved — it's still the canonical reference for how platform-managed hidden columns interact with every public API surface.
  • "Why users can't declare them" paragraph added: no public-API write path, no describe() round-trip, silent filtering on dict restrictions and insert(ignore_extra_fields=True).
  • The previous "User-defined hidden attributes" subsection and the _params_hash-as-hidden example are removed. Replaced with a regular-attribute example showing the recommended pattern: declare params_hash as a regular column, use proj() at the call site if visibility control is needed.
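
The recommended pattern in the last bullet can be sketched as follows. This is a minimal illustration, not the example from §3.4: the definition string, the type names in it, and the helpers `params_hash()` and `make_row()` are assumptions made for this sketch, and nothing here imports DataJoint.

```python
import hashlib
import json

# Illustrative definition string only; attribute and type names are assumptions.
DEFINITION = """
analysis_id : int32
---
params      : json
params_hash : char(32)        # regular (visible) attribute
unique index (params_hash)
"""


def params_hash(params: dict) -> str:
    """Return a stable 32-character hex digest of a parameter dict."""
    # Sorting keys makes the digest independent of dict insertion order.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def make_row(analysis_id: int, params: dict) -> dict:
    """Build the row that application code would pass to insert1()."""
    return {
        "analysis_id": analysis_id,
        "params": params,
        "params_hash": params_hash(params),
    }

# At the call site, a projection such as Analysis.proj("params") would keep
# the primary key and `params` while leaving `params_hash` out of the result
# (`Analysis` is a hypothetical table class using DEFINITION).
```

Because the application computes and inserts the hash itself, the column stays a regular attribute and appears in describe() output; visibility is a per-query decision.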

Companion code PR

datajoint/datajoint-python#1441 — replaces the cryptic `pyparsing.ParseException` with the helpful `DataJointError` and adds a unit test asserting that `compile_attribute` rejects leading-underscore names with the new message.
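
The shape of such a unit test can be sketched without importing DataJoint. Everything below is a stand-in: `DataJointError` is a local class, `compile_attribute_sketch` is a hypothetical simplification of the real `compile_attribute`, and the message text is a placeholder rather than the wording introduced in datajoint/datajoint-python#1441.

```python
import re


class DataJointError(Exception):
    """Local stand-in for datajoint.DataJointError (assumption)."""


# Assumed rule for this sketch: a lowercase letter first, then lowercase
# letters, digits, or underscores.
_VALID_NAME = re.compile(r"^[a-z][a-z0-9_]*$")


def compile_attribute_sketch(line: str) -> str:
    """Return the attribute name from a 'name : type' definition line.

    Hypothetical simplification: the real parser also handles defaults,
    comments, and type resolution.
    """
    name = line.split(":", 1)[0].strip()
    if name.startswith("_"):
        # Placeholder wording; the real message is defined in declare.py.
        raise DataJointError(
            f"Attribute name {name!r} starts with an underscore; "
            "hidden attributes are reserved for platform use."
        )
    if not _VALID_NAME.match(name):
        raise DataJointError(f"Invalid attribute name {name!r}.")
    return name


def test_rejects_leading_underscore() -> None:
    """A leading-underscore name must raise the library error type."""
    try:
        compile_attribute_sketch("_params_hash : char(32)")
    except DataJointError as err:
        assert "_params_hash" in str(err)
    else:
        raise AssertionError("expected DataJointError")
```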

Test plan

  • `python scripts/gen_llms_full.py` regenerates cleanly.
  • Build the docs site locally and confirm §3.4 renders correctly.

Hidden attributes (names starting with `_`) were primarily designed for
platform operations — DataJoint itself uses them for `_job_start_time`,
`_job_duration`, `_job_version` on Computed/Imported tables and for the
`_singleton` implementation detail. Some functionality is intentionally
exposed to users (notably: a unique index can reference a hidden column,
making `_params_hash`-style derived columns useful), but the feature is
not intended as a general column-hiding tool.

Reframe section 3.4 around that intent, and replace the previous
behavior table with a verified one drawn from the actual code paths:

- Distinguishes platform-managed (auto-injected) from user-defined.
- Documents the exact filter point (Heading.attributes) and lists every
  user-facing surface that consumes it: fetch, proj, joins, dict vs.
  string restrictions, insert/update1, repr, describe.
- Calls out that fetch1("_name")/proj("_name") explicitly *is* allowed,
  matching the test_hidden_job_metadata.py spec.
- Adds a round-trip caveat for describe(): platform-managed hidden
  columns regenerate fine because they're re-injected on declare,
  but user-defined hidden columns (like _params_hash) are silently
  dropped from describe() output.
- Adds guidance on when to declare a hidden attribute vs. a regular one.

Aligns with #1433 (which made user-defined hidden attributes parsable
in the first place).

Expand §3.4 with a write caveat covering the three observed behaviors:

1. update1 raises "Attribute '_name' not found" — heading.names is filtered
   (heading.py:232).
2. insert raises "Field '_name' not in table heading" — Heading.__iter__
   walks the filtered view (heading.py:367).
3. insert(..., ignore_extra_fields=True) silently *drops* the hidden key
   without writing it. Less obvious than the loud error and easy to miss.

Also note that platform-managed hidden columns (_job_start_time, etc.)
are populated by DataJoint internals via raw SQL during populate()
(autopopulate.py:786), not via insert/update1. There is no public-API
path to write to a hidden column today; users with a declared hidden
column must reach for connection.query() or compute the value inside
an auto_populate step.

Tracks the write side of the gap that #1441 leaves open.

The previous "when to declare hidden" paragraph allowed too much: backing
an index was treated as sufficient reason to hide. It isn't. The clean
heuristic is: if application code touches the column (computes it,
inserts it, queries on it, wants it in describe() output), it should be
a regular attribute. Hidden is for platform/implementation concerns the
application code never references — _job_* populated by autopopulate
internals, _singleton's implementation pattern, or fields that would
actively interfere with natural-join semantics.

Use the params_hash-with-unique-index case as a concrete example of
when NOT to hide: even though it backs an index, the application code
computes and inserts the hash, so it should be regular and let proj()
handle visibility at the call site if needed.

Updated to reflect the design decision in datajoint/datajoint-python#1441:
the parser keeps rejecting leading-underscore attribute names and now
returns a clear DataJointError instead of a cryptic ParseException.

Reframe §3.4 around the platform-managed-only intent:

- Lead paragraph states up-front that user-defined hidden attributes are
  not supported, and shows the new error message users will see.
- Drop the "User-defined hidden attributes" subsection and the
  _params_hash hidden example.
- Keep the platform-attributes table and the behavior matrix — both are
  still useful for users encountering platform-managed hidden columns
  (_job_start_time, etc.) in fetch results, joins, and describe output.
- Add an explanation paragraph ("Why users can't declare them") covering
  the no-write-path / no-round-trip / silent-filter rationale.
- Replace the user-defined example with a regular-attribute example
  (params_hash backing a unique index), demonstrating the recommended
  pattern: declare as a regular attribute, use proj() at the call site
  for visibility control.
@dimitri-yatsenko dimitri-yatsenko changed the title from "docs: clarify hidden attribute behavior, frame as platform feature" to "docs: hidden attributes are platform-only; clarify users cannot declare them" Apr 29, 2026
Collaborator

@MilagrosMarin MilagrosMarin left a comment

Thanks @dimitri-yatsenko for this thorough rework! The reframing of §3.4 as platform-only — with the upfront error message, the platform-managed table, the "why users can't declare them" rationale, and the regular-attribute alternative — is a real improvement over the previous version. Verified the companion PR datajoint/datajoint-python#1441 (merged), and the error message block in §3.4 matches the code in declare.py:858 verbatim. ✅

Most of the behavior matrix also checks out against datajoint-python master:

  • heading.attributes / heading.names / heading.primary_key exclude hidden ✅ (heading.py:230-247)
  • heading._attributes includes hidden ✅ (heading.py:204)
  • to_dicts / to_pandas default exclude ✅ (expression.py:899)
  • Natural-join namesake matching excludes hidden ✅ (expression.py:397-398)
  • Dict restriction silently ignored ✅ (condition.py:392)
  • String restriction passes through to SQL ✅
  • describe() excludes ✅ (table.py:1233)
  • ignore_extra_fields=True silently drops hidden ✅ (table.py:1443)
  • Platform-managed columns populated via raw SQL during populate() ✅ (autopopulate.py:766-789)
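
The filtering verified in the list above can be pictured with a toy model. This is not DataJoint code: `Attr`, `HeadingModel`, `join_namesakes`, and `effective_dict_restriction` are hypothetical names for this sketch, which assumes only what the list states, namely that one store holds every attribute and the public views drop names starting with `_`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attr:
    name: str

    @property
    def is_hidden(self) -> bool:
        return self.name.startswith("_")


class HeadingModel:
    """Toy model: one full attribute store, one filtered public view."""

    def __init__(self, attrs: list[Attr]) -> None:
        self._attributes = {a.name: a for a in attrs}  # includes hidden

    @property
    def attributes(self) -> dict[str, Attr]:
        # Public view: hidden attributes are filtered out here.
        return {n: a for n, a in self._attributes.items() if not a.is_hidden}

    @property
    def names(self) -> list[str]:
        return list(self.attributes)


def join_namesakes(left: HeadingModel, right: HeadingModel) -> list[str]:
    """Attributes matched by a natural join: shared visible names only."""
    return [n for n in left.names if n in right.names]


def effective_dict_restriction(heading: HeadingModel, cond: dict) -> dict:
    """Keys absent from the visible heading are dropped without error."""
    return {k: v for k, v in cond.items() if k in heading.names}
```

In this model, two tables that both carry `_job_start_time` join only on their shared visible attributes, and a dict restriction on `_job_start_time` reduces to no restriction at all.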

A few comments below — one critical accuracy issue on the fetch("_name") / proj("_name") rows of the matrix (and the corresponding example code), plus a minor wording nit on the insert/update1 row.

Comment thread src/reference/specs/table-declaration.md Outdated
Comment thread src/reference/specs/table-declaration.md Outdated
Comment thread src/reference/specs/table-declaration.md Outdated
…essages

Per Milagros's review on PR #162: the matrix rows for fetch("_name") and
proj("_name") said "Included" but the actual behavior is "Rejected" —
both route through proj()'s heading.names check (visible-only list at
heading.py:236-237), which raises DataJointError. The integration test
tests/integration/test_hidden_job_metadata.py:170-172 confirms this
constraint by dropping to raw SQL via conn.query() to inspect hidden
columns.

The "Inspecting platform-managed hidden columns" example block had the
same bug — the proj()/fetch1() examples would raise as written. Replaced
with the raw-SQL pattern that mirrors the integration test.

Also tightened the insert/update1 row: the previous parenthetical
"(Field not in table heading)" was an inexact paraphrase. insert/insert1
raise KeyError("`_name` is not in the table heading") (table.py:1424);
update1 raises DataJointError("Attribute `_name` not found.")
(table.py:514). Split into two rows with the verbatim messages.
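
The three write-side outcomes can be reproduced with a toy table. This is a sketch under stated assumptions, not DataJoint code: `ToyTable` and the local `DataJointError` are hypothetical stand-ins, `update1` skips primary-key matching, and only the two message strings are taken from the text above.

```python
class DataJointError(Exception):
    """Local stand-in for datajoint.DataJointError (assumption)."""


class ToyTable:
    """Toy model of the write paths; not DataJoint code."""

    def __init__(self, visible: list[str], hidden: list[str]) -> None:
        self.visible = list(visible)
        self.hidden = list(hidden)  # stored, but never writable below
        self.rows: list[dict] = []

    def insert1(self, row: dict, ignore_extra_fields: bool = False) -> None:
        if ignore_extra_fields:
            # Hidden keys count as "extra" and vanish silently.
            row = {k: v for k, v in row.items() if k in self.visible}
        for name in row:
            if name not in self.visible:
                raise KeyError(f"`{name}` is not in the table heading")
        self.rows.append(dict(row))

    def update1(self, row: dict) -> None:
        for name in row:
            if name not in self.visible:
                raise DataJointError(f"Attribute `{name}` not found.")
        # Simplification: apply the update to every stored row.
        for stored in self.rows:
            stored.update(row)
```

The quiet case is the one to watch: with `ignore_extra_fields=True` the insert succeeds and the hidden value is simply never written.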
@dimitri-yatsenko
Member Author

Thank you @MilagrosMarin for the careful read — and especially for catching the fetch("_name") / proj("_name") claim. That was a real factual error inherited from the pre-PR §3.4 that would have misled users into writing code that raises. Equally helpful was confirming the constraint via the tests/integration/test_hidden_job_metadata.py:170-172 raw-SQL workaround — that's exactly the citation the docs needed to point at.

All three comments addressed in 20224ae:

  1. Matrix rows for fetch("_name") / proj("_name") — flipped from Included to Rejected (Attribute not found) — use raw SQL via conn.query(...).
  2. Inspecting platform-managed hidden columns example block — replaced the would-have-raised proj()/fetch1() snippets with the raw-SQL pattern that mirrors the integration test.
  3. insert/update1 row — split into two rows with the verbatim error messages: KeyError("`_name` is not in the table heading") (table.py:1424) and DataJointError("Attribute `_name` not found.") (table.py:514).

Ready for another pass when you have a moment.

Collaborator

@MilagrosMarin MilagrosMarin left a comment

Thanks @dimitri-yatsenko! Re-read §3.4 at 20224ae and verified all three:

Matrix rows for fetch("_name") / proj("_name") — now correctly state Rejected (Attribute not found) — use raw SQL via conn.query(...). Matches the actual proj() validation against heading.names (expression.py:574) and the integration-test workaround.

Example block — replaced with the conn.query(...) raw-SQL pattern that mirrors tests/integration/test_hidden_job_metadata.py:170-172. Reads cleanly and is now actually runnable.
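
For readers without the docs open, the raw-SQL pattern looks roughly like this. It is a sketch, not the block from §3.4 or the integration test: `hidden_columns_sql` and `fetch_hidden` are hypothetical helpers, and the code assumes only that the connection object has a `query(sql)` method returning a cursor-like object with `fetchall()`, and that the table name passed in is already quoted.

```python
from typing import Any, Protocol, Sequence


class SupportsQuery(Protocol):
    """Anything with a query(sql) method, e.g. a database connection."""

    def query(self, query: str) -> Any: ...


def hidden_columns_sql(full_table_name: str, columns: Sequence[str]) -> str:
    """Build a SELECT for hidden columns on an already-quoted table name."""
    if not columns:
        raise ValueError("at least one column is required")
    for col in columns:
        # Guard against injecting arbitrary SQL through a column name.
        if not col.replace("_", "").isalnum():
            raise ValueError(f"unexpected column name: {col!r}")
    return f"SELECT {', '.join(columns)} FROM {full_table_name}"


def fetch_hidden(conn: SupportsQuery, full_table_name: str,
                 columns: Sequence[str]) -> list:
    """Run the query and return all rows as fetched by the cursor."""
    cursor = conn.query(hidden_columns_sql(full_table_name, columns))
    return list(cursor.fetchall())
```

With a live connection, a call might look like `fetch_hidden(dj.conn(), MyComputed.full_table_name, ["_job_start_time", "_job_duration", "_job_version"])`, where `MyComputed` is a hypothetical Computed table.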

Insert/update1 rows — split into two rows with verbatim error messages: KeyError("`_name` is not in the table heading") (table.py:1424) for insert/insert1, and DataJointError("Attribute `_name` not found.") (table.py:514) for update1. Matrix is now precise.

The whole §3.4 section now reads as a clean, accurate platform-behavior reference. LGTM — approving.

@MilagrosMarin MilagrosMarin merged commit dd4a4ca into main Apr 30, 2026
2 checks passed
@dimitri-yatsenko dimitri-yatsenko deleted the docs/hidden-attributes-platform-framing branch May 5, 2026 16:01

2 participants