
fix: enforce policy-based access control on artifact downloads #7009

Open
ycombinator wants to merge 10 commits into elastic:main from ycombinator:fix/artifact-access-control

Conversation


@ycombinator ycombinator commented May 11, 2026

What is the problem this PR solves?

The artifact download endpoint (/api/fleet/artifacts/{id}/{sha256}) only validates the agent's API key but never checks whether the requested artifact belongs to the agent's assigned policy. This means an agent enrolled under one policy can download artifacts belonging to a different policy if it knows the artifact ID and SHA256 hash. For example, an agent enrolled under a policy with no integrations can retrieve Elastic Defend trust lists, exception lists, and other security artifacts from another policy.

How does this PR solve the problem?

Implements the authorizeArtifact() function (previously a no-op that returned nil) to enforce policy-based access control:

  1. Adds a GetPolicy(ctx, policyID) method to the policy.Monitor interface that returns the cached policy for a given ID (reloads from ES on cache miss).
  2. In authorizeArtifact, fetches the agent's policy via the monitor using agent.AgentPolicyID and verifies that the requested artifact (identifier + decoded_sha256) appears in the policy's inputs[].artifact_manifest.artifacts.
  3. Returns 403 Forbidden (ErrUnauthorizedArtifact) if the artifact is not listed in the agent's assigned policy.

How to test this PR locally

  1. Set up Fleet Server with Elasticsearch and Kibana
  2. Create two agent policies: Victim-Policy with Elastic Defend integration (add a trusted application), and Attacker-Policy with no integrations
  3. Create an enrollment token for Attacker-Policy and enroll an agent
  4. Attempt to download an artifact belonging to Victim-Policy using the attacker agent's API key — should now receive 403 Forbidden instead of the artifact contents
  5. Verify that an agent enrolled under Victim-Policy can still download its own artifacts normally (200 OK)

Design Checklist

  • I have ensured my design is stateless and will work when multiple fleet-server instances are behind a load balancer.
  • I have or intend to scale test my changes, ensuring it will work reliably with 100K+ agents connected.
  • I have included fail safe mechanisms to limit the load on fleet-server: rate limiting, circuit breakers, caching, load shedding, etc.

Checklist

  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool

The artifact download endpoint (/api/fleet/artifacts/{id}/{sha256})
previously only validated the agent's API key but never checked whether
the requested artifact belonged to the agent's assigned policy. This
allowed an agent enrolled under one policy to download artifacts from
a different policy if it knew the artifact ID and SHA256 hash.

Add authorizeArtifact implementation that fetches the agent's policy
from the in-memory policy monitor cache and verifies the requested
artifact appears in the policy's artifact_manifest before serving it.
Returns 403 Forbidden if the artifact is not in the agent's policy.

Resolves: elastic/security#8396

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ycombinator ycombinator requested a review from a team as a code owner May 11, 2026 22:44
@ycombinator ycombinator requested review from macdewee and samuelvl May 11, 2026 22:44

mergify Bot commented May 11, 2026

This pull request does not have a backport label. Could you fix it @ycombinator? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-./d./d is the label to automatically backport to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

@ycombinator ycombinator requested review from michel-laterman and samuelvl and removed request for macdewee and samuelvl May 11, 2026 22:52
@ycombinator ycombinator added the bug, Team:Elastic-Agent-Control-Plane, and backport-active-all labels May 11, 2026
ycombinator and others added 2 commits May 11, 2026 15:57
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ycombinator ycombinator requested review from cmacknz and kruskall May 11, 2026 23:09
Comment thread internal/pkg/api/handleArtifacts.go Outdated
if !ok {
	continue
}
amMap, ok := am.(map[string]interface{})
Contributor

@michel-laterman michel-laterman May 12, 2026

Go's newer conventions prefer any over interface{}.
This is enforced by the go fix check, which can be run with mage check:fix.

Contributor Author

Updated in b00c5df.

Comment thread internal/pkg/api/handleArtifacts.go Outdated

func policyHasArtifact(pd *model.PolicyData, id, sha2 string) bool {
	for _, input := range pd.Inputs {
		am, ok := input["artifact_manifest"]
Contributor

Should we define this artifact_manifest as a struct somewhere?

Contributor Author

Added model.ArtifactManifest and model.ManifestEntry structs in 3e57a6e.

ycombinator and others added 2 commits May 12, 2026 05:49
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Defines model.ArtifactManifest and model.ManifestEntry structs so
policyHasArtifact no longer navigates untyped map[string]any chains.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ycombinator ycombinator enabled auto-merge (squash) May 12, 2026 19:50
ycombinator and others added 2 commits May 12, 2026 16:17
schema.go is code-generated and gets overwritten by mage generate.
Moving ArtifactManifest and ManifestEntry to ext.go keeps them stable.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ArtifactManifest and ManifestEntry are not ES document types and only
exist to support parsing within handleArtifacts.go, so they belong there
rather than in the model package.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor

@michel-laterman michel-laterman left a comment

lgtm

Agents enrolled under dummy-policy cannot download Elastic Defend
artifacts because that policy has no artifact_manifest. Enroll the test
agent under security-policy (which has the Elastic Defend integration)
instead.

Add FleetPolicyHasArtifact scaffold helper that polls .fleet-policies
until the policy document references the artifact, ensuring fleet-server's
policy monitor cache is up-to-date before the download attempt. Also
retry the download on 403 to tolerate any remaining cache propagation lag.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment thread internal/pkg/api/handleArtifacts.go Outdated
if !ok {
	return nil, nil
}
data, err := json.Marshal(raw)
Member

Marshalling and then unmarshalling feels a bit weird. Is there no way to just unmarshal this? Maybe an intermediate type that has artifact_manifest as json.RawMessage instead of any?
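
For illustration, the json.RawMessage idea might be sketched as below. This assumes the raw JSON of each input is still available at the point of the check, and the type names, the function inputHasArtifact, and the manifest shape (entries keyed by artifact identifier with a decoded_sha256 field) are all hypothetical.

```go
package main

import "encoding/json"

// manifestEntry holds the only field this check needs (assumed field name).
type manifestEntry struct {
	DecodedSHA256 string `json:"decoded_sha256"`
}

// artifactManifest assumes artifacts are keyed by artifact identifier.
type artifactManifest struct {
	Artifacts map[string]manifestEntry `json:"artifacts"`
}

// policyInput defers decoding of artifact_manifest by keeping it raw.
type policyInput struct {
	ArtifactManifest json.RawMessage `json:"artifact_manifest"`
}

// inputHasArtifact decodes one raw policy input and checks whether it
// lists the artifact with the given identifier and decoded SHA256.
func inputHasArtifact(rawInput []byte, id, sha2 string) (bool, error) {
	var in policyInput
	if err := json.Unmarshal(rawInput, &in); err != nil {
		return false, err
	}
	if len(in.ArtifactManifest) == 0 {
		return false, nil // input has no artifact_manifest
	}
	var am artifactManifest
	if err := json.Unmarshal(in.ArtifactManifest, &am); err != nil {
		return false, err
	}
	entry, ok := am.Artifacts[id]
	return ok && entry.DecodedSHA256 == sha2, nil
}
```

The trade-off is a second decode step per input in exchange for typed access; when the inputs are already decoded into maps, plain type assertions avoid that extra pass.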

Contributor Author

@ycombinator ycombinator May 15, 2026

Removed the marshal-then-unmarshal; agree, it was awkward. Just went with directly accessing map fields, with type assertions for safety: 535b14f

…checks

PolicyData.Inputs is already decoded as []map[string]any, so marshaling
back to JSON just to unmarshal again is unnecessary. Use type assertions
directly on the decoded map in both policyHasArtifact (production) and
policyInputHasArtifact (e2e scaffold), removing the artifactManifest and
manifestEntry types along with parseArtifactManifest.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ycombinator ycombinator force-pushed the fix/artifact-access-control branch from 2e1103d to 535b14f Compare May 15, 2026 13:42
…ntext

The ES global checkpoint monitor can hold up to 4 minutes before the
policy cache refreshes. On slow CI the setup steps (FleetHasArtifacts +
FleetPolicyHasArtifact) could exhaust the 3-minute budget, leaving the
retry loop to fail with a misleading "context deadline exceeded" from the
HTTP call. Raise the budget to 5 minutes and add an explicit ctx.Err()
check at the top of the retry loop so expiry surfaces a clear message.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Contributor

TL;DR

Buildkite failed in Run check-ci because mage check:ci modified a tracked file, so the no-changes gate failed. The immediate fix is to commit the header update in internal/pkg/api/handleArtifacts_test.go (license text changed to Elastic License 2.0).

Remediation

  • Update and commit the file header in internal/pkg/api/handleArtifacts_test.go to the expected Elastic License 2.0 form (or run mage check:ci locally and commit all resulting changes).
  • Re-run mage check:ci (or CI) to confirm the tree stays clean after Generate, Imports, Fix, Headers, and Notice.

Investigation details

Root Cause

check_ci.sh runs mage check:ci, which includes Check.NoChanges.

  • magefile.go#L641-L642: Check.Ci() runs Generate, Check.Imports, Check.Fix, Check.Headers, Check.Notice, Check.NoChanges.
  • magefile.go#L612-L626: Check.NoChanges() fails if git update-index --refresh or later diff checks see modified files.

The Buildkite log shows a diff for internal/pkg/api/handleArtifacts_test.go where the header changed from:

// ... Licensed under the Elastic License;
// you may not use this file except in compliance with the Elastic License.

to:

// ... Licensed under the Elastic License 2.0;
// you may not use this file except in compliance with the Elastic License 2.0.

That mutation made the repo dirty and triggered:
Error: git update-index failure: running "git update-index --refresh" failed with exit code 1.

Evidence

  • Build: https://buildkite.com/elastic/fleet-server/builds/14740
  • Job/step: Run check-ci (.buildkite/scripts/check_ci.sh)
  • Key log excerpt: /tmp/gh-aw/buildkite-logs/fleet-server-white_check_mark-run-check-ci.txt shows the header diff and internal/pkg/api/handleArtifacts_test.go: needs update immediately before the git update-index failure.

Verification

  • Not run in this workflow; diagnosis is from Buildkite logs and repository mage target wiring.

Follow-up

  • If this file was newly added in the PR, apply the repository’s current license header template for new Go files to avoid repeated check:ci failures.

Note

🔒 Integrity filter blocked 4 items

The following items were blocked because they don't meet the GitHub integrity level.

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

What is this? | From workflow: PR Buildkite Detective

Labels

backport-active-all (Automated backport with mergify to all the active branches), bug (Something isn't working), Team:Elastic-Agent-Control-Plane (Label for the Agent Control Plane team)

3 participants