[Asset Discovery] Correlate between Asset Discovery entities and third-party audit logs #4086

@uri-weisman

Summary

Entity Store v2 currently fails to reliably correlate user entities produced by Cloud Asset Discovery with related third-party audit logs, such as AWS CloudTrail. This leads to duplicate entities and prevents graph enrichment / flyout correlation from working as expected.

Problem

In the reported AWS example, searching for the same actor produces different entity representations:

  • A user entity with an id like:
    • user:arn:aws:iam::704479110758:user/john.doeh@elastic.co@asset_discovery
  • A generic entity with an id like:
    • arn:aws:iam::704479110758:user/john.doeh@elastic.co@asset_discovery
    • arn:aws:iam::704479110758:user/john.doeh@elastic.co

However, the related CloudTrail event evaluates to a different user ID:

  • user:john.doeh@elastic.co@aws

This means the event cannot correlate with the entity produced from Asset Discovery, even though the event contains:

  • user.entity.id = "arn:aws:iam::704479110758:user/john.doeh@elastic.co"
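
The mismatch can be made concrete with a small sketch. It assumes, purely for illustration, that the IDs quoted above follow a `[type:]identifier[@namespace]` shape. `parseEntityId`, `KNOWN_TYPES`, and `KNOWN_NAMESPACES` are hypothetical names, not Entity Store APIs.

```typescript
// Illustration only: assumes the IDs in this issue follow
// "[type:]identifier[@namespace]". This is not the actual Entity Store parser.
interface ParsedEntityId {
  type: string | null; // "user", or null for a generic entity
  identifier: string; // an ARN or an email-like name
  namespace: string | null; // e.g. "asset_discovery" or "aws"
}

// Assumed values, taken from the examples quoted in this issue.
const KNOWN_TYPES: string[] = ["user"];
const KNOWN_NAMESPACES: string[] = ["asset_discovery", "aws"];

function parseEntityId(id: string): ParsedEntityId {
  let rest = id;
  let type: string | null = null;
  let namespace: string | null = null;

  // Strip a leading "<type>:" prefix if one is present.
  for (const t of KNOWN_TYPES) {
    if (rest.startsWith(t + ":")) {
      type = t;
      rest = rest.slice(t.length + 1);
      break;
    }
  }
  // Strip a trailing "@<namespace>" suffix if one is present.
  for (const ns of KNOWN_NAMESPACES) {
    if (rest.endsWith("@" + ns)) {
      namespace = ns;
      rest = rest.slice(0, rest.length - ns.length - 1);
      break;
    }
  }
  return { type, identifier: rest, namespace };
}

const fromAssetDiscovery = parseEntityId(
  "user:arn:aws:iam::704479110758:user/john.doeh@elastic.co@asset_discovery"
);
const fromCloudTrail = parseEntityId("user:john.doeh@elastic.co@aws");
const eventUserEntityId = "arn:aws:iam::704479110758:user/john.doeh@elastic.co";

// The evaluated IDs differ in both identifier and namespace...
console.log(fromAssetDiscovery.identifier === fromCloudTrail.identifier); // false
// ...even though the event carries the same ARN in user.entity.id.
console.log(fromAssetDiscovery.identifier === eventUserEntityId); // true
```

Under these assumptions, the ARN needed for correlation is already present on the event, but it is not the value the evaluated user ID is built from.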

Impact

  • Graph entities cannot be enriched correctly across integrations.
  • Flyouts may not resolve to the expected entity.
  • Asset Discovery may create entities that are effectively isolated from related audit log activity.

Discussion / Design Questions

There are two related design gaps surfaced by this case:

  1. Namespace alignment
    • Cloud Asset Discovery currently appears to publish under its own namespace (asset_discovery).
    • We may instead want it to contribute to provider namespaces such as aws, gcp, and azure / entra_id, similar to how other integrations are grouped.
  2. Official support for third-party identifiers
    • We appear to be missing a consistent, official way to publish and correlate third-party IDs (for example AWS ARNs).
    • user.id may be one candidate, but current publishing does not preserve the ARN in a way that supports this use case.
    • We should review ECS usage and decide where these identifiers should live so they can be used consistently for correlation.
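
To make "third-party identifier" concrete: an AWS ARN has the general form `arn:partition:service:region:account-id:resource`. The sketch below splits the ARN from this issue into those parts. `parseArn` and `ParsedArn` are hypothetical helpers for illustration only. Whether the full ARN or one of its components should serve as the correlation key is part of the open question.

```typescript
// Hypothetical helper: splits an ARN into its documented components.
interface ParsedArn {
  partition: string;
  service: string;
  region: string; // empty for global services such as IAM
  accountId: string;
  resource: string;
}

function parseArn(value: string): ParsedArn | null {
  const parts = value.split(":");
  if (parts.length < 6 || parts[0] !== "arn") {
    return null; // not an ARN, e.g. a plain user name
  }
  // The resource part may itself contain ":", so re-join everything after
  // the fifth separator.
  const [, partition = "", service = "", region = "", accountId = "", ...resourceParts] = parts;
  return { partition, service, region, accountId, resource: resourceParts.join(":") };
}

const arn = parseArn("arn:aws:iam::704479110758:user/john.doeh@elastic.co");
console.log(arn);
// { partition: "aws", service: "iam", region: "", accountId: "704479110758",
//   resource: "user/john.doeh@elastic.co" }

console.log(parseArn("john.doeh@elastic.co")); // null
```

The partition (`aws` here) is one possible signal for choosing a provider namespace, though that mapping is an assumption and not something the current implementation is known to do.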

Proposed direction

Investigate and define the correct correlation model for Asset Discovery and similar integrations:

  • Decide whether Asset Discovery should remain a standalone namespace or publish into cloud-provider namespaces.
  • Ensure user entities can be correlated using stable third-party identifiers when available.
  • Review whether getEuidEsqlEvaluation is behaving incorrectly for this scenario, or whether a different utility / data model is needed.
  • Avoid publishing duplicate or unusable entity variants when they cannot participate in correlation.
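
One possible shape for the identifier bullets above, as a sketch under stated assumptions and not a proposal for the final model: prefer a stable third-party identifier when the document carries one, and fall back to a name-based identifier otherwise. `buildUserEntityId` is a hypothetical function and is not `getEuidEsqlEvaluation`. The `user:<identifier>@<namespace>` output shape is inferred from the IDs quoted in this issue, the `user.name` fallback field is an assumption, and both sample documents are made up for illustration.

```typescript
// Flattened document with dotted field names, for illustration only.
type FlatDoc = Record<string, string | undefined>;

// Hypothetical: builds a user entity ID, preferring user.entity.id (the ARN
// in this issue) over an assumed name-based fallback field.
function buildUserEntityId(doc: FlatDoc, namespace: string): string | null {
  const identifier = doc["user.entity.id"] || doc["user.name"];
  if (!identifier) {
    return null; // nothing usable for correlation, so publish no entity
  }
  return `user:${identifier}@${namespace}`;
}

// Made-up documents; only the user.entity.id value comes from this issue.
const assetDiscoveryDoc: FlatDoc = {
  "user.entity.id": "arn:aws:iam::704479110758:user/john.doeh@elastic.co",
};
const cloudTrailDoc: FlatDoc = {
  "user.entity.id": "arn:aws:iam::704479110758:user/john.doeh@elastic.co",
  "user.name": "john.doeh@elastic.co",
};

// Assumes both sources publish into the same provider namespace ("aws").
const idFromAssetDiscovery = buildUserEntityId(assetDiscoveryDoc, "aws");
const idFromCloudTrail = buildUserEntityId(cloudTrailDoc, "aws");

console.log(idFromAssetDiscovery === idFromCloudTrail); // true
```

Returning `null` when no identifier is available mirrors the last bullet: a document that cannot take part in correlation produces no entity instead of an isolated one.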

Acceptance criteria

  • A Cloud Asset Discovery user and related AWS CloudTrail events can correlate to the same entity in Entity Store.
  • The chosen identifier strategy is documented for third-party identities such as AWS ARNs.
  • We do not publish redundant user/generic entities that cannot be used for enrichment.
  • The solution is evaluated against similar integrations beyond Asset Discovery.
