[Asset Discovery] Correlate between Asset Discovery entities and third-party audit logs #4086
Description
Summary
Entity Store v2 currently fails to reliably correlate user entities produced by Cloud Asset Discovery with related third-party audit logs, such as AWS CloudTrail. This leads to duplicate entities and prevents graph enrichment / flyout correlation from working as expected.
Problem
In the reported AWS example, searching for the same actor produces different entity representations:
- A user entity with an id like: user:arn:aws:iam::704479110758:user/john.doeh@elastic.co@asset_discovery
- A generic entity with an id like: arn:aws:iam::704479110758:user/john.doeh@elastic.co@asset_discovery
  arn:aws:iam::704479110758:user/john.doeh@elastic.co
However, the related CloudTrail event is evaluated into a different user id:
user:john.doeh@elastic.co@aws
This means the event cannot correlate with the entity produced from Asset Discovery, even though the event contains:
user.entity.id = "arn:aws:iam::704479110758:user/john.doeh@elastic.co"
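To make the mismatch concrete, here is a minimal sketch. It assumes, based only on the example ids above, that an entity id has the shape [type:]identifier@namespace. parseEntityId and KNOWN_TYPES are hypothetical names used for illustration and are not part of Entity Store.

```typescript
// Hypothetical helper: splits an entity id of the ASSUMED shape
// "[type:]identifier@namespace". The shape is inferred from the example ids
// in this issue; it is not a documented Entity Store format.
interface ParsedEntityId {
  type: string | null;
  identifier: string;
  namespace: string;
}

// Assumption: only the two entity types mentioned in this issue.
const KNOWN_TYPES: string[] = ["user", "generic"];

function parseEntityId(id: string): ParsedEntityId {
  // The namespace is whatever follows the last "@".
  const at = id.lastIndexOf("@");
  if (at < 0) {
    throw new Error("no namespace in id: " + id);
  }
  const namespace = id.slice(at + 1);
  let rest = id.slice(0, at);
  let type: string | null = null;
  for (const t of KNOWN_TYPES) {
    const prefix = t + ":";
    if (rest.indexOf(prefix) === 0) {
      type = t;
      rest = rest.slice(prefix.length);
      break;
    }
  }
  return { type, identifier: rest, namespace };
}

// The ids and the user.entity.id value are copied from this issue.
const fromAssetDiscovery = parseEntityId(
  "user:arn:aws:iam::704479110758:user/john.doeh@elastic.co@asset_discovery"
);
const fromCloudTrail = parseEntityId("user:john.doeh@elastic.co@aws");
const eventUserEntityId = "arn:aws:iam::704479110758:user/john.doeh@elastic.co";

// The Asset Discovery identifier equals the ARN carried by the event...
console.log(fromAssetDiscovery.identifier === eventUserEntityId);
// ...but the id evaluated from the event differs in identifier and namespace.
console.log(fromAssetDiscovery.identifier === fromCloudTrail.identifier);
console.log(fromAssetDiscovery.namespace === fromCloudTrail.namespace);
```

Under these assumptions, the two ids differ in both the identifier part (ARN vs. user name) and the namespace part (asset_discovery vs. aws), so an exact id match cannot succeed even though the ARN is available on both sides.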
Impact
- Graph entities cannot be enriched correctly across integrations.
- Flyouts may not resolve to the expected entity.
- Asset Discovery may create entities that are effectively isolated from related audit log activity.
Discussion / Design Questions
There are two related design gaps surfaced by this case:
- Namespace alignment
  - Cloud Asset Discovery currently appears to publish under its own namespace (asset_discovery).
  - We may instead want it to contribute to provider namespaces such as aws, gcp, and azure/entra_id, similar to how other integrations are grouped.
- Official support for third-party identifiers
  - We appear to be missing a consistent, official way to publish and correlate third-party IDs (for example AWS ARNs).
  - user.id may be one candidate, but current publishing does not preserve the ARN in a way that supports this use case.
  - We should review ECS usage and decide where these identifiers should live so they can be used consistently for correlation.
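One way to picture the namespace-alignment option is the sketch below. It is not current behavior: PROVIDER_NAMESPACE, resolveNamespace, and the fallback to asset_discovery are all hypothetical, and the input is assumed to be an ECS cloud.provider value. Whether Azure identities would land in azure or entra_id is left open by this issue, so the entry for azure is a placeholder.

```typescript
// Hypothetical mapping from a cloud.provider value to the namespace an Asset
// Discovery entity could publish under. The namespace names are taken from
// this issue; the mapping itself is an assumption.
const PROVIDER_NAMESPACE: Record<string, string> = {
  aws: "aws",
  gcp: "gcp",
  azure: "azure", // placeholder: could be "entra_id", undecided in this issue
};

// Assumption: keep today's standalone namespace when the provider is unknown.
const FALLBACK_NAMESPACE = "asset_discovery";

function resolveNamespace(cloudProvider: string | undefined): string {
  if (
    cloudProvider !== undefined &&
    Object.prototype.hasOwnProperty.call(PROVIDER_NAMESPACE, cloudProvider)
  ) {
    return PROVIDER_NAMESPACE[cloudProvider] as string;
  }
  return FALLBACK_NAMESPACE;
}

console.log(resolveNamespace("aws"));
console.log(resolveNamespace("gcp"));
console.log(resolveNamespace(undefined));
```

With a mapping like this, an Asset Discovery user discovered in AWS would share the aws namespace with CloudTrail-derived entities. That alone does not fix correlation, since the identifier part of the id (ARN vs. user name) would still differ.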
Proposed direction
Investigate and define the correct correlation model for Asset Discovery and similar integrations:
- Decide whether Asset Discovery should remain a standalone namespace or publish into cloud-provider namespaces.
- Ensure user entities can be correlated using stable third-party identifiers when available.
- Review whether getEuidEsqlEvaluation is behaving incorrectly for this scenario, or whether a different utility / data model is needed.
- Avoid publishing duplicate or unusable entity variants when they cannot participate in correlation.
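As a rough illustration of correlating on a stable third-party identifier, the sketch below groups records by the ARN when one is present and falls back to the source-specific id otherwise. It does not reproduce getEuidEsqlEvaluation. SourceRecord, correlationKey, and groupBySource are hypothetical, and it is an assumption that the Asset Discovery record exposes the ARN in the same field as the CloudTrail event (user.entity.id).

```typescript
// Hypothetical record shape, for illustration only.
interface SourceRecord {
  source: string;        // e.g. "asset_discovery" or "cloudtrail"
  userEntityId?: string; // value of user.entity.id when present (an ARN here)
  fallbackId: string;    // the id the source would otherwise produce
}

// Prefer the stable third-party identifier when it is available.
function correlationKey(r: SourceRecord): string {
  return r.userEntityId !== undefined && r.userEntityId !== ""
    ? r.userEntityId
    : r.fallbackId;
}

// Group source names by correlation key.
function groupBySource(records: SourceRecord[]): Record<string, string[]> {
  // Object.create(null) avoids collisions with Object.prototype keys.
  const groups: Record<string, string[]> = Object.create(null);
  for (const r of records) {
    const key = correlationKey(r);
    const bucket = groups[key];
    if (bucket === undefined) {
      groups[key] = [r.source];
    } else {
      bucket.push(r.source);
    }
  }
  return groups;
}

const arn = "arn:aws:iam::704479110758:user/john.doeh@elastic.co";
const groups = groupBySource([
  {
    source: "asset_discovery",
    userEntityId: arn, // assumption: Asset Discovery exposes the ARN here
    fallbackId: "user:" + arn + "@asset_discovery",
  },
  {
    source: "cloudtrail",
    userEntityId: arn, // from the event in this issue
    fallbackId: "user:john.doeh@elastic.co@aws",
  },
]);

console.log(Object.keys(groups).length);
console.log(groups[arn]);
```

In this sketch both records resolve to a single key, the ARN, which is the behavior the acceptance criteria ask for. Records without a third-party identifier keep their existing ids, so sources that never carry an ARN are unaffected.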
Acceptance criteria
- A Cloud Asset Discovery user and related AWS CloudTrail events can correlate to the same entity in Entity Store.
- The chosen identifier strategy is documented for third-party identities such as AWS ARNs.
- We do not publish redundant user/generic entities that cannot be used for enrichment.
- The solution is evaluated against similar integrations beyond Asset Discovery.