[v2] Add session id to user agent string #9498

@hssyoo hssyoo commented May 18, 2025

This PR generates and appends a session id to the user agent string.

Description

A session id is an identifier that represents an AWS CLI usage session (i.e., a specific task or logical grouping of tasks). It has the following properties:

  • It is a hash of a randomly generated host id, the TTY path (if available), and a timestamp
  • It is reused from the cache if less than 30 minutes have passed since it was last used
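A minimal sketch of how such an id could be derived. The function name `generate_session_id`, the `|` separator, and the choice of SHA-256 truncated to 32 hex characters are assumptions made for illustration; this description does not state which hash function the PR uses.

```python
import hashlib
import time
import uuid


def generate_session_id(host_id, tty_path, timestamp):
    """Hash the host id, optional TTY path, and timestamp into a session id."""
    # Join with a separator so ("ab", "c") and ("a", "bc") hash differently.
    parts = [host_id, tty_path or "", str(timestamp)]
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
    # Assumption: truncate to 32 hex characters; the real hash may differ.
    return digest[:32]


# Hypothetical inputs: a fresh uuid4 host id and an example TTY path.
host_id = str(uuid.uuid4())
sid = generate_session_id(host_id, "/dev/ttys001", int(time.time()))
print(f"sid/{sid}")
```

Because the timestamp is part of the input, two sessions started on the same host and TTY at different times get different ids.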

Session ids are cached in the session table of a SQLite database. Each entry in the table consists of:

  • key: Primary key that represents a cache key, which is a hash of the host id + TTY path
  • session_id: The session id associated with a host id + TTY path
  • timestamp: Most recent timestamp of the session id's usage

A single host id is generated using uuid4 and persisted in the host_id table. The primary key is always hardcoded to 0 so that there is only ever one host id and it never changes.
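The two tables described above might look roughly like the following. The column types, the `id` column name in host_id, and the helper `get_or_create_host_id` are assumptions; only the table names, the session columns, and the hardcoded primary key 0 come from this description.

```python
import sqlite3
import uuid

# Assumed schema: column types and the host_id "id" column are illustrative.
SCHEMA = """
CREATE TABLE IF NOT EXISTS session (
    key TEXT PRIMARY KEY,
    session_id TEXT NOT NULL,
    timestamp INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS host_id (
    key INTEGER PRIMARY KEY,
    id TEXT NOT NULL
);
"""


def get_or_create_host_id(conn):
    """Return the single persisted host id, creating it on first use."""
    # INSERT OR IGNORE leaves an existing row with key 0 untouched,
    # so the first host id written is never overwritten.
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO host_id (key, id) VALUES (0, ?)",
            (str(uuid.uuid4()),),
        )
    return conn.execute("SELECT id FROM host_id WHERE key = 0").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
first = get_or_create_host_id(conn)
second = get_or_create_host_id(conn)
print(first == second)
```

Pinning the primary key to a constant means a second insert can only collide with the existing row, which is what keeps the host id stable across invocations.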

A cached session id is considered expired once at least 30 minutes have passed since its timestamp. On every AWS CLI invocation, stale entries are pruned from the cache by deleting all rows whose timestamp is less than the current timestamp minus 60*30 seconds. The cleanup runs in a daemon thread, which terminates as soon as the main CLI process exits, so the CLI never blocks on cleanup.

To determine the session id, the CLI first checks the cache:

  • If cache hit, check if the timestamp is expired
    • If not expired, update only the timestamp and write it back to table
    • If expired, generate and update session id and timestamp and write it back to table
  • If cache miss, generate a new session id and write it to table
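The lookup flow above can be sketched as follows. The names `resolve_session_id` and `_hash` and the hashing details are assumptions; the branch structure mirrors the list (hit and fresh, hit and expired, miss).

```python
import hashlib
import sqlite3
import time

SESSION_TTL_SECONDS = 60 * 30


def _hash(*parts):
    # Assumed hashing scheme, for illustration only.
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()[:32]


def resolve_session_id(conn, host_id, tty_path, now=None):
    """Return a cached session id if still fresh, else generate a new one."""
    now = int(time.time()) if now is None else now
    cache_key = _hash(host_id, tty_path or "")
    row = conn.execute(
        "SELECT session_id, timestamp FROM session WHERE key = ?", (cache_key,)
    ).fetchone()
    if row is not None and now - row[1] < SESSION_TTL_SECONDS:
        # Cache hit, not expired: keep the id, only the timestamp changes.
        session_id = row[0]
    else:
        # Cache miss or expired entry: generate a fresh id.
        session_id = _hash(host_id, tty_path or "", str(now))
    with conn:
        # One upsert covers all three branches.
        conn.execute(
            "INSERT OR REPLACE INTO session (key, session_id, timestamp) "
            "VALUES (?, ?, ?)",
            (cache_key, session_id, now),
        )
    return session_id


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE session (key TEXT PRIMARY KEY, session_id TEXT, timestamp INTEGER)"
)
first = resolve_session_id(conn, "host", "/dev/pts/0", now=1_000)
reused = resolve_session_id(conn, "host", "/dev/pts/0", now=1_060)
rotated = resolve_session_id(
    conn, "host", "/dev/pts/0", now=1_060 + SESSION_TTL_SECONDS
)
print(first == reused, first != rotated)
```

In this sketch the second call lands inside the 30 minute window and reuses the id, while the third call arrives exactly 30 minutes after the last use and therefore rotates it.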

The session id is appended to the user agent string under the sid prefix, e.g.:

sid/090bcefb7f7cd3a77d9db442e378c0e1

Caveats

  • Splitting sessions after 30 minutes of inactivity is an arbitrary choice, and adjusting the threshold requires a client code update. Future improvements could involve a smarter way to identify unique sessions that does not rely on timestamps.
  • The TTY path is used to generate a session id because we want different screens to represent different sessions. There is no reliable way to get an equivalent on Windows machines, so a Windows host will only ever have one active session id at a time.
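One way the TTY lookup and its Windows fallback might be written. `get_tty_path` is a hypothetical helper and the use of stdin's file descriptor is an assumption; `os.ttyname` is only available on Unix, which is why the function returns None elsewhere.

```python
import os
import sys


def get_tty_path():
    """Return the terminal device path for stdin, or None if unavailable."""
    # os.ttyname exists only on Unix; on Windows fall back to None so
    # every invocation on the host shares one cache key.
    if not hasattr(os, "ttyname"):
        return None
    try:
        return os.ttyname(sys.stdin.fileno())
    except (OSError, ValueError, AttributeError):
        # stdin is not a terminal (e.g. piped input), is closed, or has
        # been replaced by an object without a real file descriptor.
        return None


print(get_tty_path())
```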

Why SQLite over files?

Caching session data in files can lead to race conditions when multiple processes read from and write to the same files. We could work around this either by implementing cross-file locks or by writing to temporary files and moving them into place, but both approaches add implementation complexity.

SQLite offers atomic primitives off the shelf, along with a few useful features:

  • Write-ahead logging for improved concurrency
  • INSERT OR REPLACE INTO for simpler writes
  • DELETE ... WHERE for bulk deletes
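A short sketch of these three features using Python's sqlite3 module. The table layout and sample values are illustrative, and a temporary file is used because write-ahead logging does not apply to in-memory databases.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "session.db")
conn = sqlite3.connect(db_path)

# Write-ahead logging: readers are not blocked while another
# connection is writing. The pragma returns the resulting mode.
journal_mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

conn.execute(
    "CREATE TABLE session (key TEXT PRIMARY KEY, session_id TEXT, timestamp INTEGER)"
)

with conn:
    # INSERT OR REPLACE: the second write to "k1" replaces the first,
    # with no separate existence check or UPDATE statement.
    conn.execute("INSERT OR REPLACE INTO session VALUES (?, ?, ?)", ("k1", "sid-a", 100))
    conn.execute("INSERT OR REPLACE INTO session VALUES (?, ?, ?)", ("k1", "sid-b", 200))
    conn.execute("INSERT OR REPLACE INTO session VALUES (?, ?, ?)", ("k2", "sid-c", 50))

with conn:
    # DELETE ... WHERE: one statement removes every row below the cutoff.
    conn.execute("DELETE FROM session WHERE timestamp < ?", (100,))

rows = conn.execute("SELECT key, session_id, timestamp FROM session").fetchall()
print(journal_mode, rows)
```

Each `with conn:` block runs as a single transaction, so other processes see either all of its writes or none of them.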

@hssyoo hssyoo force-pushed the session-id branch 3 times, most recently from 42ded3a to 82bdf95 Compare May 22, 2025 14:45
@hssyoo hssyoo marked this pull request as ready for review May 22, 2025 15:35