Posix IO support #2

@qbacpey

Description

TODO list for POSIX functions:

  • Helper functions
    • TMIO_DECL macro: provides the appropriate wrapper function names
    • TMIO_FORWARD_DECL macro: provides forward declarations for wrapped functions
    • MAP_OR_FAIL macro: obtains the address of the actual underlying function (__real_foo for a wrapped foo) when TMIO is loaded via LD_PRELOAD
  • Sync IO functions
    • open/open64
    • __open_2
    • openat/openat64
    • close
    • read
    • write
    • pread/pread64
    • pwrite/pwrite64
    • ftruncate
    • File not closed warning in IOtrace::Summary(void)
    • readv
    • writev
    • preadv/preadv64
    • pwritev/pwritev64
  • Async IO functions
    • aio_read/aio_read64
    • aio_write/aio_write64
    • lio_listio/lio_listio64:
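      • Batch & Individual Mode: a preprocessor flag, BATCH_LIO, controls whether all I/O requests in one lio_listio call are merged into a single request.
      • Error handling: according to the lio_listio documentation, individual requests may fail or may never have been initiated, but TMIO cannot tell the exact state of each request. When lio_listio reports an error, TMIO prints an error message to the console.
      • AIO_LISTIO_MAX: the system limit on how many I/O requests a single lio_listio list may contain. On many systems it is reported as -1. When queried via sysconf(_SC_AIO_LISTIO_MAX), a return of -1 with errno unchanged means the limit is indeterminate, not that the API is unavailable, so TMIO should not treat -1 as "lio_listio unsupported" without checking errno.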
    • aio_error: TMIO would primarily use aio_error, specifically the call that returns 0 (successful completion), to capture the actual finished time of an individual AIO request. aio_error alone would not be used to determine the "required" finished time, because it does not signal the user's intended wait boundary for a set of operations.
    • aio_suspend: TMIO should treat a call to aio_suspend as an explicit indication that the user is waiting for the specified AIO requests to complete. The point at which aio_suspend is called would therefore be recorded as the required finished time for those requests, provided they are later confirmed complete via aio_error.
    • aio_return/aio_return64:
      • For actual time: aio_return further confirms the completion and could be a point to finalize the recording of the "actual" finished time if not already definitively captured.
      • For required time: if TMIO has not already established a "required" finished time for an AIO request (e.g., through an encompassing aio_suspend), the call to aio_return for that request could serve as a fallback to record it. A preprocessor flag can disable this fallback behavior if needed.
  • Restructure the async_write_xxx and async_read_xxx fields into a single vector field.
  • Rank-to-Call Mapping: link specific MPI ranks to the POSIX AIO calls they generate. This might involve:
    • Enhancing TMIO's data collection to capture MPI rank information alongside POSIX AIO call details if the MPI library passes such context or if it can be inferred.
    • Modifying TMIO's output to include rank-specific information for both MPI and associated POSIX AIO calls, rather than just aggregated data (e.g., total bandwidth).
  • Tests
    • Sanity checks
    • HACC-IO POSIX Async IO Benchmark
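
For the helper macros in the list above, a minimal sketch of how the three could fit together for one wrapped function (write). The macro bodies and the sketch_bytes_written counter are assumptions made for illustration; they are not taken from TMIO's source. The lookup relies on dlsym with RTLD_NEXT, which is how an LD_PRELOAD library usually reaches the next definition of a symbol.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes RTLD_NEXT in <dlfcn.h> on glibc */
#endif
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* If a system header was processed before _GNU_SOURCE took effect, glibc
 * hides RTLD_NEXT; this fallback is the value glibc and musl use for it. */
#ifndef RTLD_NEXT
#define RTLD_NEXT ((void *) -1L)
#endif

/* Assumed definitions for illustration, not TMIO's actual macros. */
#define TMIO_DECL(name) name
#define TMIO_FORWARD_DECL(name, ret, args) \
    static ret (*__real_##name) args = NULL
#define MAP_OR_FAIL(name)                                                  \
    do {                                                                   \
        if (__real_##name == NULL) {                                       \
            /* Look up the next definition of the symbol after this one */ \
            *(void **)(&__real_##name) = dlsym(RTLD_NEXT, #name);          \
            if (__real_##name == NULL) {                                   \
                fprintf(stderr, "TMIO sketch: cannot map %s\n", #name);    \
                exit(EXIT_FAILURE);                                        \
            }                                                              \
        }                                                                  \
    } while (0)

/* Declares: static ssize_t (*__real_write)(int, const void *, size_t) */
TMIO_FORWARD_DECL(write, ssize_t, (int fd, const void *buf, size_t count));

/* Hypothetical counter standing in for TMIO's real bookkeeping. */
static size_t sketch_bytes_written = 0;

/* The wrapper keeps the original name, so it shadows the libc symbol. */
ssize_t TMIO_DECL(write)(int fd, const void *buf, size_t count)
{
    MAP_OR_FAIL(write);
    ssize_t ret = __real_write(fd, buf, count);
    if (ret > 0)
        sketch_bytes_written += (size_t)ret;
    return ret;
}
```

Built as a shared object and loaded through LD_PRELOAD, a wrapper like this would sit in front of the C library's write. On older glibc versions the build may also need to link against libdl.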
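
The BATCH_LIO item for lio_listio could be sketched as follows. The function names and the idea of recording only request sizes are assumptions for illustration; how TMIO actually stores a request is not specified here. Both modes are written as ordinary functions, and the preprocessor flag only selects which one the hypothetical sketch_lio_collect name refers to.

```c
#include <aio.h>
#include <stddef.h>

/* Individual mode: one recorded size per matching request in the list.
 * NULL entries and entries with a different opcode (including LIO_NOP)
 * are skipped. Returns the number of sizes written to `sizes`. */
size_t sketch_lio_individual(struct aiocb *const list[], int nent, int opcode,
                             size_t *sizes, size_t cap)
{
    size_t n = 0;
    for (int i = 0; i < nent; i++) {
        if (list[i] == NULL || list[i]->aio_lio_opcode != opcode)
            continue;
        if (n < cap)
            sizes[n++] = list[i]->aio_nbytes;
    }
    return n;
}

/* Batch mode: all matching requests are merged into one recorded size. */
size_t sketch_lio_batch(struct aiocb *const list[], int nent, int opcode,
                        size_t *sizes, size_t cap)
{
    size_t total = 0;
    int seen = 0;
    for (int i = 0; i < nent; i++) {
        if (list[i] == NULL || list[i]->aio_lio_opcode != opcode)
            continue;
        total += list[i]->aio_nbytes;
        seen = 1;
    }
    if (!seen || cap == 0)
        return 0;
    sizes[0] = total;
    return 1;
}

/* The compile-time flag picks the mode used by the tracing code. */
#ifdef BATCH_LIO
#define sketch_lio_collect sketch_lio_batch
#else
#define sketch_lio_collect sketch_lio_individual
#endif
```

Reads and writes are collected separately here by passing LIO_READ or LIO_WRITE as the opcode, which is one possible reading of "merge all requests"; a list that mixes both directions would then yield one batch record per direction.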
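
The timing rules described for aio_error, aio_suspend and aio_return can be summarized as a small per-request state update. This is a sketch under assumptions: the struct, the function names, the SKETCH_NO_RETURN_FALLBACK flag, and the choice to keep the first aio_suspend call as the required time are all hypothetical. Timestamps are passed in as plain doubles so the logic stays independent of any clock source.

```c
#include <errno.h>

/* Hypothetical per-request record; a negative value means "not recorded". */
typedef struct {
    double actual_end;    /* when the request was observed to be complete */
    double required_end;  /* when the user started waiting for it */
} sketch_aio_times;

sketch_aio_times sketch_times_init(void)
{
    sketch_aio_times t = { -1.0, -1.0 };
    return t;
}

/* aio_error: only a return value of 0 (completed successfully) fixes the
 * actual finished time; EINPROGRESS and error codes leave it untouched. */
void sketch_on_aio_error(sketch_aio_times *t, int err, double now)
{
    if (err == 0 && t->actual_end < 0)
        t->actual_end = now;
}

/* aio_suspend: the user is explicitly waiting, so the call time becomes
 * the required finished time (assumption: the first call wins). */
void sketch_on_aio_suspend(sketch_aio_times *t, double now)
{
    if (t->required_end < 0)
        t->required_end = now;
}

/* aio_return: finalizes the actual time if still open and, unless the
 * fallback is compiled out, also supplies a missing required time. */
void sketch_on_aio_return(sketch_aio_times *t, double now)
{
    if (t->actual_end < 0)
        t->actual_end = now;
#ifndef SKETCH_NO_RETURN_FALLBACK
    if (t->required_end < 0)
        t->required_end = now;
#endif
}
```

For a request that is polled while in progress, waited on with aio_suspend, confirmed with aio_error and then collected with aio_return, the required time comes from the aio_suspend call and the actual time from the successful aio_error call. A request that is only ever collected with aio_return gets both times from that call when the fallback is enabled.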
