Checking performance of S3 compatible systems with QSFS backing #46

@scottyeager

This thread is meant to answer whether QSFS is suitable as backing storage for an open source S3-compatible storage solution. Code and instructions to run the tests are available here.

To start, I ran the warp benchmark against Garage with both Zdbfs and raw SSD as backing storage. Here's a summary of the results (further discussion and full results below).

Garage Zdbfs vs Raw SSD

  1. Throughput (Higher is better)
    • Raw SSD showed slightly better performance in all operations (values are Raw SSD vs Zdbfs):
    • DELETE: 2.86 vs 2.70 obj/s
    • GET: 128.62 vs 121.30 MiB/s
    • PUT: 43.02 vs 40.63 MiB/s
    • STAT: 8.53 vs 8.10 obj/s
    • Total: 171.64 vs 161.92 MiB/s

  2. Latency (Lower is better)
    • Raw SSD was consistently faster (values are Raw SSD vs Zdbfs):
    • GET: 971.7ms vs 1020.4ms avg
    • PUT: 1684.7ms vs 1776.2ms avg
    • DELETE (29.2ms vs 31.8ms avg) and STAT (24.8ms vs 26.6ms avg) differences were smaller in absolute terms but still favored Raw SSD

Overall, the difference is minor: Zdbfs came in roughly 5 to 6 percent behind raw SSD on throughput and on GET/PUT latency. This suggests that Zdbfs won't be a significant bottleneck for Garage.
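To put a number on "minor", here is a small sketch that computes the relative change from raw SSD to Zdbfs. The figures are the averages copied from the warp reports in this thread; the function and dictionary names are made up for illustration and are not part of the linked test code.

```python
# Averages copied from the Garage warp reports in this thread: (raw SSD, Zdbfs).
THROUGHPUT = {
    "DELETE obj/s": (2.86, 2.70),
    "GET MiB/s": (128.62, 121.30),
    "PUT MiB/s": (43.02, 40.63),
    "STAT obj/s": (8.53, 8.10),
    "Total MiB/s": (171.64, 161.92),
}
LATENCY_MS = {
    "GET": (971.7, 1020.4),
    "PUT": (1684.7, 1776.2),
    "DELETE": (29.2, 31.8),
    "STAT": (24.8, 26.6),
}


def relative_change(baseline: float, candidate: float) -> float:
    """Percent change of candidate relative to baseline (negative means lower)."""
    return (candidate - baseline) / baseline * 100.0


if __name__ == "__main__":
    for name, (ssd, zdbfs) in THROUGHPUT.items():
        print(f"{name:<13} throughput on Zdbfs: {relative_change(ssd, zdbfs):+.1f}%")
    for name, (ssd, zdbfs) in LATENCY_MS.items():
        print(f"{name:<13} avg latency on Zdbfs: {relative_change(ssd, zdbfs):+.1f}%")
```

DELETE and STAT latency look larger in relative terms only because the absolute values are a few tens of milliseconds; the gap itself is 2 to 3 ms.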

MinIO

I also tried to do the same comparison with MinIO. However, MinIO on top of Zdbfs produced a high rate of errors: most GET and STAT requests failed with "The specified key does not exist", and I canceled the run early. This could be due to MinIO's heavier reliance on the filesystem. I didn't investigate further.
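For a rough sense of how widespread the errors were, the following sketch pulls per-operation error rates out of warp's text report. It assumes the report layout shown in this thread ("Report: OP (N reqs)" followed by an "Average:" line that optionally ends with "N errors"); the function name is hypothetical. Note that the log also has a separate "Errors:" line with a slightly different count, and this sketch only reads the "Average:" line.

```python
import re

# Patterns match the report layout shown in this thread; other warp
# versions may format these lines differently.
REPORT_RE = re.compile(r"Report: (\w+) \((\d+) reqs\)")
ERRORS_RE = re.compile(r"Average:.*?,\s*(\d+) errors")


def error_rates(report_text: str) -> dict:
    """Map each operation to the fraction of its requests that errored."""
    rates = {}
    current_op, current_reqs = None, 0
    for line in report_text.splitlines():
        header = REPORT_RE.search(line)
        if header:
            current_op, current_reqs = header.group(1), int(header.group(2))
            rates[current_op] = 0.0  # stays 0.0 if no error count follows
            continue
        errors = ERRORS_RE.search(line)
        if errors and current_op and current_reqs:
            rates[current_op] = int(errors.group(1)) / current_reqs
    return rates


# Lines copied from the MinIO-on-Zdbfs report in this thread.
SAMPLE = """\
Report: GET (5007 reqs). Ran Duration: 4m2s, starting 23:20:14 UTC
 * Average: 206.18 MiB/s, 20.62 obj/s, 4290 errors (242s)
Report: PUT (1667 reqs). Ran Duration: 4m2s, starting 23:20:14 UTC
 * Average: 68.23 MiB/s, 6.82 obj/s (242s)
Report: STAT (3326 reqs). Ran Duration: 4m1s, starting 23:20:14 UTC
 * Average: 13.69 obj/s, 2842 errors (241s)
"""

if __name__ == "__main__":
    for op, rate in error_rates(SAMPLE).items():
        print(f"{op}: {rate:.1%} of requests errored")
```

With error rates this high, the MinIO-on-Zdbfs throughput and latency averages mostly reflect failed requests and shouldn't be compared with the other runs.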

I included the results for both MinIO tests below anyway, but note that this test was not meant to compare Garage and MinIO in a scientific way. That said, MinIO does appear to have better performance in certain cases, as acknowledged by the Garage team.

Considerations and potential further testing

For this purpose, it's enough to run Zdbfs on its own. We are interested in best-case performance, when all data is in the local cache. Retrieving offloaded data will of course be slower, but that cost isn't specific to the S3 solution itself.

Metadata vs data

A primary distinction between Garage and MinIO is that Garage has a metadata system that uses an in-process database (typically SQLite) to record metadata. MinIO, on the other hand, only uses the filesystem directly. This could mean that MinIO amplifies any performance loss incurred by using Zdbfs.

In this test, I put both Garage's data and its metadata in Zdbfs. It might be smarter to store the metadata outside Zdbfs and periodically back it up to QSFS. This could improve performance and also make more efficient use of the storage system.
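As a sketch of what the periodic backup could look like, the snippet below snapshots a live SQLite file into a directory on the QSFS mount using Python's `sqlite3` online backup API. It assumes the metadata lives in a single SQLite database file; the function name and the example paths are hypothetical, and Garage's own tooling may be a better fit if it provides a snapshot mechanism.

```python
import sqlite3
import time
from pathlib import Path


def snapshot_sqlite(db_path: Path, backup_dir: Path) -> Path:
    """Copy a (possibly in-use) SQLite database into backup_dir.

    Uses SQLite's online backup API, which produces a consistent copy
    even if another process is writing to the source database.
    """
    if not db_path.is_file():
        raise FileNotFoundError(db_path)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    target = backup_dir / f"{db_path.stem}-{stamp}.sqlite"
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(target)
    try:
        src.backup(dst)  # copies every page of the source into the target
    finally:
        dst.close()
        src.close()
    return target


# Hypothetical usage, e.g. from a cron job or systemd timer:
#   snapshot_sqlite(Path("/var/lib/garage/meta/db.sqlite"),
#                   Path("/mnt/qsfs/garage-meta-backups"))
```

Each snapshot is written as a new timestamped file, so old copies would need to be pruned separately.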

Potential candidate S3 solutions

  • Garage
  • MinIO
  • Versity Gateway (a simple translator from filesystem to S3, which might fit well)
  • Ceph, Swift, OpenIO, Riak CS, SeaweedFS, ...?

Garage Full Benchmark

Raw SSD

Report: DELETE (860 reqs). Ran Duration: 4m57s, starting 22:15:01 UTC
 * Objects per request: 1. Concurrency: 20.
 * Average: 2.86 obj/s (297s)
 * Reqs: Avg: 29.2ms, 50%: 23.9ms, 90%: 62.5ms, 99%: 110.0ms, Fastest: 3.3ms, Slowest: 219.5ms, StdDev: 23.6ms

Throughput, split into 297 x 1s:
 * Fastest: 9.36 obj/s (1s, starting 22:16:14 UTC)
 * 50% Median: 2.00 obj/s (1s, starting 22:15:41 UTC)
 * Slowest: 0.00 obj/s (1s, starting 22:17:52 UTC)

──────────────────────────────────

Report: GET (3871 reqs). Ran Duration: 4m58s, starting 22:15:01 UTC
 * Objects per request: 1. Size: 10485760 bytes. Concurrency: 20.
 * Average: 128.62 MiB/s, 12.86 obj/s (298s)
 * Reqs: Avg: 971.7ms, 50%: 973.4ms, 90%: 1283.7ms, 99%: 1585.0ms, Fastest: 442.0ms, Slowest: 1974.1ms, StdDev: 234.0ms
 * TTFB: Avg: 118ms, Best: 35ms, 25th: 83ms, Median: 107ms, 75th: 144ms, 90th: 186ms, 99th: 309ms, Worst: 427ms StdDev: 50ms

Throughput, split into 298 x 1s:
 * Fastest: 290.1MiB/s, 29.01 obj/s (1s, starting 22:15:47 UTC)
 * 50% Median: 119.9MiB/s, 11.99 obj/s (1s, starting 22:16:31 UTC)
 * Slowest: 42.8MiB/s, 4.28 obj/s (1s, starting 22:15:38 UTC)

──────────────────────────────────

Report: PUT (1293 reqs). Ran Duration: 4m58s, starting 22:15:01 UTC
 * Objects per request: 1. Size: 10485760 bytes. Concurrency: 20.
 * Average: 43.02 MiB/s, 4.30 obj/s (298s)
 * Reqs: Avg: 1684.7ms, 50%: 1656.1ms, 90%: 2135.8ms, 99%: 2421.9ms, Fastest: 1003.9ms, Slowest: 3068.3ms, StdDev: 304.7ms

Throughput, split into 298 x 1s:
 * Fastest: 62.9MiB/s, 6.29 obj/s (1s, starting 22:16:13 UTC)
 * 50% Median: 43.7MiB/s, 4.37 obj/s (1s, starting 22:18:24 UTC)
 * Slowest: 15.2MiB/s, 1.52 obj/s (1s, starting 22:19:01 UTC)

──────────────────────────────────

Report: STAT (2578 reqs). Ran Duration: 4m57s, starting 22:15:01 UTC
 * Objects per request: 1. Concurrency: 20.
 * Average: 8.53 obj/s (297s)
 * Reqs: Avg: 24.8ms, 50%: 19.0ms, 90%: 54.2ms, 99%: 104.3ms, Fastest: 2.5ms, Slowest: 163.6ms, StdDev: 20.7ms

Throughput, split into 297 x 1s:
 * Fastest: 23.01 obj/s (1s, starting 22:15:47 UTC)
 * 50% Median: 8.50 obj/s (1s, starting 22:15:06 UTC)
 * Slowest: 0.00 obj/s (1s, starting 22:17:57 UTC)


──────────────────────────────────

Report: Total (8602 reqs). Ran Duration: 4m58s, starting 22:15:01 UTC
 * Objects per request: 1. Size: 6294869 bytes. Concurrency: 20.
 * Average: 171.64 MiB/s, 28.57 obj/s (298s)

Throughput, split into 298 x 1s:
 * Fastest: 306.8MiB/s, 56.68 obj/s (1s, starting 22:18:06 UTC)
 * 50% Median: 164.5MiB/s, 22.50 obj/s (1s, starting 22:18:51 UTC)
 * Slowest: 103.4MiB/s, 11.34 obj/s (1s, starting 22:15:38 UTC)

Zdbfs

Report: DELETE (813 reqs). Ran Duration: 4m57s, starting 22:41:06 UTC
 * Objects per request: 1. Concurrency: 20.
 * Average: 2.70 obj/s (297s)
 * Reqs: Avg: 31.8ms, 50%: 27.2ms, 90%: 70.0ms, 99%: 103.0ms, Fastest: 3.1ms, Slowest: 198.7ms, StdDev: 24.2ms

Throughput, split into 297 x 1s:
 * Fastest: 9.00 obj/s (1s, starting 22:42:49 UTC)
 * 50% Median: 2.11 obj/s (1s, starting 22:41:22 UTC)
 * Slowest: 0.00 obj/s (1s, starting 22:41:20 UTC)

──────────────────────────────────

Report: GET (3652 reqs). Ran Duration: 4m58s, starting 22:41:06 UTC
 * Objects per request: 1. Size: 10485760 bytes. Concurrency: 20.
 * Average: 121.30 MiB/s, 12.13 obj/s (298s)
 * Reqs: Avg: 1020.4ms, 50%: 1015.1ms, 90%: 1353.9ms, 99%: 1628.6ms, Fastest: 403.1ms, Slowest: 2008.3ms, StdDev: 243.6ms
 * TTFB: Avg: 125ms, Best: 21ms, 25th: 86ms, Median: 114ms, 75th: 155ms, 90th: 198ms, 99th: 334ms, Worst: 502ms StdDev: 54ms

Throughput, split into 298 x 1s:
 * Fastest: 267.7MiB/s, 26.77 obj/s (1s, starting 22:42:30 UTC)
 * 50% Median: 114.1MiB/s, 11.41 obj/s (1s, starting 22:43:56 UTC)
 * Slowest: 41.4MiB/s, 4.14 obj/s (1s, starting 22:42:21 UTC)

──────────────────────────────────

Report: PUT (1219 reqs). Ran Duration: 4m58s, starting 22:41:06 UTC
 * Objects per request: 1. Size: 10485760 bytes. Concurrency: 20.
 * Average: 40.63 MiB/s, 4.06 obj/s (298s)
 * Reqs: Avg: 1776.2ms, 50%: 1757.8ms, 90%: 2236.1ms, 99%: 2590.2ms, Fastest: 1055.4ms, Slowest: 3379.5ms, StdDev: 315.8ms

Throughput, split into 298 x 1s:
 * Fastest: 61.7MiB/s, 6.17 obj/s (1s, starting 22:41:09 UTC)
 * 50% Median: 41.9MiB/s, 4.19 obj/s (1s, starting 22:42:26 UTC)
 * Slowest: 13.0MiB/s, 1.30 obj/s (1s, starting 22:43:29 UTC)

──────────────────────────────────

Report: STAT (2442 reqs). Ran Duration: 4m57s, starting 22:41:06 UTC
 * Objects per request: 1. Concurrency: 20.
 * Average: 8.10 obj/s (297s)
 * Reqs: Avg: 26.6ms, 50%: 20.7ms, 90%: 55.6ms, 99%: 127.1ms, Fastest: 2.5ms, Slowest: 223.7ms, StdDev: 22.8ms

Throughput, split into 297 x 1s:
 * Fastest: 24.15 obj/s (1s, starting 22:45:38 UTC)
 * 50% Median: 8.00 obj/s (1s, starting 22:43:33 UTC)
 * Slowest: 0.00 obj/s (1s, starting 22:42:21 UTC)


──────────────────────────────────

Report: Total (8126 reqs). Ran Duration: 4m58s, starting 22:41:06 UTC
 * Objects per request: 1. Size: 6285520 bytes. Concurrency: 20.
 * Average: 161.92 MiB/s, 26.97 obj/s (298s)

Throughput, split into 298 x 1s:
 * Fastest: 288.8MiB/s, 56.61 obj/s (1s, starting 22:42:30 UTC)
 * 50% Median: 156.1MiB/s, 24.61 obj/s (1s, starting 22:41:25 UTC)
 * Slowest: 98.3MiB/s, 13.30 obj/s (1s, starting 22:44:51 UTC)

MinIO Full Benchmark

Zdbfs

Note: run canceled early due to high prevalence of errors.

Report: DELETE (1107 reqs). Ran Duration: 4m0s, starting 23:20:14 UTC
 * Objects per request: 1. Concurrency: 20.
 * Average: 4.57 obj/s (240s)
 * Reqs: Avg: 172.3ms, 50%: 91.9ms, 90%: 549.6ms, 99%: 1100.3ms, Fastest: 2.0ms, Slowest: 1719.6ms, StdDev: 260.8ms

Throughput, split into 240 x 1s:
 * Fastest: 15.83 obj/s (1s, starting 23:24:07 UTC)
 * 50% Median: 4.28 obj/s (1s, starting 23:22:08 UTC)
 * Slowest: 0.00 obj/s (1s, starting 23:23:54 UTC)

──────────────────────────────────

Report: GET (5007 reqs). Ran Duration: 4m2s, starting 23:20:14 UTC
 * Objects per request: 1. Size: 10485760 bytes. Concurrency: 20.
 * Average: 206.18 MiB/s, 20.62 obj/s, 4290 errors (242s)
 * Errors: 4300
 - First Errors:
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
 * Reqs: Avg: 308.0ms, 50%: 154.8ms, 90%: 899.0ms, 99%: 1834.4ms, Fastest: 3.3ms, Slowest: 2899.0ms, StdDev: 441.9ms
 * TTFB: Avg: 269ms, Best: 25ms, 25th: 163ms, Median: 257ms, 75th: 366ms, 90th: 533ms, 99th: 669ms, Worst: 1.136s StdDev: 160ms

Throughput, split into 242 x 1s:
 * Fastest: 620.0MiB/s, 62.00 obj/s (1s, starting 23:23:38 UTC)
 * 50% Median: 198.0MiB/s, 19.80 obj/s (1s, starting 23:22:25 UTC)
 * Slowest: 10.0MiB/s, 1.00 obj/s (1s, starting 23:23:39 UTC)

──────────────────────────────────

Report: PUT (1667 reqs). Ran Duration: 4m2s, starting 23:20:14 UTC
 * Objects per request: 1. Size: 10485760 bytes. Concurrency: 20.
 * Average: 68.23 MiB/s, 6.82 obj/s (242s)
 * Reqs: Avg: 1955.8ms, 50%: 2013.3ms, 90%: 2530.3ms, 99%: 2818.1ms, Fastest: 471.1ms, Slowest: 3608.3ms, StdDev: 467.1ms

Throughput, split into 242 x 1s:
 * Fastest: 112.5MiB/s, 11.25 obj/s (1s, starting 23:22:52 UTC)
 * 50% Median: 70.5MiB/s, 7.05 obj/s (1s, starting 23:20:40 UTC)
 * Slowest: 11.3MiB/s, 1.13 obj/s (1s, starting 23:20:22 UTC)

──────────────────────────────────

Report: STAT (3326 reqs). Ran Duration: 4m1s, starting 23:20:14 UTC
 * Objects per request: 1. Concurrency: 20.
 * Average: 13.69 obj/s, 2842 errors (241s)
 * Errors: 2856
 - First Errors:
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
 * Reqs: Avg: 67.8ms, 50%: 51.7ms, 90%: 146.6ms, 99%: 308.1ms, Fastest: 2.2ms, Slowest: 592.4ms, StdDev: 59.4ms

Throughput, split into 241 x 1s:
 * Fastest: 43.10 obj/s (1s, starting 23:23:21 UTC)
 * 50% Median: 12.57 obj/s (1s, starting 23:22:42 UTC)
 * Slowest: 0.00 obj/s (1s, starting 23:23:32 UTC)


──────────────────────────────────

Report: Total (11107 reqs). Ran Duration: 4m2s, starting 23:20:14 UTC
 * Objects per request: 1. Size: 6300707 bytes. Concurrency: 20.
 * Average: 274.40 MiB/s, 45.69 obj/s, 7142 errors (242s)
 * Errors: 7156
 - First Errors:
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.
   * The specified key does not exist.

Throughput, split into 242 x 1s:
 * Fastest: 706.0MiB/s, 119.90 obj/s (1s, starting 23:23:38 UTC)
 * 50% Median: 265.9MiB/s, 46.99 obj/s (1s, starting 23:20:52 UTC)
 * Slowest: 85.3MiB/s, 16.12 obj/s (1s, starting 23:20:13 UTC)

Raw SSD

Report: DELETE (1585 reqs). Ran Duration: 4m57s, starting 23:30:58 UTC
 * Objects per request: 1. Concurrency: 20.
 * Average: 5.29 obj/s (297s)
 * Reqs: Avg: 31.2ms, 50%: 29.8ms, 90%: 48.1ms, 99%: 76.0ms, Fastest: 4.2ms, Slowest: 125.6ms, StdDev: 12.9ms

Throughput, split into 297 x 1s:
 * Fastest: 14.00 obj/s (1s, starting 23:34:00 UTC)
 * 50% Median: 5.00 obj/s (1s, starting 23:35:15 UTC)
 * Slowest: 0.00 obj/s (1s, starting 23:31:32 UTC)

──────────────────────────────────

Report: GET (7147 reqs). Ran Duration: 4m57s, starting 23:30:58 UTC
 * Objects per request: 1. Size: 10485760 bytes. Concurrency: 20.
 * Average: 238.11 MiB/s, 23.81 obj/s (297s)
 * Reqs: Avg: 204.2ms, 50%: 201.1ms, 90%: 259.7ms, 99%: 329.7ms, Fastest: 51.0ms, Slowest: 423.6ms, StdDev: 41.4ms
 * TTFB: Avg: 39ms, Best: 3ms, 25th: 27ms, Median: 37ms, 75th: 48ms, 90th: 60ms, 99th: 99ms, Worst: 225ms StdDev: 17ms

Throughput, split into 297 x 1s:
 * Fastest: 421.4MiB/s, 42.14 obj/s (1s, starting 23:34:09 UTC)
 * 50% Median: 234.9MiB/s, 23.49 obj/s (1s, starting 23:35:24 UTC)
 * Slowest: 29.2MiB/s, 2.92 obj/s (1s, starting 23:34:34 UTC)

──────────────────────────────────

Report: PUT (2390 reqs). Ran Duration: 4m57s, starting 23:30:58 UTC
 * Objects per request: 1. Size: 10485760 bytes. Concurrency: 20.
 * Average: 79.46 MiB/s, 7.95 obj/s (297s)
 * Reqs: Avg: 1821.5ms, 50%: 1808.4ms, 90%: 2033.4ms, 99%: 2189.9ms, Fastest: 862.2ms, Slowest: 4493.1ms, StdDev: 141.6ms

Throughput, split into 297 x 1s:
 * Fastest: 100.6MiB/s, 10.06 obj/s (1s, starting 23:33:25 UTC)
 * 50% Median: 80.0MiB/s, 8.00 obj/s (1s, starting 23:34:03 UTC)
 * Slowest: 47.4MiB/s, 4.74 obj/s (1s, starting 23:34:33 UTC)

──────────────────────────────────

Report: STAT (4769 reqs). Ran Duration: 4m57s, starting 23:30:58 UTC
 * Objects per request: 1. Concurrency: 20.
 * Average: 15.87 obj/s (297s)
 * Reqs: Avg: 29.5ms, 50%: 26.8ms, 90%: 47.1ms, 99%: 77.9ms, Fastest: 2.1ms, Slowest: 160.4ms, StdDev: 13.9ms

Throughput, split into 297 x 1s:
 * Fastest: 31.79 obj/s (1s, starting 23:32:54 UTC)
 * 50% Median: 15.16 obj/s (1s, starting 23:33:26 UTC)
 * Slowest: 1.00 obj/s (1s, starting 23:34:34 UTC)


──────────────────────────────────

Report: Total (15891 reqs). Ran Duration: 4m57s, starting 23:30:58 UTC
 * Objects per request: 1. Size: 6293039 bytes. Concurrency: 20.
 * Average: 317.58 MiB/s, 52.91 obj/s (297s)

Throughput, split into 297 x 1s:
 * Fastest: 480.2MiB/s, 84.48 obj/s (1s, starting 23:34:09 UTC)
 * 50% Median: 313.4MiB/s, 51.14 obj/s (1s, starting 23:35:24 UTC)
 * Slowest: 94.5MiB/s, 10.45 obj/s (1s, starting 23:34:34 UTC)
