In this project, you will implement a readers/writer lock with writer priority. Your tasks are to:
- Implement the lock API in
src/lock.c. - Run the benchmark to verify correctness of your implementation and report performance metrics.
We provide a reference.txt which provides performance numbers of our solution. You can use this as a guideline to see if there is room for improvement in your solution. The reference.txt was generated by running the run_benchmark.sh script. Use the run_benchmark.sh script to report performance numbers in your report.
The makefile builds with clang or clang++ and should work out-of-the-box on CADE.
The test file is benchmark.c. To build, run
$ make benchmarkTo run the benchmark
$ ./benchmark 100 5 10000 100The arguments to the benchmark are, in order:
- nreaders: How many reader threads to launch
- nwriters: How many writer threads to launch
- nitems: # of items in the array
- niters: # of iterations. In each iteration a thread acquires the lock once.
All arguments must be >= 1.
The initial version of the code will panic on detecting a race condition as there is no lock implemented.
$ ./benchmark 100 5 1000000 10000
Running benchmark with 100 readers, 5 writers, 1000000 items, 10000 iterations
Panic: Readers and Writers are both in lock.: Success
Aborted (core dumped)If your implementation of a lock is correct, the benchmark will report performance numbers.
$ ./benchmark 100 5 1000000 10000
Running benchmark with 100 readers, 5 writers, 10000 items, 100 iterations
Threads done, stats:
Readers: min 0.000016 ms, max 1.879325 ms, mean 0.008598 ms, std_dev 0.053511
Writers: min 0.000034 ms, max 0.423291 ms, mean 0.011356 ms, std_dev 0.039250Atomic operations are costly as they force threads to stall and synchronize. Look for opportunities to reduce the number of atomic operations you need to use.
For example, the spin lock could be implemented as:
volatile int lock = 0;
void acquire_lock() {
while(__sync_test_and_set(&lock, 1));
}
void release_lock() {
__sync_release(&lock);
}However, the number of atomic operations can be reduced by:
volatile int lock = 0;
void acquire_lock() {
while (__sync_test_and_set(&lock, 1)) {
// Non Atomic read. Ok to read some stale values here.
while (lock);
}
}
void release_lock() {
__sync_release(&lock);
}You can use perf to profile your code to help with you optimizing your code.
In order to profile the code, you will use the perf, a performance profiling tool for Linux.
To record a profile of your benchmark,
$ perf record ./benchmark 100 5 10000 100The perf command will generate a result file perf.data in the folder that you run PERF command. Then you need to use PERF again to analyze the profiling result.
$ perf report