This repository contains implementations of 1D, 2D, and 3D Convolution and Cross-Correlation operations with both Naive and Tiled approaches.
Try the Interactive Visualizer! - An interactive web-based visualization tool to help understand how convolution and cross-correlation algorithms work, with both naive and tiled implementations.
For your convenience, a shell script is provided to compile and run all implementations:
# Change to the coding_practice directory
cd coding_practice
# Make the script executable
chmod +x run_all_tests.sh
# Run the script
./run_all_tests.sh
The script will compile all implementations and place the executables in the bin
directory. It will then present you with a menu to choose which mode to run:
- Template Mode - Practice implementing the algorithms yourself
- Complete Implementations Mode - Run tests with fully implemented algorithms
Alternatively, you can use the provided Makefile:
# Change to the coding_practice directory
cd coding_practice
# Compile all implementations
make
# Compile only template files
make templates
# Compile only implementation files
make implementations
# Clean up all binaries
make clean
# Clean only template binaries
make clean-templates
# Clean only implementation binaries
make clean-implementations
This mode is designed for practice and learning. It allows you to:
- Select one of the algorithms (naive or tiled versions of cross-correlation and convolution)
- Implement the algorithm yourself by filling in the blank loops with the correct indices
- Test your implementation with example inputs
- Optionally view the correct answer if you get stuck
This mode provides access to all the features of the fully implemented algorithms:
- Run default tests with reasonable input sizes
- Optimize tile sizes automatically to find the best performance
- Run performance comparisons between naive and tiled implementations
- Test with custom input values in interactive mode
-
Default Values: The script uses reasonable default values for array sizes and tile sizes, so you don't have to enter them manually.
-
Tile Size Optimization: The script can automatically test multiple tile size combinations to find the optimal configuration for maximum performance.
-
Interactive Mode: If you still want to input your own values, you can use the interactive mode.
-
Template Practice: Write your own implementations and test them against the provided examples.
-
Answer Viewing: If you get stuck implementing an algorithm, you can choose to view the correct answer.
-
Organized Executables: All compiled programs are placed in the
bin
directory to keep the workspace clean. -
Makefile Support: Use
make
commands for building, cleaning, and managing the codebase.
To use these features, select the corresponding option from the menu when running the script:
- Template Mode: Practice implementing algorithms yourself
- Complete Mode: Access optimizations and performance tests
- Options 1-3: Run individual tests with default values
- Options 7-9: Run optimization to find the best tile sizes
- Option 10: Run all optimizations
- Option 0: Interactive mode (prompts for all values)
Cross-correlation is a measure of similarity between two sequences. In signal processing, it's often used to search for a particular pattern within a signal.
- Naive Cross-Correlation: A straightforward implementation that computes each output element directly.
- Tiled Cross-Correlation: An optimized implementation that processes data in tiles to improve cache utilization.
Convolution is a mathematical operation that expresses how the shape of one function is modified by another. In 1D convolution, the kernel is flipped before the operation.
- Naive 1D Convolution: A direct implementation that computes each output element by summing the products of input elements and the flipped kernel.
- Tiled 1D Convolution: An optimized implementation that processes the signal in chunks to improve cache locality.
2D convolution extends the concept to matrices, widely used in image processing for operations like blurring, sharpening, and edge detection. The kernel is flipped both horizontally and vertically.
- Naive 2D Convolution: A straightforward implementation with four nested loops to compute each output pixel.
- Tiled 2D Convolution: An implementation that processes the image in tiles to improve cache performance.
3D convolution further extends the concept to volumetric data, which is essential for processing 3D images, videos (where time is the third dimension), or scientific data like MRI scans. The kernel is flipped in all three dimensions.
- Naive 3D Convolution: A direct implementation with six nested loops (three for output positions, three for kernel positions) to compute each output voxel.
- Tiled 3D Convolution: An optimized implementation that processes the volume in 3D tiles to improve cache locality and performance.
To gain a better understanding of how these algorithms work, the Template Mode allows you to implement them yourself:
-
For each algorithm, you are provided with a template where you need to:
- Fill in the correct loop indices
- Calculate the proper array indices
- Handle special cases (like remainder tiles)
-
The templates include helpful comments that guide you through the implementation.
-
After implementing, you can run the code with example inputs to test your solution.
-
If you get stuck, you can view the complete answer to learn from it.
Each folder contains a comparison program that measures the performance difference between the naive and tiled implementations.
bin/
- Contains all compiled executables (created when you run the script or make)templates/
- Contains template files for practicecross_correlation/
- Cross-correlation implementations1d_convolution/
- 1D convolution implementations2d_convolution/
- 2D convolution implementations
# Compile the naive implementation
gcc -o bin/naive_cross_correlation cross_correlation/naive/cross_correlation.c
# Compile the tiled implementation
gcc -o bin/tiled_cross_correlation cross_correlation/tiled/tiled_cross_correlation.c
# Compile the performance comparison
gcc -o bin/cross_correlation_comparison cross_correlation/cross_correlation_comparison.c
# Run the naive implementation
bin/naive_cross_correlation
# Run the tiled implementation
bin/tiled_cross_correlation
# Run the performance comparison (recommended)
bin/cross_correlation_comparison
For the performance comparison, you'll be prompted to enter:
- Size of array A (e.g., 10000)
- Size of array B (e.g., 1000)
- Tile size for array A (e.g., 100)
- Tile size for array B (e.g., 50)
The program will generate random inputs, verify that both implementations produce the same results, and compare their performance.
# Compile the naive implementation
gcc -o bin/naive_convolution 1d_convolution/naive/convolution.c
# Compile the tiled implementation
gcc -o bin/tiled_convolution 1d_convolution/tiled/tiled_convolution.c
# Compile the performance comparison
gcc -o bin/convolution_comparison 1d_convolution/convolution_comparison.c
# Run the naive implementation
bin/naive_convolution
# Run the tiled implementation
bin/tiled_convolution
# Run the performance comparison (recommended)
bin/convolution_comparison
# Compile the naive implementation
gcc -o bin/convolution_2d 2d_convolution/naive/convolution_2d.c
# Compile the tiled implementation
gcc -o bin/tiled_convolution_2d 2d_convolution/tiled/tiled_convolution_2d.c
# Compile the performance comparison
gcc -o bin/convolution_2d_comparison 2d_convolution/convolution_2d_comparison.c
# Run the naive implementation
bin/convolution_2d
# Run the tiled implementation
bin/tiled_convolution_2d
# Run the performance comparison (recommended)
bin/convolution_2d_comparison
# Compile the naive implementation
gcc -o bin/convolution_3d 3d_convolution/naive/convolution_3d.c
# Compile the tiled implementation
gcc -o bin/tiled_convolution_3d 3d_convolution/tiled/tiled_convolution_3d.c
# Compile the performance comparison
gcc -o bin/convolution_3d_comparison 3d_convolution/convolution_3d_comparison.c
# Run the naive implementation
bin/convolution_3d
# Run the tiled implementation
bin/tiled_convolution_3d
# Run the performance comparison (recommended)
bin/convolution_3d_comparison
The script uses the following default values:
- Array A size: 50,000 elements
- Array B (kernel) size: 2,000 elements
- Tile A size: 64 elements
- Tile B size: 32 elements
- Matrix A size: 500×500
- Kernel B size: 5×5
- Tile height: 32
- Tile width: 32
- Volume A size: 50×50×50
- Kernel B size: 5×5×5
- Tile size: 8×8×8
The optimization feature tests multiple combinations of tile sizes to find the one that yields the best performance. It tests the following tile sizes:
- For tile_A and tile_B: 16, 32, 64, 128, 256 (where applicable)
- For tile_height and tile_width: 8, 16, 32, 64
For best results when comparing naive and tiled implementations:
-
Input Sizes: Use large input sizes to observe significant performance differences
- For 1D algorithms: Try array A of size 50,000-100,000 and array B of size 1,000-5,000
- For 2D algorithms: Try matrices of size 1000×1000 with kernels of size 5×5 to 20×20
-
Tile Sizes: Experiment with different tile sizes to find the optimal configuration
- For 1D algorithms: Try tile sizes between 16-256
- For 2D algorithms: Try tile sizes between 8×8 and 64×64
-
Multiple Runs: The comparison programs run multiple iterations to provide more accurate timing
- Tiling: Dividing the computation into smaller chunks that fit into cache, improving data locality and reducing cache misses.
- Cache Utilization: The tiled implementations are designed to maximize the reuse of data while it's in cache.
- Memory Access Patterns: Optimizing the order of operations to make memory accesses more sequential and predictable.
The key difference between convolution and cross-correlation is that in convolution, the kernel is flipped before the operation, while in cross-correlation, the kernel is used as is. In mathematical terms:
- Cross-Correlation: (f ⋆ g)(t) = ∫f(τ)g(t+τ)dτ
- Convolution: (f * g)(t) = ∫f(τ)g(t-τ)dτ
For symmetric kernels, the results of convolution and cross-correlation are identical.