Releases: davidesantangelo/krep
v2.3.0
What's Changed
Full Changelog: v2.2.0...v2.3.0
Performance
- Line-aligned chunk splitting for `-c` mode: worker threads now split files on newline boundaries when counting matching lines, ensuring each line belongs to exactly one worker. This eliminates inaccurate counts at chunk boundaries and avoids rescanning overlap bytes.
- Faster short-pattern line counting: in `-c` (count lines) mode with short literal patterns, the scalar `memchr`-based search is now preferred over the SIMD path for lower setup overhead per matching line.
- New helper `advance_to_next_line_boundary()` for efficient newline-aligned offset advancing.
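The newline-aligned splitting described above can be sketched roughly as follows. This is an illustration only: `next_line_boundary()` and `split_on_lines()` are hypothetical names, and the real signature of `advance_to_next_line_boundary()` is not shown in these notes. The caller is assumed to supply `starts` with room for `max_chunks + 1` offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: offset of the first byte after the next '\n' found
 * at or after `pos`, or `len` if no newline remains. */
static size_t next_line_boundary(const char *buf, size_t len, size_t pos)
{
    assert(buf != NULL);
    if (pos >= len)
        return len;
    const char *nl = memchr(buf + pos, '\n', len - pos);
    return nl ? (size_t)(nl - buf) + 1 : len;
}

/* Hypothetical splitter: cuts [0, len) into at most `max_chunks` ranges that
 * each begin at the start of a line. Chunk i is starts[i]..starts[i + 1].
 * Returns the number of chunks produced. */
static size_t split_on_lines(const char *buf, size_t len, size_t max_chunks,
                             size_t *starts)
{
    assert(max_chunks > 0);
    size_t target = len / max_chunks;      /* nominal chunk size */
    if (target == 0)
        target = 1;

    size_t n = 0;
    size_t pos = 0;
    while (pos < len) {
        starts[n++] = pos;
        if (n == max_chunks)
            break;                         /* last chunk takes the remainder */
        size_t want = pos + target;
        /* Search from want - 1 so a cut that already follows a newline
         * is kept as it is. */
        pos = (want >= len) ? len : next_line_boundary(buf, len, want - 1);
    }
    starts[n] = len;                       /* end offset of the last chunk */
    return n;
}
```

Because every chunk begins right after a newline, no line spans two workers, so per-chunk line counts can simply be added together.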
Build
- Removed `-ffast-math` from default `CFLAGS` for improved floating-point standards compliance.
CI
- Added toolchain info step (`cc --version`, `make --version`) to CI builds.
- Added benchmark smoke test on `ubuntu-latest`: downloads a sample subtitle corpus and runs `benchmark_krep_vs_rg.sh` with `RUNS=1`.
- Installs `ripgrep` on Ubuntu runners when not already present.
Tests
- New `test_multithread_only_matching_consistency` test: verifies that single-threaded and multi-threaded `-o` (only-matching) output produce identical match counts on a large generated file.
Docs
- Generalized UI/version wording in README.
v2.2.0
Highlights
- UI/UX refresh for terminal output.
- New polished color palette for filename, separators, matches, and text.
- Improved `-o` output readability with styled line index.
- Redesigned `--help` with grouped sections for faster scanning.
Versioning
- Project version bumped to `2.2.0` in code, docs, and build metadata.
Validation
- Local CI passed with `make ci`.
- GitHub Actions passed on tag `v2.2.0`:
  - `CI` workflow: success
  - `Release` workflow: success
v2.1.0
What's New in v2.1.0
🔧 New Features
- Stdin pattern input (`-f -`): Read patterns from stdin for seamless pipeline integration. Example: `echo 'pattern' | krep -f - target.txt` (#33)
- Gitignore support (`--gitignore`): Respect `.gitignore` files when searching recursively with `-r`. Supports glob patterns, directory-only rules, and negation (`!`) patterns (#11)
- Algorithm selection (`--algo`): Override the automatic search algorithm selection. Choose between `auto` (default), `bm` (Boyer-Moore-Horspool), or `kmp` (Knuth-Morris-Pratt). Aho-Corasick is already built-in and auto-selected for multi-pattern searches (#12)
- Automated release binaries: Platform binaries (Linux x86_64, macOS arm64, macOS x86_64) are now automatically built and attached to GitHub releases (#5)
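For readers unfamiliar with the `bm` option, a textbook Boyer-Moore-Horspool search looks roughly like the sketch below. This is not krep's implementation; `bmh_search()` is a hypothetical name, and the function only illustrates the bad-character shift table the algorithm relies on.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first occurrence of pat[0..m) in text[0..n),
 * or (size_t)-1 if there is none (an empty pattern is treated as no match). */
static size_t bmh_search(const unsigned char *text, size_t n,
                         const unsigned char *pat, size_t m)
{
    assert(text != NULL && pat != NULL);
    if (m == 0 || m > n)
        return (size_t)-1;

    /* Bad-character table: how far the window may slide when its last
     * byte is c. Bytes absent from the pattern allow a full-length shift. */
    size_t shift[UCHAR_MAX + 1];
    for (size_t c = 0; c <= UCHAR_MAX; c++)
        shift[c] = m;
    for (size_t i = 0; i + 1 < m; i++)
        shift[pat[i]] = m - 1 - i;

    size_t pos = 0;
    while (pos <= n - m) {
        unsigned char last = text[pos + m - 1];
        /* Compare the last byte first, then the rest of the window. */
        if (last == pat[m - 1] && memcmp(text + pos, pat, m - 1) == 0)
            return pos;
        pos += shift[last];
    }
    return (size_t)-1;
}
```

The shift table is what lets the search skip several bytes at a time on a mismatch, which is why this family of algorithms tends to do well on longer literal patterns.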
📦 Platform Binaries
Binaries will be attached to this release once the CI workflow completes:
- `krep-linux-x86_64.tar.gz`
- `krep-macos-arm64.tar.gz`
- `krep-macos-x86_64.tar.gz`
✅ Testing
All 171 tests pass (161 unit tests + 10 directory integration tests).
Full Changelog: v2.0.0...v2.1.0
v2.0.0
krep v2.0.0
Highlights
- Major performance improvements in `search_file` threading path (single-thread fast path + lower thread-pool overhead).
- Added reproducible `krep` vs `ripgrep` benchmark script: `test/benchmark_krep_vs_rg.sh`.
- Expanded test coverage with real multithread consistency tests and recursive directory integration tests.
- CI hardened: build + unit tests + directory integration tests on Ubuntu and macOS.
- Recursive skip fix for minified assets (`.min.*`).
Dataset Benchmark Command
curl -LO 'https://burntsushi.net/stuff/subtitles2016-sample.en.gz'
gzip -dk subtitles2016-sample.en.gz
make bench-rg
v1.5.0
Highlights
- Speed up -c line counting by skipping counted lines in scalar and SIMD search paths.
- Fix case-insensitive single-byte scanning and anchored regex line starts.
- Add regression tests for multi-match line counting and single-char case-insensitive searches.
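The "skip counted lines" idea can be illustrated with a scalar sketch: once a line is known to match, the scan jumps straight to the next newline instead of looking for further matches on the same line. `count_matching_lines()` is a hypothetical name, the pattern is assumed to contain no newline, and krep's actual scalar and SIMD paths are more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts lines in buf[0..len) that contain pat[0..plen) at least once.
 * Assumes the pattern itself contains no '\n'. */
static size_t count_matching_lines(const char *buf, size_t len,
                                   const char *pat, size_t plen)
{
    assert(buf != NULL && pat != NULL);
    if (plen == 0 || plen > len)
        return 0;

    size_t count = 0;
    size_t pos = 0;
    while (pos + plen <= len) {
        /* Only positions where a full pattern still fits are candidates. */
        const char *hit = memchr(buf + pos, pat[0], len - pos - plen + 1);
        if (!hit)
            break;
        size_t at = (size_t)(hit - buf);
        if (memcmp(hit, pat, plen) != 0) {
            pos = at + 1;                  /* false candidate, keep scanning */
            continue;
        }
        count++;
        /* The line is counted: skip everything up to its newline. */
        const char *nl = memchr(hit, '\n', len - at);
        if (!nl)
            break;
        pos = (size_t)(nl - buf) + 1;
    }
    return count;
}
```

A line with many matches therefore costs one pattern comparison plus one `memchr` for the newline, rather than one comparison per match.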
Tests
- make test
v1.4.2
This release brings significant performance improvements, expanded SIMD support, and better cross-platform compatibility.
New Features
AVX-512 SIMD Support
- Added ultra-high-performance AVX-512 search for patterns up to 64 bytes
- Automatic detection and utilization of AVX-512 instructions on supported CPUs
- Graceful fallback to AVX2/SSE4.2 on older hardware
Enhanced Memory Performance
- Added prefetching (__builtin_prefetch) in search functions for better cache utilization
- Reduced MIN_CHUNK_SIZE to 2MB for improved parallelism on multi-core systems
- Added compiler optimization hints (LIKELY/UNLIKELY, HOT_FUNCTION)
Thread Pool Improvements
- Adaptive mutex using PTHREAD_MUTEX_ADAPTIVE_NP where available
- Reduced thread stack size to 256KB for lower memory overhead
- Added batch task submission for improved efficiency
- Smarter thread count selection (cores - 1 for system headroom)
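The "cores - 1" rule could be expressed along these lines. Both function names are hypothetical, and the clamp to a minimum of one thread is an assumption not stated in the release note.

```c
#include <assert.h>
#include <unistd.h>

/* Leaves one core free for the rest of the system, but never returns
 * fewer than one worker (also covers sysconf() failing with -1). */
static int pick_thread_count(long online_cores)
{
    if (online_cores <= 1)
        return 1;
    return (int)(online_cores - 1);
}

/* Queries the number of online cores; _SC_NPROCESSORS_ONLN is available
 * on Linux and macOS but is not required by POSIX. */
static int default_thread_count(void)
{
    int n = pick_thread_count(sysconf(_SC_NPROCESSORS_ONLN));
    assert(n >= 1);
    return n;
}
```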
v1.4.1
What's Changed
- Fix heap buffer overflow in `memchr_short_search` function by @Bleem-Fuzzer in #27
- Fix NULL Pointer Dereference in strcmp by @Bleem-Fuzzer in #29
- Fix heap buffer overflow in regexec by @Bleem-Fuzzer in #31
New Contributors
- @Bleem-Fuzzer made their first contribution in #27
Full Changelog: v1.4.0...v1.4.1
v1.4.0
Added
- NEON SIMD Support: Implemented a fully optimized NEON SIMD search algorithm for ARM64 architectures (e.g., Apple Silicon). This significantly reduces CPU usage and improves search speed for patterns of any length.
- Small File Optimization: Added a specialized path for small files (< 64KB) using `read()` instead of `mmap()`, reducing system call overhead and page faults.
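A minimal sketch of the small-file path, assuming a 64KB threshold as stated above. The `file_buf_t` type and both function names are hypothetical, and error handling is reduced to a bare success/failure code.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define SMALL_FILE_LIMIT (64 * 1024)   /* threshold taken from the note above */

typedef struct {
    char  *data;
    size_t len;
    int    mapped;   /* 1: data came from mmap(); 0: from malloc() + read() */
} file_buf_t;

/* Loads the whole file behind fd. Returns 0 on success, -1 on failure. */
static int file_buf_load(int fd, file_buf_t *out)
{
    struct stat st;
    assert(out != NULL);
    if (fstat(fd, &st) != 0 || st.st_size < 0)
        return -1;
    out->data = NULL;
    out->len = (size_t)st.st_size;
    out->mapped = 0;
    if (out->len == 0)
        return 0;

    if (out->len < SMALL_FILE_LIMIT) {
        /* Small file: plain reads into a heap buffer, no mapping set up. */
        out->data = malloc(out->len);
        if (!out->data)
            return -1;
        size_t got = 0;
        while (got < out->len) {
            ssize_t r = read(fd, out->data + got, out->len - got);
            if (r <= 0) {                 /* error or unexpected EOF */
                free(out->data);
                out->data = NULL;
                return -1;
            }
            got += (size_t)r;
        }
        return 0;
    }

    /* Larger file: map it read-only and let the kernel page it in. */
    void *p = mmap(NULL, out->len, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED)
        return -1;
    out->data = p;
    out->mapped = 1;
    return 0;
}

static void file_buf_release(file_buf_t *fb)
{
    if (fb->mapped)
        munmap(fb->data, fb->len);
    else
        free(fb->data);
    fb->data = NULL;
}
```

The sketch reads from the descriptor's current offset, so it expects a freshly opened file.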
Changed
- Thread Optimization: Added padding to thread data structures to prevent false sharing on multicore systems, improving parallel scaling.
- Performance: General CPU usage reduction across all search modes.
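The padding mentioned under Thread Optimization usually takes a form like the sketch below. The 64-byte cache line size and the `worker_slot_t` layout are assumptions for illustration; krep's actual thread structures are not shown in these notes.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64   /* assumed cache line size; typical on x86_64 */

/* Hypothetical per-worker slot. The padding stretches each slot to a full
 * cache line, so counters of neighbouring workers in an array are 64 bytes
 * apart and never land on the same line. */
typedef struct {
    size_t match_count;
    char   pad[CACHE_LINE - sizeof(size_t)];
} worker_slot_t;

static_assert(sizeof(worker_slot_t) == CACHE_LINE,
              "worker_slot_t must fill exactly one cache line");
```

Without the padding, two workers incrementing adjacent counters would keep invalidating each other's cache line even though they never touch the same variable.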
Fixed
- Double Counting Bug: Fixed a logic error in the NEON search optimization that caused some matches to be counted twice when using the `-c` (count) option.
v1.3.0
Performance Improvements
- Pre-selected search algorithm: Search algorithm is now selected once before thread execution rather than redundantly in each worker thread, reducing overhead in multi-threaded searches
- Sequential file access optimization: Added `posix_fadvise` with `POSIX_FADV_SEQUENTIAL` hint to encourage kernel readahead on supported platforms (Linux), improving I/O performance for large file searches
- Conditional sorting optimization: Match results are only sorted when there are 2 or more matches, avoiding unnecessary `qsort` calls for single-match cases
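The sequential-access hint can be sketched as a small wrapper that compiles to a no-op where `posix_fadvise` is unavailable. `hint_sequential()` is a hypothetical name, and the warning on failure is an illustrative choice rather than krep's behavior.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Asks the kernel to expect sequential reads on fd so it can read ahead
 * more aggressively. Returns 0 on success or an errno-style code. */
static int hint_sequential(int fd)
{
    assert(fd >= 0);
#if defined(POSIX_FADV_SEQUENTIAL)
    /* offset 0 and length 0 mean "the whole file". */
    int rc = posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    if (rc != 0)
        fprintf(stderr, "posix_fadvise: %s\n", strerror(rc));
    return rc;
#else
    (void)fd;       /* platform without posix_fadvise: nothing to do */
    return 0;
#endif
}
```

Note that `posix_fadvise` returns the error code directly rather than setting `errno`, which is why the result is passed to `strerror()` as is.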
Memory & Resource Management
- Improved match result merging: Introduced `match_result_merge_limited()` function to efficiently handle max count limits when merging thread results, replacing the previous item-by-item loop with optimized batch operations
- Stdout buffer initialization fix: Stdout buffer is now initialized only once using a static flag, preventing redundant `setvbuf` calls
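A limited batch merge might look like the following sketch. The `match_pos_t` and `match_list_t` types and the function name `match_list_merge_limited()` are hypothetical stand-ins; the signature of the real `match_result_merge_limited()` is not given in these notes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct { size_t start, end; } match_pos_t;   /* hypothetical */

typedef struct {
    match_pos_t *items;
    size_t       count;
    size_t       cap;
} match_list_t;                                       /* hypothetical */

/* Appends items from src to dst without letting dst grow beyond `limit`
 * entries. The copy is a single memcpy rather than a per-item loop.
 * Returns 0 on success, -1 if memory allocation fails. */
static int match_list_merge_limited(match_list_t *dst, const match_list_t *src,
                                    size_t limit)
{
    assert(dst != NULL && src != NULL);
    if (dst->count >= limit)
        return 0;                          /* limit already reached */

    size_t room = limit - dst->count;
    size_t n = src->count < room ? src->count : room;
    if (n == 0)
        return 0;

    if (dst->count + n > dst->cap) {
        size_t new_cap = dst->count + n;
        match_pos_t *p = realloc(dst->items, new_cap * sizeof *p);
        if (!p)
            return -1;
        dst->items = p;
        dst->cap = new_cap;
    }
    memcpy(dst->items + dst->count, src->items, n * sizeof *src->items);
    dst->count += n;
    return 0;
}
```

Passing `SIZE_MAX` as `limit` would behave as "no limit" in this sketch.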
User Experience
- Reduced warning noise: `madvise` warnings are now emitted only once per execution using an atomic flag, with subsequent warnings suppressed to avoid cluttering output when processing multiple files
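One common way to get "warn once" behavior across threads is a C11 `atomic_flag`, as in this sketch. `warn_once()` is a hypothetical name, and whether krep uses `atomic_flag` specifically or another atomic type is not stated here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

static atomic_flag warned = ATOMIC_FLAG_INIT;

/* Prints the warning the first time it is called from any thread and
 * stays silent afterwards. Returns true if this call printed. */
static bool warn_once(const char *msg)
{
    assert(msg != NULL);
    /* test_and_set returns the previous value: true means some earlier
     * call already claimed the right to print. */
    if (atomic_flag_test_and_set(&warned))
        return false;
    fprintf(stderr, "warning: %s\n", msg);
    return true;
}
```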
Code Quality
- Type definitions reorganization: Moved `search_func_t` typedef to appear before `thread_data_t` struct in header file for better logical ordering
- Removed binary artifact: Deleted `test_krep` binary from version control
Bug Fixes
- Thread result merging refactored: Simplified and more robust logic for merging thread-local match results with proper handling of max count limits and error conditions
v1.2.1
This commit introduces significant improvements to the regex_search function for more robust cursor advancement and corrects command-line argument parsing in main. Tests have been updated accordingly.
krep.c:
- `regex_search` function:
  - Enhanced Cursor Advancement: Refactored the logic for advancing the search cursor (`cur`) after a match or non-match. This aims to prevent infinite loops, especially with zero-length matches (e.g., `^$`, `a*`) or when `regexec` reports unusual offsets (`pmatch[0].rm_eo < pmatch[0].rm_so`). The new logic ensures `cur` always progresses by at least one character if a zero-length match occurs or if the current position needs to be skipped (e.g., failed whole-word match).
  - Removed a redundant loop termination condition (`if (rem == 0 && cur != text_start) break;`).
  - Simplified the `max_count` check.
  - Improved handling when `regexec` indicates an invalid match range (`eo < so`) by advancing past the problematic point.
  - Ensured consistent advancement logic when a `whole_word` check fails.
- `main` function (Argument Parsing):
  - Corrected `-s` (String Mode) Handling: Fixed the parsing of arguments for the `-s` option. The argument immediately following `-s` is now correctly treated as the PATTERN, and the subsequent non-option argument is taken as the STRING_TO_SEARCH.
  - Improved Pattern and Target Logic: Refined the logic for identifying the primary pattern when not provided by `-e`, `-f`, or `-s`.
  - Robust Target Argument Identification: Enhanced the determination of the `target_arg` (file, directory, or string to search), especially when dealing with stdin or missing file/directory arguments.
  - Updated error messages for missing patterns or target strings to be more specific.
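The cursor-advancement rule for `regex_search` can be illustrated with a simplified POSIX `regexec` loop. This sketch is not the krep function: `count_regex_matches()` is a hypothetical name, it works on a NUL-terminated string, and it omits `max_count` and whole-word handling.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Counts successive matches of a compiled regex in a NUL-terminated text,
 * guaranteeing that the cursor moves forward on every iteration. */
static size_t count_regex_matches(const regex_t *re, const char *text)
{
    assert(re != NULL && text != NULL);
    size_t count = 0;
    const char *cur = text;
    regmatch_t pm[1];

    for (;;) {
        /* After the first search, cur is only a real line start if the
         * previous byte was a newline. */
        int flags = (cur == text || cur[-1] == '\n') ? 0 : REG_NOTBOL;
        if (regexec(re, cur, 1, pm, flags) != 0)
            break;

        if (pm[0].rm_eo < pm[0].rm_so) {   /* invalid range: step past it */
            if (*cur == '\0')
                break;
            cur++;
            continue;
        }

        count++;
        if (pm[0].rm_eo > pm[0].rm_so) {
            cur += pm[0].rm_eo;            /* resume after the match */
        } else {
            /* Zero-length match: move one byte past it, or stop at the
             * end of the text, so the loop cannot spin in place. */
            if (cur[pm[0].rm_so] == '\0')
                break;
            cur += pm[0].rm_so + 1;
        }
    }
    return count;
}
```

With a pattern such as `x*`, every position yields an empty match. Without the forced one-byte step, the loop would report the same empty match forever.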
test/test_regex.c:
- `TEST_ASSERT` Macro:
  - Modified the output format to `✓ PASS: message` or `✗ FAIL: message`.
  - Removed the printing of file and line numbers for failed assertions to simplify test output.
- Test Case Cleanup: Removed numerous verbose comments and explanations from individual test functions, aiming for conciseness.
- `test_regex_vs_literal_performance`:
  - Updated the initialization of `bm_params` to be more explicit and ensure proper memory management (allocation and cleanup).
  - Slightly adjusted the generation of `large_text` for the performance benchmark.
  - Switched to using `regex_search_compat` for the regex part of the performance test.
- Minor cleanups in other test functions, such as removing redundant comments.
These changes enhance the reliability of krep's search functionality and its command-line interface.