fix: full project modernization, tests, python 3.10 compatibility #38
Open
zbowling wants to merge 1 commit into internetarchive:master
I hit a few bugs, and when I went to debug them I noticed this project doesn't use many modern Python features. I went nuts here. You don't have to accept this in its current form, because I know it's a lot, but I added significant tests, modernized the code, switched the argument parser, etc. This brings the code from 2012-era Python to 2025 Python.
67d3557 to 8668e22
…esting

This commit modernizes warctools for Python 3.10+ with comprehensive improvements to code quality, testing, and tooling:

Project Structure:
- Migrate to src/ layout for proper package structure
- Move hanzo package to src/hanzo/
- Add src/warctools/ for backward compatibility re-exports
- Update build system to uv_build backend

Code Modernization:
- Remove all __future__ imports (Python 3.10+ only)
- Add comprehensive type hints throughout codebase
- Migrate all CLI tools from optparse to click (100% argument compatible)
- Update f-string usage and modernize string formatting
- Remove unnecessary object inheritance (UP004)
- Fix all linting issues (ruff, mypy) systematically

Dependencies & Build:
- Increment version to 6.0.0
- Update requires-python to >=3.10
- Add click>=8.0.0 dependency
- Switch from setuptools to uv_build
- Add dev dependencies: pytest, ruff, mypy

Testing:
- Add comprehensive integration test suite (15 tests)
- Add CLI help tests for all tools
- Fix legacy unittest offset calculation bugs
- All 33 tests passing (integration + unit + CLI)

CI/CD:
- Add GitHub Actions workflow for automated testing
- Update Travis CI configuration for modern Python versions
- Run ruff format, ruff check, mypy, and pytest in CI

Bug Fixes:
- Fix gzip member offset tracking in GzipRecordStream
- Fix RecordStream offset calculation for accurate record positioning
- Fix exception handling and error messages
- Fix variable naming issues (B007, N806, E741)
- Fix import ordering and unused imports

Documentation:
- Add AGENTS.md for future AI agent guidance
- Document project layout, build process, and tool preferences

All tools tested and verified working on real-world WARC archives.
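For readers unfamiliar with the "gzip member offset tracking" item under Bug Fixes: compressed WARC files are commonly written as one gzip member per record, so a record's position is the compressed byte offset at which its member begins. The sketch below is only an illustration of that idea using the standard library. It is not the code in this PR or in `GzipRecordStream`; the helper name `iter_gzip_members` and its `chunk_size` parameter are hypothetical.

```python
import gzip
import io
import zlib


def iter_gzip_members(stream, chunk_size=4096):
    """Yield (offset, payload) for each member of a multi-member gzip stream.

    offset is the position in the *compressed* stream where the member starts.
    Hypothetical helper for illustration; not part of warctools.
    """
    offset = 0      # compressed offset of the member currently being read
    pending = b""   # compressed bytes already read but not yet consumed
    while True:
        if not pending:
            pending = stream.read(chunk_size)
            if not pending:
                return  # clean end of input between members
        # wbits = MAX_WBITS | 16 tells zlib to expect a gzip header and trailer
        decomp = zlib.decompressobj(zlib.MAX_WBITS | 16)
        consumed = 0
        parts = []
        while not decomp.eof:
            if not pending:
                pending = stream.read(chunk_size)
                if not pending:
                    raise EOFError("truncated gzip member")
            parts.append(decomp.decompress(pending))
            # unused_data holds the bytes that follow the end of this member
            consumed += len(pending) - len(decomp.unused_data)
            pending = decomp.unused_data
        yield offset, b"".join(parts)
        offset += consumed


if __name__ == "__main__":
    first = gzip.compress(b"record one")
    second = gzip.compress(b"record two")
    for member_offset, payload in iter_gzip_members(io.BytesIO(first + second)):
        print(member_offset, payload)
```

The key detail is that bytes read past the end of a member must be carried over to the next one rather than counted toward the current member; getting that bookkeeping wrong is one way offsets can drift.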
8668e22 to 5506102