A utility for downloading and analyzing cost data from files stored in S3. The tool handles data extraction, parsing, and aggregation, with special support for recovering deleted files via S3 versioning.
- Modular design separating downloading and analysis
- Efficient caching of downloaded files to avoid redundant downloads
- Support for recovering deleted files via S3 versioning (see the sketch below)
- Extensible analysis framework for custom cost metrics
- Calculation of idle vs. non-idle cost ratios
- Hourly cost averaging
- Command-line interface for easy usage
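The deleted-file recovery builds on standard S3 versioning behaviour: in a version-enabled bucket, a delete only adds a delete marker, and earlier object versions remain retrievable by `VersionId`. The following is a minimal boto3 sketch of that mechanism; the profile, bucket, and key names are placeholders, and it illustrates the underlying S3 API rather than this tool's internal code.

```python
import boto3

# Placeholder profile/bucket/key -- adjust to your environment.
session = boto3.Session(profile_name="myprofile")
s3 = session.client("s3")
bucket, key = "my-cost-bucket", "reports/2024/costs.json"

# In a versioned bucket, deleting an object only adds a delete marker;
# the previous versions are still listed here.
response = s3.list_object_versions(Bucket=bucket, Prefix=key)
versions = [v for v in response.get("Versions", []) if v["Key"] == key]

if versions:
    # Fetch the most recent surviving version by its VersionId.
    latest = max(versions, key=lambda v: v["LastModified"])
    obj = s3.get_object(Bucket=bucket, Key=key, VersionId=latest["VersionId"])
    print(f"Recovered {len(obj['Body'].read())} bytes "
          f"(version {latest['VersionId']})")
```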
Clone the repository:

```bash
git clone https://github.com/yourusername/s3_cost_analyzer.git
cd s3_cost_analyzer
```

Install the package:

```bash
pip install -e .
```

The package provides a command-line interface for common operations.

Downloading files:

```bash
s3-cost-analyze download /path/to/query_file.json --aws-profile myprofile
```

Analyzing costs:

```bash
s3-cost-analyze analyze /path/to/query_file.json --aws-profile myprofile
```

Managing configuration:

```bash
s3-cost-analyze config --show
s3-cost-analyze config --set aws_profile myprofile
```

You can create custom analysis scripts by importing the package:
```python
from s3_cost_analyzer.analyzer import S3CostAnalyzer

# Initialize analyzer
analyzer = S3CostAnalyzer(aws_profile='myprofile')

# Download and analyze files
results = analyzer.analyze_query('/path/to/query_file.json')

# Process results
print(f"Total cost: ${results['total_cost']:.2f}")
```

To run the specialized idle cost analysis:

```bash
python -m analysis.idle_cost_analysis /path/to/query_file.json
```

This will:
- Download all the required files (if not already cached)
- Extract cost data from each file
- Calculate idle vs. non-idle cost ratios (see the sketch after this list)
- Generate a detailed report
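The ratio itself reduces to idle spend over total spend. A rough sketch of that aggregation follows, assuming each extracted record carries `idle_cost` and `total_cost` fields; the field names are illustrative, not the tool's actual schema.

```python
def idle_cost_ratio(records):
    """Return the fraction of total spend attributed to idle resources.

    Assumes each record is a dict with 'idle_cost' and 'total_cost' keys;
    the real field names depend on the query output format.
    """
    idle = sum(r["idle_cost"] for r in records)
    total = sum(r["total_cost"] for r in records)
    return idle / total if total else 0.0


# Example: 40% of the spend in these two records is idle.
print(idle_cost_ratio([
    {"idle_cost": 2.0, "total_cost": 5.0},
    {"idle_cost": 2.0, "total_cost": 5.0},
]))
```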
The package is organized into several modules:
- `downloader.py`: Handles downloading and caching of S3 files
- `processor.py`: Extracts and aggregates cost data from files
- `analyzer.py`: Provides the main interface for analysis
- `config.py`: Manages configuration settings
- `cli.py`: Implements the command-line interface
Custom analysis scripts can be created in the analysis/ directory to extend the functionality.
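A custom script in `analysis/` can follow the same pattern as the API example above. The sketch below estimates an average hourly cost; the `hourly_costs` field is an assumed name used for illustration, not a documented part of the results returned by `analyze_query()`.

```python
# analysis/average_hourly_cost.py -- illustrative custom analysis script.
import sys

from s3_cost_analyzer.analyzer import S3CostAnalyzer


def main(query_file):
    analyzer = S3CostAnalyzer(aws_profile="myprofile")
    results = analyzer.analyze_query(query_file)

    # 'hourly_costs' is an assumed field; adapt it to the structure
    # actually returned by analyze_query().
    hourly = results.get("hourly_costs", [])
    if hourly:
        print(f"Average hourly cost: ${sum(hourly) / len(hourly):.2f}")
    print(f"Total cost: ${results['total_cost']:.2f}")


if __name__ == "__main__":
    main(sys.argv[1])
```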
Configuration is managed through a JSON file (default: ~/.s3_cost_analyzer.json). Available settings:
- `aws_profile`: AWS profile name to use for authentication
- `data_dir`: Directory to store downloaded data
- `output_dir`: Directory to store analysis results
- `log_level`: Logging level (INFO, DEBUG, etc.)
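For reference, a configuration file using these settings might look like the following; all values are placeholders.

```json
{
  "aws_profile": "myprofile",
  "data_dir": "~/s3_cost_data",
  "output_dir": "~/s3_cost_results",
  "log_level": "INFO"
}
```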
Requirements:

- Python 3.8+
- boto3
- requests
License: MIT