Dependabot Information scraper for Github
This script scrapes and parses information regarding dependabot alerts for Github repositories belonging to an organization.
Primary data points parsed are open, fixed, dismissed vulnerabilities, and ecosystem (aka programming language) type of vulnerability.
This script provides flexible options for execution:
- from the Bash/Zsh command line and send summary results to a Slack channel
- from the Bash/Zsh command line and save detailed json/text/csv files locally
- as an AWS Lambda and send summary results to a Slack channel
NOTE: While this script works, it is far from ideal. Things to be aware of:
- The script should be refactored!
- The Repo class structure should be refactored for better organization.
- Consider vectorization via Pandas for improved performance and maintenance.
- Consider using pygithub module in lieu of direct API calls with urllib3.
- Bash or ZSH Shell
- A Github token with security_events scope to read private repositories is required.
- uv package manager to ensure urllib3 is installed
- Python 3 - This was developed and tested with Python 3.10. Likely to work with Python 3.6 and above.
- Optional: GitHub GH CLI - used to execute GH CLI commands
-
Set the following environment variables:
a. GH_TOKEN - Github API key
eg:export GH_TOKEN=ghp_XXXXXXXXX
Note: GH_TOKEN matches the environment variable used for GitHub GH CLI command, this enables GH CLI to be authenticated to Github. View login status with:gh auth status
b. GH_ORG - Github organization to query
eg:export GH_ORG=procurify
c. SLACK_URL - slack url for the slack webhook
eg:export SLACK_URL="https://hooks.slack.com/services/XXX"
Disregard SLACK_URL environment variable if local data is preferred, refer to Step 2. below. -
uv run ./dependabot_slack.py
alternatively, if sending to a Slack channel is not required, save the data locally:
uv run ./dependabot_slack.py local
- CSV output files are written to the current folder under ./data and org_data/ folder.
- JSON files for each repo is saved to the current folder under ./json_output, in the event manual review is needed. Note, this data can also be viewed via Github, within the reposistory, under the security tab, assuming appropriate permissions are granted.
- The aforementioned folders are in the .gitignore file to prevent potentially uploading sensitive information to github.
This script output provides SLO percentage numbers as defined by "the objectives your team must hit to meet that agreement".
Based on CVSSv3 Score, below is the SLO calculations the Slack output are based on.
CVSSv3 Score | Time to Redmediate (Days) |
---|---|
9.0-10.0 (Critical) | 15 |
7.-8.9 (High) | 30 |
4.0-6.9 (Medium) | 90 |
0.1-3.9 (Low) | 180 |
In the lambda/ folder, is a drop-in working copy of this script for AWS Lambda service.
-
The script is amended to enable global variables, due to the entry point of the lambda execution. In other words this will (should) just work.
-
The environment variables GH_TOKEN and SLACK_URL are moved to AWS Systems Manager Parameter Store, due to sensitivity. Note the code to retrieve these values from Parameter Store. Naturally, Parameter Store will require these key/values populated for the script to work.
-
GH_ORG remains as a environment variable.
-
Configure AWS EventBridge as required to trigger this lambda at a desired schedule.
-
Execution time will vary depending on number of repos associated with an organization. Recommend setting the lambda execution time limit to the maximum of 15 minutes to avoid potential timeout errors.
To configure a slack webhook refer to this page
-
Optimization considerations:
-
Slack output (presentation) is limited. Consequently, the output is less than ideal.
-
This script is maintained within a single file, for convenient deployment to AWS Lambda. While there are options to facilitate multiple files or utilize Lambda Layers, a single file deployment is most convenient.
-
Minimal use of external modules was intentional - again to minimize complexity for potential deployment to an AWS Lambda. Additionally, local testing is simplified with reduced third-party dependency.
-
Add Docstrings and type hints to the Repo Class, as well as to methods and functions.
-
Possibly refactor the script to modules. (too large of a script!)
List organization repos
List dependabot alerts
Working with Dependabot
Github Dependabot Blog
Released under the MIT License
Concerns/Questions, open an issue. Improvements, please submit a pull request.