Add alert monitoring #2
Open
AsharMoin wants to merge 1 commit into
- Created alert rule engine for creating/managing alert rules
- Added background monitoring thread that checks GCP metrics using our tool functions
- Added alert creation capabilities to GenAI bot
- Added alert display in main conversation loop
- Supports VM CPU utilization only (for now)
Overview
Adds proactive alert monitoring to the GCP Monitoring Bot, allowing users to set up background alerts that automatically trigger when GCP resource thresholds are exceeded, eliminating the need for manual monitoring queries.
Key Features
Background thread-based monitoring system with configurable check intervals
Alert rule management: create, list, and delete custom monitoring rules
Real-time metrics from GCP using preexisting functions (VM CPU utilization)
Alert display integrated into conversation flow
Alerts fire once then auto-delete
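The features above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `AlertRule`, `AlertMonitor`, `check_once`, and the `get_metric_value` callback are hypothetical names, and the 60-second default interval is only inferred from the usage example.

```python
import threading
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class AlertRule:
    # Hypothetical rule shape; field names mirror the create_alert_rule arguments.
    name: str
    resource_type: str
    metric: str
    threshold: float


class AlertMonitor:
    """Hypothetical monitor that checks rules on an interval in a daemon thread."""

    def __init__(self, get_metric_value: Callable[[AlertRule], float],
                 check_interval: float = 60.0) -> None:
        self._get_metric_value = get_metric_value  # assumed metric lookup callback
        self._check_interval = check_interval
        self._rules: Dict[str, AlertRule] = {}
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread: Optional[threading.Thread] = None
        self.triggered: List[dict] = []

    def add_rule(self, rule: AlertRule) -> None:
        with self._lock:
            self._rules[rule.name] = rule

    def rule_names(self) -> List[str]:
        with self._lock:
            return sorted(self._rules)

    def check_once(self) -> None:
        # Snapshot the rules so the metric lookup runs outside the lock.
        with self._lock:
            rules = list(self._rules.values())
        for rule in rules:
            value = self._get_metric_value(rule)
            if value > rule.threshold:
                with self._lock:
                    # Fire once, then auto-delete the rule.
                    self._rules.pop(rule.name, None)
                    self.triggered.append(
                        {"rule": rule.name, "current": value,
                         "threshold": rule.threshold}
                    )

    def _run(self) -> None:
        # Event.wait returns False on timeout, so this loops once per interval
        # until stop() sets the event.
        while not self._stop.wait(self._check_interval):
            self.check_once()

    def start(self) -> None:
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
```

Waiting on a `threading.Event` instead of calling `time.sleep` lets `stop()` interrupt the wait immediately, which keeps cleanup on exit fast.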
Technical Implementation
Usage Examples
User :> Create an alert rule with resource_type "vm", metric "cpu_utilization", threshold 80, and name it "cpu_alert"
Bot :> Alert 'cpu_alert' created!
[60 seconds later, if CPU exceeds threshold]
🚨 ALERT TRIGGERED 🚨
Rule: cpu_alert
Current: 90.0 | Threshold: 80
Time: 14:40:29
User :>
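For reference, the console notification shown above could be produced by a small formatter like the one below. The function name `format_alert` and its parameters are assumptions made for illustration; the actual notification handler in this PR may be structured differently.

```python
from datetime import datetime


def format_alert(rule_name: str, current: float, threshold: float,
                 when: datetime) -> str:
    """Hypothetical formatter that builds the console alert text."""
    return "\n".join([
        "🚨 ALERT TRIGGERED 🚨",
        f"Rule: {rule_name}",
        # Current value is shown with one decimal place, as in the example.
        f"Current: {current:.1f} | Threshold: {threshold}",
        f"Time: {when:%H:%M:%S}",
    ])


if __name__ == "__main__":
    print(format_alert("cpu_alert", 90.0, 80, datetime.now()))
```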
Supported Alert Types
VM CPU Monitoring: create_alert_rule("cpu_alert", "vm", "cpu_utilization", 80)
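A minimal sketch of what rule management behind this call might look like, assuming an in-memory store. Only the `create_alert_rule` call shape comes from this PR; `list_alert_rules`, `delete_alert_rule`, the `SUPPORTED` set, and the validation step are hypothetical additions for illustration.

```python
from typing import Dict, List

# Assumption: only the VM CPU utilization combination is supported for now.
SUPPORTED = {("vm", "cpu_utilization")}

# Hypothetical in-memory rule store keyed by rule name.
_rules: Dict[str, dict] = {}


def create_alert_rule(name: str, resource_type: str, metric: str,
                      threshold: float) -> str:
    """Store a rule and return the confirmation text shown to the user."""
    if (resource_type, metric) not in SUPPORTED:
        raise ValueError(f"Unsupported alert type: {resource_type}/{metric}")
    _rules[name] = {
        "name": name,
        "resource_type": resource_type,
        "metric": metric,
        "threshold": threshold,
    }
    return f"Alert '{name}' created!"


def list_alert_rules() -> List[dict]:
    """Return all currently active rules."""
    return list(_rules.values())


def delete_alert_rule(name: str) -> bool:
    """Delete a rule by name; return False if it did not exist."""
    return _rules.pop(name, None) is not None
```

Rejecting unsupported combinations up front is one way to address the minimal validation noted under Technical Debt, since an unknown metric would otherwise only fail once the background check runs.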
Testing
Manual testing successful with mock cpu_utilization
Alert creation via GenAI bot commands: Working
Background monitoring thread: Working
Alert triggering and notification: Working (for the demo, some logic had to be replaced with hardcoded values)
Alert display in conversation flow: Working
Alert rule persistence and cleanup: Working
Current Limitations
Demo-focused implementation: Simplified for presentation purposes, not production-ready
Limited metric types: Only VM CPU usage supported
Single VM monitoring: Checks only first VM in zone, not all instances
Bot requires very specific prompting: Without a clear, explicit prompt, the bot will not be able to create the alert
Console-only notifications: No email, Slack, or webhook integrations
Technical Debt
Hardcoded test CPU values in _get_metric_value() for reliable demo
No configuration file for alert settings or check intervals
Alert rule validation is minimal (trusts user input)
Breaking Change Risk
NONE - This is purely additive functionality:
All existing bot commands and functionality preserved
New alert tools added to existing tool set
Background monitoring runs independently of main conversation loop
No changes to existing GCP monitoring functions
Files Added:
core/alerts/rule_engine.py
core/alerts/alert_storage.py
core/alerts/alert_scheduler.py
core/alerts/notification_handler.py
core/alerts/__init__.py
Files Modified:
core/bot.py - Added alert tools to GenAI bot capabilities
main.py - Auto-start monitoring and alert cleanup on exit