-
Notifications
You must be signed in to change notification settings - Fork 4
Azure monitoring alerts and logging improvements #88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
- fix: dependabot build fails (no secrets) - remove hard-coded location etc. - add alerts (prod)
WalkthroughThis PR introduces Azure monitoring and alerting infrastructure via new Bicep modules, updates deployment and cleanup workflows to support automated alert rule deployment with tag-based resource identification, adds comprehensive logging and alert documentation, and configures structured logging for Azure Functions runtime. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Suggested labels
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
🚀 Deployed to Azure FunctionsEnvironment: This environment will be automatically cleaned up when the PR is closed. |
- logs not appearing in App Insights
🚀 Deployed to Azure FunctionsEnvironment: This environment will be automatically cleaned up when the PR is closed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (8)
.github/workflows/dotnet.yml (1)
49-60: Consider duplicate comment prevention.The step will run on every Dependabot PR workflow execution. If the PR has multiple pushes or re-runs, this could create duplicate comments.
🔎 Optional: Add duplicate comment check
- name: Comment on Dependabot PR if: github.actor == 'dependabot[bot]' && github.event_name == 'pull_request' uses: actions/github-script@v7 with: github-token: ${{ secrets.GITHUB_TOKEN }} script: | + // Check for existing comment to avoid duplicates + const comments = await github.rest.issues.listComments({ + issue_number: context.issue.number, + owner: context.repo.owner, + repo: context.repo.repo + }); + const hasComment = comments.data.some(c => + c.user.login === 'github-actions[bot]' && + c.body.includes('Build Successful (Dependabot)') + ); + if (hasComment) return; + github.rest.issues.createComment({ issue_number: context.issue.number, owner: context.repo.owner, repo: context.repo.repo, body: `### ✅ Build Successful (Dependabot)\n\n**Note:** Integration tests requiring secrets were skipped for security.\n\n- ✅ Build completed\n- ✅ Unit tests passed\n- ⏭️ Integration tests skipped (no secrets)\n- ⏭️ Azure deployment skipped\n\nOnce merged, full integration tests and deployment will run automatically.` })docs/AZURE-LOGGING-QUERIES.md (2)
26-28: Add language specifier to fenced code blocks.Per static analysis (markdownlint MD040), these navigation path code blocks should have a language specifier. Using
textis appropriate for plain text content.🔎 Suggested fix
-``` +```text Function App → Monitoring → Log streamApply similar fix to lines 34 and 262. </details> Also applies to: 34-36, 262-264 --- `279-281`: **Consider updating deprecated instrumentation key reference.** `APPINSIGHTS_INSTRUMENTATIONKEY` is being deprecated in favor of `APPLICATIONINSIGHTS_CONNECTION_STRING`. Consider updating the troubleshooting guidance to mention both, with the connection string as the primary reference. <details> <summary>🔎 Suggested update</summary> ```diff **Logs not appearing:** -- Check Application Insights connection string is configured -- Verify `APPINSIGHTS_INSTRUMENTATIONKEY` in function app settings +- Check `APPLICATIONINSIGHTS_CONNECTION_STRING` is configured in function app settings +- (Legacy) Verify `APPINSIGHTS_INSTRUMENTATIONKEY` if using older setup - Wait 2-3 minutes for logs to appear in Application Insights.azure/function-app.bicep (1)
84-90: Consider using managed identity for storage access.The storage connection strings use account keys, which is the standard pattern. When feasible, consider migrating to managed identity-based access (
AzureWebJobsStorage__accountName) to eliminate key rotation concerns.This is a future improvement opportunity, not a blocker for this PR.
.github/workflows/cleanup-azure-functions.yml (1)
95-105: Consider logging when storage accounts are found but have unexpected tags.If the deployment fails partway through or tags aren't applied correctly, storage accounts might exist without the expected tags. The current logic handles this gracefully, but a debug log could aid troubleshooting.
🔎 Optional: Add debug logging for all storage accounts
+ # Debug: List all storage accounts for visibility + echo "All storage accounts in resource group:" + az storage account list \ + --resource-group ${{ env.AZURE_RESOURCE_GROUP }} \ + --query "[].{name:name, env:tags.Environment, func:tags.FunctionApp}" \ + --output table || true + if [ -n "$STORAGE_ACCOUNTS" ]; then for STORAGE in $STORAGE_ACCOUNTS; dodocs/AZURE-MONITORING-ALERTS.md (1)
34-36: Consider using placeholder email in documentation.The example uses what appears to be a real email address. Consider using a generic placeholder like
[email protected]or[email protected].🔎 Suggested change
-4. Value: Your email address (e.g., `[email protected]`) +4. Value: Your email address (e.g., `[email protected]`).azure/alert-rules.bicep (2)
16-18: Consider parameterizing the Application Insights name.The Application Insights name is hard-coded as
'yolo-funk-insights'. If your naming convention changes or you deploy to different environments with different insights instances, this will cause issues.🔎 Proposed refactor to add parameter
@description('Resource location') param location string = resourceGroup().location + +@description('Application Insights name') +param appInsightsName string = 'yolo-funk-insights' // Get existing Application Insights resource appInsights 'Microsoft.Insights/components@2020-02-02' existing = { - name: 'yolo-funk-insights' + name: appInsightsName }
137-142: Timer trigger failure detection is heuristic-based.The query searches for "Timer trigger" with "error" or "failed" in trace messages. This is a reasonable heuristic but may:
- Miss failures with different error message formats
- Generate false positives if "error" or "failed" appear in non-failure contexts
Consider supplementing this with additional telemetry or custom logging for more reliable timer trigger monitoring.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
.azure/alert-rules.bicep.azure/function-app.bicep.github/workflows/cleanup-azure-functions.yml.github/workflows/deploy-azure-functions.yml.github/workflows/dotnet.ymldocs/AZURE-LOGGING-QUERIES.mddocs/AZURE-LOGGING-TROUBLESHOOT.mddocs/AZURE-MONITORING-ALERTS.mddocs/STRATEGY-ARCHITECTURE.mdsrc/YoloFunk/appsettings.jsonsrc/YoloFunk/docs/DEPLOYMENT.mdsrc/YoloFunk/host.json
💤 Files with no reviewable changes (1)
- src/YoloFunk/docs/DEPLOYMENT.md
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
docs/AZURE-LOGGING-QUERIES.md
26-26: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
34-34: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
262-262: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (14)
.github/workflows/dotnet.yml (3)
5-7: LGTM!The permissions are appropriately scoped -
contents: readfor checkout andpull-requests: writefor commenting on Dependabot PRs.
11-12: LGTM!Good approach to conditionally skip the environment requirement for Dependabot PRs, as they cannot access repository secrets.
25-37: LGTM!Clean implementation of conditional test execution - full tests with secrets for regular PRs, filtered unit tests for Dependabot. The
Category!=Integrationfilter correctly excludes integration tests.docs/AZURE-LOGGING-QUERIES.md (1)
1-300: LGTM! Comprehensive logging guide.Well-structured documentation covering log locations, KQL queries, structured logging patterns, and troubleshooting. The guidance on disabling sampling (line 85) aligns with the
host.jsonchanges in this PR..azure/function-app.bicep (2)
31-36: LGTM! Clean tag management.Using
union()to merge tags ensures environment-specific tags take precedence while preserving any custom tags passed in. This aligns well with the tag-based cleanup logic in the workflow.
173-180: LGTM! Schedule settings now persisted in infrastructure.Good fix for the environment variable persistence issue. The NCRONTAB expressions are valid Azure Functions format (6 fields: second minute hour day month day-of-week).
src/YoloFunk/appsettings.json (1)
6-16: LGTM! Well-structured logging configuration.Adding explicit categories for
YoloFunk,YoloTrades, andYoloBrokerenables granular log filtering. The ApplicationInsights section ensures consistent log levels when writing to Application Insights.src/YoloFunk/host.json (2)
5-7: Good fix: Disabling sampling prevents log loss.Setting
isEnabled: falseensures all logs reach Application Insights. This is the correct fix for the "only seeing framework logs" issue mentioned in the documentation.
17-17: Verify timeout aligns with function execution needs.The 10-minute timeout is the maximum for Consumption plan. Ensure your rebalance operations complete well within this window, or consider Premium plan if longer execution is needed.
.github/workflows/cleanup-azure-functions.yml (1)
89-105: LGTM! Tag-based cleanup is more reliable.Using
tags.Environmentandtags.FunctionAppfor storage account discovery is a robust improvement over name-based matching. This aligns with theresourceTagsimplementation infunction-app.bicep.docs/AZURE-MONITORING-ALERTS.md (1)
1-140: LGTM! Clear and actionable documentation.Good explanation of the environment variable persistence issue, with concrete steps for setting up email alerts. The troubleshooting section covers common issues effectively.
docs/AZURE-LOGGING-TROUBLESHOOT.md (1)
1-113: LGTM! Excellent quick-reference troubleshooting guide.This document provides actionable steps for the most common logging issue. The quick reference table (lines 99-106) and common issues checklist (lines 107-113) are particularly useful for rapid diagnosis.
.azure/alert-rules.bicep (1)
79-118: LGTM! Exception alert implementation looks solid.The scheduled query rule correctly filters exceptions by
cloud_RoleNameand alerts on any exception count. The 5-minute evaluation frequency with a 15-minute window provides good balance between responsiveness and noise reduction..github/workflows/deploy-azure-functions.yml (1)
65-70: LGTM! Dependabot skip logic is correct.Properly prevents deployment failures for Dependabot PRs that lack access to secrets. This is a best practice for workflows requiring authenticated Azure operations.
| name: 'FunctionExecutionFailures' | ||
| metricName: 'FunctionExecutionCount' | ||
| dimensions: [ | ||
| { | ||
| name: 'SuccessStatus' | ||
| operator: 'Include' | ||
| values: [ | ||
| 'False' | ||
| ] | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🌐 Web query:
Azure Functions FunctionExecutionCount metric dimensions official documentation
💡 Result:
Short answer: The FunctionExecutionCount metric for Microsoft.Web/sites supports the "Instance" dimension. See Microsoft docs: Supported metrics (Microsoft.Web/sites) table and the provider REST metadata showing dimensions. [1][2][3]
Sources:
- Supported metrics — Microsoft.Web/sites (Azure Monitor). [1]
- Monitoring data reference — Azure Functions (shows FunctionExecutionCount with Instance). [2]
- App Service provider REST metadata (List Operations) — includes FunctionExecutionCount with dimensions: Instance/ServerName. [3]
Correct the metric dimension name for FunctionExecutionCount.
The dimension name 'SuccessStatus' is not supported for the FunctionExecutionCount metric. According to Azure Monitor documentation, FunctionExecutionCount supports the 'Instance' or 'ServerName' dimensions, not status-based filtering. Use one of these standard dimensions instead.
🤖 Prompt for AI Agents
In .azure/alert-rules.bicep around lines 54 to 63, the metric dimension name
'SuccessStatus' is invalid for FunctionExecutionCount; replace it with a
supported dimension such as 'Instance' or 'ServerName' and update the values
array to contain the appropriate instance/server identifiers (or remove values
if you intend to include all). Ensure the operator remains appropriate (e.g.,
'Include') and that the chosen dimension and values match the Function app
telemetry naming.
Summary by CodeRabbit
Release Notes
New Features
Documentation
Configuration
✏️ Tip: You can customize this high-level summary in your review settings.