Azure monitoring alerts and logging improvements #88

moconnell · 2025-12-23T10:26:56Z

Summary by CodeRabbit

Release Notes

New Features
- Added monitoring and alerting system with email notifications for function execution failures, exceptions, and timer trigger failures.
Documentation
- Added Azure logging and troubleshooting guides for diagnosing and resolving issues.
- Added comprehensive monitoring and alerts setup documentation.
Configuration
- Enhanced logging configuration for improved observability and diagnostics.
- Added strategy scheduling configurations.
- Updated function timeout settings and improved resource tagging for operations.

- fix: dependabot build fails (no secrets)
- remove hard-coded location etc.
- add alerts (prod)

coderabbitai · 2025-12-23T10:27:02Z

Walkthrough

This PR introduces Azure monitoring and alerting infrastructure via new Bicep modules, updates deployment and cleanup workflows to support automated alert rule deployment with tag-based resource identification, adds comprehensive logging and alert documentation, and configures structured logging for Azure Functions runtime.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Infrastructure as Code `.azure/alert-rules.bicep` | New Bicep module creating email Action Group and four alert rules (Function Execution Failures, Exceptions, Timer Trigger Failures) tied to Function App and Application Insights, outputting resource IDs. |
| Infrastructure as Code `.azure/function-app.bicep` | Introduces dynamic Key Vault URI handling with environment suffixes; adds resource-wide tag merging (Environment, ManagedBy, FunctionApp metadata); adds schedule configurations for Yolo strategies; outputs storageAccountName. |
| CI/CD Workflows `.github/workflows/cleanup-azure-functions.yml` | Refactors storage account discovery from name-prefix matching to tag-based filtering (Environment, FunctionApp tags); removes per-item environment validation; consolidates deletion logic. |
| CI/CD Workflows `.github/workflows/deploy-azure-functions.yml` | Replaces global env variables with repository vars for resource group/location; adds Dependabot PR detection (deploy=false); adds conditional alert rule deployment step (prod only); updates branch filter pattern (feat\/\\*); adds workflow path triggers for build files. |
| CI/CD Workflows `.github/workflows/dotnet.yml` | Adds workflow permissions; adds Dependabot PR detection with conditional test skipping and integration test filtering; adds post-merge comment for skipped tests. |
| Application Configuration `src/YoloFunk/appsettings.json` | Adds Logging categories for YoloFunk, YoloTrades, YoloBroker (Information level); adds ApplicationInsights LogLevel configuration (Default: Information, Microsoft: Warning). |
| Application Configuration `src/YoloFunk/host.json` | Disables Request sampling (isEnabled: false); adds global logLevel defaults (Information for most, Warning for Host.Aggregator); adds functionTimeout setting (00:10:00). |
| Documentation `docs/AZURE-LOGGING-QUERIES.md` | New comprehensive guide covering Application Insights logging, KQL queries, access methods, best practices for structured logging, retention, and troubleshooting. |
| Documentation `docs/AZURE-LOGGING-TROUBLESHOOT.md` | New quick-reference guide for diagnosing missing logs, including sampling fixes, KQL filters, Application Insights validation, and troubleshooting checklist. |
| Documentation `docs/AZURE-MONITORING-ALERTS.md` | New comprehensive guide detailing alert setup, GitHub repository variable configuration, alert rule conditions, evaluation windows, manual setup, testing, and best practices. |
| Documentation `src/YoloFunk/docs/DEPLOYMENT.md` | Deleted comprehensive Azure Functions deployment guide covering local setup, deployment options, configuration, infrastructure, monitoring, and CI/CD practices. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

Feat: Azure Function Migration & Key Vault Integration #83: Modifies the same Azure deployment artifacts (.azure/function-app.bicep, CI workflows, Application Insights provisioning) with overlapping resource configuration and telemetry output adjustments.

Suggested labels

documentation

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Title check | ✅ Passed | The title directly summarizes the main changes: introduction of Azure monitoring alerts and logging improvements across multiple configuration files, bicep modules, and documentation. |

github-actions · 2025-12-23T10:32:44Z

🚀 Deployed to Azure Functions

Environment: pr-88
Function App: yolo-funk-pr-88
URL: https://yolo-funk-pr-88.azurewebsites.net
Network: testnet

This environment will be automatically cleaned up when the PR is closed.

- logs not appearing in App Insights

github-actions · 2025-12-23T11:04:51Z

🚀 Deployed to Azure Functions

Environment: pr-88
Function App: yolo-funk-pr-88
URL: https://yolo-funk-pr-88.azurewebsites.net
Network: testnet

This environment will be automatically cleaned up when the PR is closed.

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (8)

.github/workflows/dotnet.yml (1)
49-60: Consider duplicate comment prevention.

The step will run on every Dependabot PR workflow execution. If the PR has multiple pushes or re-runs, this could create duplicate comments.
🔎 Optional: Add duplicate comment check
      - name: Comment on Dependabot PR
        if: github.actor == 'dependabot[bot]' && github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          script: |
+           // Check for existing comment to avoid duplicates
+           const comments = await github.rest.issues.listComments({
+             issue_number: context.issue.number,
+             owner: context.repo.owner,
+             repo: context.repo.repo
+           });
+           const hasComment = comments.data.some(c => 
+             c.user.login === 'github-actions[bot]' && 
+             c.body.includes('Build Successful (Dependabot)')
+           );
+           if (hasComment) return;
+           
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `### ✅ Build Successful (Dependabot)\n\n**Note:** Integration tests requiring secrets were skipped for security.\n\n- ✅ Build completed\n- ✅ Unit tests passed\n- ⏭️ Integration tests skipped (no secrets)\n- ⏭️ Azure deployment skipped\n\nOnce merged, full integration tests and deployment will run automatically.`
            })
docs/AZURE-LOGGING-QUERIES.md (2)
26-28: Add language specifier to fenced code blocks.

Per static analysis (markdownlint MD040), these navigation path code blocks should have a language specifier. Using text is appropriate for plain text content.
🔎 Suggested fix
```diff
-```
+```text
 Function App → Monitoring → Log stream
```
Apply similar fix to lines 34 and 262.

Also applies to: 34-36, 262-264

279-281: Consider updating deprecated instrumentation key reference.

`APPINSIGHTS_INSTRUMENTATIONKEY` is being deprecated in favor of `APPLICATIONINSIGHTS_CONNECTION_STRING`. Consider updating the troubleshooting guidance to mention both, with the connection string as the primary reference.
🔎 Suggested update
```diff
 **Logs not appearing:**
 
-- Check Application Insights connection string is configured
-- Verify `APPINSIGHTS_INSTRUMENTATIONKEY` in function app settings
+- Check `APPLICATIONINSIGHTS_CONNECTION_STRING` is configured in function app settings
+- (Legacy) Verify `APPINSIGHTS_INSTRUMENTATIONKEY` if using older setup
 - Wait 2-3 minutes for logs to appear in Application Insights
```
.azure/function-app.bicep (1)

84-90: Consider using managed identity for storage access.

The storage connection strings use account keys, which is the standard pattern. When feasible, consider migrating to managed identity-based access (AzureWebJobsStorage__accountName) to eliminate key rotation concerns.

This is a future improvement opportunity, not a blocker for this PR.
.github/workflows/cleanup-azure-functions.yml (1)
95-105: Consider logging when storage accounts are found but have unexpected tags.

If the deployment fails partway through or tags aren't applied correctly, storage accounts might exist without the expected tags. The current logic handles this gracefully, but a debug log could aid troubleshooting.
🔎 Optional: Add debug logging for all storage accounts
+           # Debug: List all storage accounts for visibility
+           echo "All storage accounts in resource group:"
+           az storage account list \
+             --resource-group ${{ env.AZURE_RESOURCE_GROUP }} \
+             --query "[].{name:name, env:tags.Environment, func:tags.FunctionApp}" \
+             --output table || true
+
            if [ -n "$STORAGE_ACCOUNTS" ]; then
              for STORAGE in $STORAGE_ACCOUNTS; do
docs/AZURE-MONITORING-ALERTS.md (1)
34-36: Consider using placeholder email in documentation.

The example uses what appears to be a real email address. Consider using a generic placeholder like [email protected] or [email protected].
🔎 Suggested change
-4. Value: Your email address (e.g., `[email protected]`)
+4. Value: Your email address (e.g., `[email protected]`)
.azure/alert-rules.bicep (2)
16-18: Consider parameterizing the Application Insights name.

The Application Insights name is hard-coded as 'yolo-funk-insights'. If your naming convention changes or you deploy to different environments with different insights instances, this will cause issues.
🔎 Proposed refactor to add parameter
 @description('Resource location')
 param location string = resourceGroup().location
+
+@description('Application Insights name')
+param appInsightsName string = 'yolo-funk-insights'

 // Get existing Application Insights
 resource appInsights 'Microsoft.Insights/components@2020-02-02' existing = {
-  name: 'yolo-funk-insights'
+  name: appInsightsName
 }
137-142: Timer trigger failure detection is heuristic-based.

The query searches for "Timer trigger" with "error" or "failed" in trace messages. This is a reasonable heuristic but may:

- Miss failures with different error message formats
- Generate false positives if "error" or "failed" appear in non-failure contexts

Consider supplementing this with additional telemetry or custom logging for more reliable timer trigger monitoring.
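
To make the two failure modes concrete, here is a minimal sketch of a text match in the spirit of the heuristic described above. It is not the PR's actual KQL query: the function name `matchesTimerFailureHeuristic`, the timer name `RebalanceTimer`, and all three trace messages are invented for illustration, and the match is assumed to be case-insensitive.

```javascript
// Illustrative stand-in for the alert query's text match; not the PR's actual KQL.
function matchesTimerFailureHeuristic(message) {
  const text = message.toLowerCase();
  const mentionsTimer = text.includes("timer trigger");
  const mentionsFailure = text.includes("error") || text.includes("failed");
  return mentionsTimer && mentionsFailure;
}

// Hypothetical trace messages, invented for this example.
const sampleTraces = [
  { message: "Timer trigger RebalanceTimer failed: broker unavailable", actuallyFailed: true },
  { message: "Timer trigger RebalanceTimer threw an unhandled exception", actuallyFailed: true },
  { message: "Timer trigger RebalanceTimer completed with 0 errors", actuallyFailed: false },
];

// Compare what the heuristic flags with what really happened.
const heuristicResults = sampleTraces.map((trace) => ({
  ...trace,
  flagged: matchesTimerFailureHeuristic(trace.message),
}));

for (const result of heuristicResults) {
  const verdict = result.flagged === result.actuallyFailed ? "correct" : "misclassified";
  console.log(`${verdict}: ${result.message}`);
}
```

Under these assumptions the second message is a missed failure (it says "exception", not "error" or "failed") and the third is a false positive (a successful run that mentions "0 errors"), which is why a dedicated failure signal from custom logging would likely be more reliable.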

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fdb8e68 and 037138d.

📒 Files selected for processing (12)

.azure/alert-rules.bicep
.azure/function-app.bicep
.github/workflows/cleanup-azure-functions.yml
.github/workflows/deploy-azure-functions.yml
.github/workflows/dotnet.yml
docs/AZURE-LOGGING-QUERIES.md
docs/AZURE-LOGGING-TROUBLESHOOT.md
docs/AZURE-MONITORING-ALERTS.md
docs/STRATEGY-ARCHITECTURE.md
src/YoloFunk/appsettings.json
src/YoloFunk/docs/DEPLOYMENT.md
src/YoloFunk/host.json

💤 Files with no reviewable changes (1)

src/YoloFunk/docs/DEPLOYMENT.md

🧰 Additional context used

🪛 markdownlint-cli2 (0.18.1)

docs/AZURE-LOGGING-QUERIES.md

26-26: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

34-34: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

262-262: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (14)

.github/workflows/dotnet.yml (3)

5-7: LGTM!

The permissions are appropriately scoped - contents: read for checkout and pull-requests: write for commenting on Dependabot PRs.

11-12: LGTM!

Good approach to conditionally skip the environment requirement for Dependabot PRs, as they cannot access repository secrets.

25-37: LGTM!

Clean implementation of conditional test execution - full tests with secrets for regular PRs, filtered unit tests for Dependabot. The Category!=Integration filter correctly excludes integration tests.

docs/AZURE-LOGGING-QUERIES.md (1)

1-300: LGTM! Comprehensive logging guide.

Well-structured documentation covering log locations, KQL queries, structured logging patterns, and troubleshooting. The guidance on disabling sampling (line 85) aligns with the host.json changes in this PR.

.azure/function-app.bicep (2)

31-36: LGTM! Clean tag management.

Using union() to merge tags ensures environment-specific tags take precedence while preserving any custom tags passed in. This aligns well with the tag-based cleanup logic in the workflow.
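
The precedence rule can be sketched outside Bicep. The snippet below is only an analogy in JavaScript, assuming the template calls `union()` with the custom tags first and the environment-specific tags last, so that the last value for a duplicate key wins. The tag names come from the PR summary; the tag values and the helper `mergeTags` are hypothetical.

```javascript
// Custom tags passed in by the caller (hypothetical values).
const customTags = { Owner: "platform-team", Environment: "unset" };

// Environment-specific tags. The names appear in the PR summary;
// the values here are assumptions made for this example.
const environmentTags = {
  Environment: "pr-88",
  ManagedBy: "bicep",
  FunctionApp: "yolo-funk-pr-88",
};

// Shallow merge where later tag sets override earlier ones on duplicate keys.
function mergeTags(...tagSets) {
  return Object.assign({}, ...tagSets);
}

const resourceTags = mergeTags(customTags, environmentTags);
console.log(resourceTags);
```

Here `Owner` survives from the custom tags while `Environment` takes the environment-specific value, which is the property the tag-based cleanup relies on.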

173-180: LGTM! Schedule settings now persisted in infrastructure.

Good fix for the environment variable persistence issue. The NCRONTAB expressions are valid Azure Functions format (6 fields: second minute hour day month day-of-week).
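
As a quick sanity check on the six-field layout mentioned above, the following sketch splits an expression and labels each field. It only counts and names fields; it does not validate ranges or step syntax, and it is not how the Functions runtime parses schedules. The helper `describeNcrontab` and the sample expression are hypothetical and are not taken from the PR.

```javascript
// Field order as listed in the review comment above.
const NCRONTAB_FIELDS = ["second", "minute", "hour", "day", "month", "day-of-week"];

// Splits an expression on whitespace and maps each part to its field name.
// Throws if the expression does not have exactly six fields.
function describeNcrontab(expression) {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== NCRONTAB_FIELDS.length) {
    throw new Error(
      `expected ${NCRONTAB_FIELDS.length} fields but found ${parts.length}: "${expression}"`
    );
  }
  return Object.fromEntries(NCRONTAB_FIELDS.map((name, index) => [name, parts[index]]));
}

// Hypothetical schedule: at second 0 of every 5th minute.
const everyFiveMinutes = describeNcrontab("0 */5 * * * *");
console.log(everyFiveMinutes);
```

A check like this is mainly useful for catching five-field cron expressions copied from other tools, where the leading seconds field is missing.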

src/YoloFunk/appsettings.json (1)

6-16: LGTM! Well-structured logging configuration.

Adding explicit categories for YoloFunk, YoloTrades, and YoloBroker enables granular log filtering. The ApplicationInsights section ensures consistent log levels when writing to Application Insights.

src/YoloFunk/host.json (2)

5-7: Good fix: Disabling sampling prevents log loss.

Setting isEnabled: false ensures all logs reach Application Insights. This is the correct fix for the "only seeing framework logs" issue mentioned in the documentation.

17-17: Verify timeout aligns with function execution needs.

The 10-minute timeout is the maximum for the Consumption plan. Ensure your rebalance operations complete well within this window, or consider the Premium plan if longer execution is needed.

.github/workflows/cleanup-azure-functions.yml (1)

89-105: LGTM! Tag-based cleanup is more reliable.

Using tags.Environment and tags.FunctionApp for storage account discovery is a robust improvement over name-based matching. This aligns with the resourceTags implementation in function-app.bicep.
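
The selection logic can be sketched as a plain filter. This is an illustration of the idea only, not the workflow's actual `az` query: the account names, tag values, and the helper `findStorageAccountsByTags` are all hypothetical, and only the tag names come from the PR.

```javascript
// Hypothetical storage accounts as they might appear in a resource group listing.
const storageAccounts = [
  { name: "stexample001", tags: { Environment: "pr-88", FunctionApp: "yolo-funk-pr-88" } },
  { name: "stexample002", tags: { Environment: "prod", FunctionApp: "yolo-funk" } },
  { name: "stexample003", tags: {} }, // deployed, but tags were never applied
  { name: "stexample004" },           // no tags property at all
];

// Returns the names of accounts whose Environment and FunctionApp tags both match.
// Accounts with missing tags are skipped rather than treated as errors.
function findStorageAccountsByTags(accounts, environment, functionApp) {
  return accounts
    .filter(
      (account) =>
        account.tags?.Environment === environment &&
        account.tags?.FunctionApp === functionApp
    )
    .map((account) => account.name);
}

const toDelete = findStorageAccountsByTags(storageAccounts, "pr-88", "yolo-funk-pr-88");
console.log(toDelete);
```

Note that untagged accounts are silently excluded in this sketch, which mirrors the nitpick earlier in this review about logging accounts that exist but lack the expected tags.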

docs/AZURE-MONITORING-ALERTS.md (1)

1-140: LGTM! Clear and actionable documentation.

Good explanation of the environment variable persistence issue, with concrete steps for setting up email alerts. The troubleshooting section covers common issues effectively.

docs/AZURE-LOGGING-TROUBLESHOOT.md (1)

1-113: LGTM! Excellent quick-reference troubleshooting guide.

This document provides actionable steps for the most common logging issue. The quick reference table (lines 99-106) and common issues checklist (lines 107-113) are particularly useful for rapid diagnosis.

.azure/alert-rules.bicep (1)

79-118: LGTM! Exception alert implementation looks solid.

The scheduled query rule correctly filters exceptions by cloud_RoleName and alerts on any exception count. The 5-minute evaluation frequency with a 15-minute window provides good balance between responsiveness and noise reduction.

.github/workflows/deploy-azure-functions.yml (1)

65-70: LGTM! Dependabot skip logic is correct.

Properly prevents deployment failures for Dependabot PRs that lack access to secrets. This is a best practice for workflows requiring authenticated Azure operations.

coderabbitai · 2025-12-23T11:45:51Z

.azure/alert-rules.bicep

+          name: 'FunctionExecutionFailures'
+          metricName: 'FunctionExecutionCount'
+          dimensions: [
+            {
+              name: 'SuccessStatus'
+              operator: 'Include'
+              values: [
+                'False'
+              ]
+            }


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Azure Functions FunctionExecutionCount metric dimensions official documentation

💡 Result:

Short answer: The FunctionExecutionCount metric for Microsoft.Web/sites supports the "Instance" dimension. See Microsoft docs: Supported metrics (Microsoft.Web/sites) table and the provider REST metadata showing dimensions. [1][2][3]

Sources:

Supported metrics — Microsoft.Web/sites (Azure Monitor). [1]

Monitoring data reference — Azure Functions (shows FunctionExecutionCount with Instance). [2]

App Service provider REST metadata (List Operations) — includes FunctionExecutionCount with dimensions: Instance/ServerName. [3]

Correct the metric dimension name for FunctionExecutionCount.

The dimension name 'SuccessStatus' is not supported for the FunctionExecutionCount metric. According to Azure Monitor documentation, FunctionExecutionCount supports the 'Instance' or 'ServerName' dimensions, not status-based filtering. Use one of these standard dimensions instead.

🤖 Prompt for AI Agents

In .azure/alert-rules.bicep around lines 54 to 63, the metric dimension name 'SuccessStatus' is invalid for FunctionExecutionCount; replace it with a supported dimension such as 'Instance' or 'ServerName' and update the values array to contain the appropriate instance/server identifiers (or remove values if you intend to include all). Ensure the operator remains appropriate (e.g., 'Include') and that the chosen dimension and values match the Function app telemetry naming.

.github/workflows/deploy-azure-functions.yml

moconnell added 3 commits December 23, 2025 11:21

chore: consolidate docs under one folder

49b93c0

fix(ci): various tidy-ups

abaeaf9

- fix: dependabot build fails (no secrets)
- remove hard-coded location etc.
- add alerts (prod)

Merge remote-tracking branch 'origin/master' into fix/yolofunk-20251223

831113a

moconnell self-assigned this Dec 23, 2025

moconnell added the bug, enhancement, and Azure labels Dec 23, 2025

moconnell temporarily deployed to development December 23, 2025 10:27 — with GitHub Actions Inactive

moconnell temporarily deployed to development December 23, 2025 10:29 — with GitHub Actions Inactive

fix(logs): adjust logging settings

037138d

- logs not appearing in App Insights

moconnell temporarily deployed to development December 23, 2025 11:00 — with GitHub Actions Inactive

moconnell deployed to development December 23, 2025 11:01 — with GitHub Actions Active

moconnell marked this pull request as ready for review December 23, 2025 11:41

coderabbitai bot reviewed Dec 23, 2025

moconnell changed the title ~~Various YoloFunk Fixes (2025-12-23)~~ Azure monitoring alerts and logging improvements Dec 23, 2025
