@moconnell moconnell commented Dec 23, 2025

Summary by CodeRabbit

Release Notes

  • New Features

    • Added monitoring and alerting system with email notifications for function execution failures, exceptions, and timer trigger failures.
  • Documentation

    • Added Azure logging and troubleshooting guides for diagnosing and resolving issues.
    • Added comprehensive monitoring and alerts setup documentation.
  • Configuration

    • Enhanced logging configuration for improved observability and diagnostics.
    • Added strategy scheduling configurations.
    • Updated function timeout settings and improved resource tagging for operations.


@moconnell moconnell self-assigned this Dec 23, 2025
@moconnell moconnell added the bug, enhancement, and Azure labels Dec 23, 2025

coderabbitai bot commented Dec 23, 2025

Walkthrough

This PR introduces Azure monitoring and alerting infrastructure via new Bicep modules, updates deployment and cleanup workflows to support automated alert rule deployment with tag-based resource identification, adds comprehensive logging and alert documentation, and configures structured logging for Azure Functions runtime.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Infrastructure as Code<br>`.azure/alert-rules.bicep` | New Bicep module creating email Action Group and four alert rules (Function Execution Failures, Exceptions, Timer Trigger Failures) tied to Function App and Application Insights, outputting resource IDs. |
| Infrastructure as Code<br>`.azure/function-app.bicep` | Introduces dynamic Key Vault URI handling with environment suffixes; adds resource-wide tag merging (Environment, ManagedBy, FunctionApp metadata); adds schedule configurations for Yolo strategies; outputs storageAccountName. |
| CI/CD Workflows<br>`.github/workflows/cleanup-azure-functions.yml` | Refactors storage account discovery from name-prefix matching to tag-based filtering (Environment, FunctionApp tags); removes per-item environment validation; consolidates deletion logic. |
| CI/CD Workflows<br>`.github/workflows/deploy-azure-functions.yml` | Replaces global env variables with repository vars for resource group/location; adds Dependabot PR detection (deploy=false); adds conditional alert rule deployment step (prod only); updates branch filter pattern (feat\/\\*); adds workflow path triggers for build files. |
| CI/CD Workflows<br>`.github/workflows/dotnet.yml` | Adds workflow permissions; adds Dependabot PR detection with conditional test skipping and integration test filtering; adds post-merge comment for skipped tests. |
| Application Configuration<br>`src/YoloFunk/appsettings.json` | Adds Logging categories for YoloFunk, YoloTrades, YoloBroker (Information level); adds ApplicationInsights LogLevel configuration (Default: Information, Microsoft: Warning). |
| Application Configuration<br>`src/YoloFunk/host.json` | Disables Request sampling (isEnabled: false); adds global logLevel defaults (Information for most, Warning for Host.Aggregator); adds functionTimeout setting (00:10:00). |
| Documentation<br>`docs/AZURE-LOGGING-QUERIES.md` | New comprehensive guide covering Application Insights logging, KQL queries, access methods, best practices for structured logging, retention, and troubleshooting. |
| Documentation<br>`docs/AZURE-LOGGING-TROUBLESHOOT.md` | New quick-reference guide for diagnosing missing logs, including sampling fixes, KQL filters, Application Insights validation, and troubleshooting checklist. |
| Documentation<br>`docs/AZURE-MONITORING-ALERTS.md` | New comprehensive guide detailing alert setup, GitHub repository variable configuration, alert rule conditions, evaluation windows, manual setup, testing, and best practices. |
| Documentation<br>`src/YoloFunk/docs/DEPLOYMENT.md` | Deleted comprehensive Azure Functions deployment guide covering local setup, deployment options, configuration, infrastructure, monitoring, and CI/CD practices. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Suggested labels

documentation

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Title check | ✅ Passed | The title directly summarizes the main changes: introduction of Azure monitoring alerts and logging improvements across multiple configuration files, bicep modules, and documentation. |
@github-actions

🚀 Deployed to Azure Functions

Environment: pr-88
Function App: yolo-funk-pr-88
URL: https://yolo-funk-pr-88.azurewebsites.net
Network: testnet

This environment will be automatically cleaned up when the PR is closed.

- logs not appearing in App Insights
@moconnell moconnell marked this pull request as ready for review December 23, 2025 11:41
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (8)
.github/workflows/dotnet.yml (1)

49-60: Consider duplicate comment prevention.

The step will run on every Dependabot PR workflow execution. If the PR has multiple pushes or re-runs, this could create duplicate comments.

🔎 Optional: Add duplicate comment check
```diff
      - name: Comment on Dependabot PR
        if: github.actor == 'dependabot[bot]' && github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          script: |
+           // Check for existing comment to avoid duplicates
+           const comments = await github.rest.issues.listComments({
+             issue_number: context.issue.number,
+             owner: context.repo.owner,
+             repo: context.repo.repo
+           });
+           const hasComment = comments.data.some(c =>
+             c.user.login === 'github-actions[bot]' &&
+             c.body.includes('Build Successful (Dependabot)')
+           );
+           if (hasComment) return;
+
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `### ✅ Build Successful (Dependabot)\n\n**Note:** Integration tests requiring secrets were skipped for security.\n\n- ✅ Build completed\n- ✅ Unit tests passed\n- ⏭️ Integration tests skipped (no secrets)\n- ⏭️ Azure deployment skipped\n\nOnce merged, full integration tests and deployment will run automatically.`
            })
```
docs/AZURE-LOGGING-QUERIES.md (2)

26-28: Add language specifier to fenced code blocks.

Per static analysis (markdownlint MD040), these navigation path code blocks should have a language specifier. Using text is appropriate for plain text content.

🔎 Suggested fix

```diff
-```
+```text
 Function App → Monitoring → Log stream
```

Apply similar fix to lines 34 and 262.

Also applies to: 34-36, 262-264

279-281: Consider updating deprecated instrumentation key reference.

`APPINSIGHTS_INSTRUMENTATIONKEY` is being deprecated in favor of `APPLICATIONINSIGHTS_CONNECTION_STRING`. Consider updating the troubleshooting guidance to mention both, with the connection string as the primary reference.

🔎 Suggested update

```diff
 **Logs not appearing:**
 
-- Check Application Insights connection string is configured
-- Verify `APPINSIGHTS_INSTRUMENTATIONKEY` in function app settings
+- Check `APPLICATIONINSIGHTS_CONNECTION_STRING` is configured in function app settings
+- (Legacy) Verify `APPINSIGHTS_INSTRUMENTATIONKEY` if using older setup
 - Wait 2-3 minutes for logs to appear in Application Insights
```
.azure/function-app.bicep (1)

84-90: Consider using managed identity for storage access.

The storage connection strings use account keys, which is the standard pattern. When feasible, consider migrating to managed identity-based access (AzureWebJobsStorage__accountName) to eliminate key rotation concerns.

This is a future improvement opportunity, not a blocker for this PR.

.github/workflows/cleanup-azure-functions.yml (1)

95-105: Consider logging when storage accounts are found but have unexpected tags.

If the deployment fails partway through or tags aren't applied correctly, storage accounts might exist without the expected tags. The current logic handles this gracefully, but a debug log could aid troubleshooting.

🔎 Optional: Add debug logging for all storage accounts
```diff
+           # Debug: List all storage accounts for visibility
+           echo "All storage accounts in resource group:"
+           az storage account list \
+             --resource-group ${{ env.AZURE_RESOURCE_GROUP }} \
+             --query "[].{name:name, env:tags.Environment, func:tags.FunctionApp}" \
+             --output table || true
+
            if [ -n "$STORAGE_ACCOUNTS" ]; then
              for STORAGE in $STORAGE_ACCOUNTS; do
```
docs/AZURE-MONITORING-ALERTS.md (1)

34-36: Consider using placeholder email in documentation.

The example uses what appears to be a real email address. Consider using a generic placeholder like [email protected] or [email protected].

🔎 Suggested change
-4. Value: Your email address (e.g., `[email protected]`)
+4. Value: Your email address (e.g., `[email protected]`)
.azure/alert-rules.bicep (2)

16-18: Consider parameterizing the Application Insights name.

The Application Insights name is hard-coded as 'yolo-funk-insights'. If your naming convention changes or you deploy to different environments with different insights instances, this will cause issues.

🔎 Proposed refactor to add parameter
```diff
 @description('Resource location')
 param location string = resourceGroup().location
+
+@description('Application Insights name')
+param appInsightsName string = 'yolo-funk-insights'

 // Get existing Application Insights
 resource appInsights 'Microsoft.Insights/components@2020-02-02' existing = {
-  name: 'yolo-funk-insights'
+  name: appInsightsName
 }
```

137-142: Timer trigger failure detection is heuristic-based.

The query searches for "Timer trigger" with "error" or "failed" in trace messages. This is a reasonable heuristic but may:

  • Miss failures with different error message formats
  • Generate false positives if "error" or "failed" appear in non-failure contexts

Consider supplementing this with additional telemetry or custom logging for more reliable timer trigger monitoring.
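
To make the false-positive and false-negative risk concrete, here is a minimal JavaScript sketch of the same kind of substring heuristic. The function name and the sample messages are hypothetical and not taken from the PR; the actual KQL query in `alert-rules.bicep` may differ in detail.

```javascript
// Hypothetical re-creation of the heuristic: flag a trace when it mentions
// "Timer trigger" together with "error" or "failed" (case-insensitive).
function looksLikeTimerFailure(message) {
  const text = message.toLowerCase();
  return (
    text.includes("timer trigger") &&
    (text.includes("error") || text.includes("failed"))
  );
}

// Illustrative messages only; real trace text may look different.
const samples = [
  "Timer trigger failed: connection refused",      // real failure, flagged
  "Timer trigger finished with 0 errors",          // success, flagged anyway
  "Timer trigger aborted: NullReferenceException", // real failure, not flagged
];

const flagged = samples.map(looksLikeTimerFailure);
console.log(flagged);
```

The second sample shows a false positive and the third a missed failure, which is why a dedicated failure signal (for example, a custom log property) would likely be more reliable than message matching.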

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fdb8e68 and 037138d.

📒 Files selected for processing (12)
  • .azure/alert-rules.bicep
  • .azure/function-app.bicep
  • .github/workflows/cleanup-azure-functions.yml
  • .github/workflows/deploy-azure-functions.yml
  • .github/workflows/dotnet.yml
  • docs/AZURE-LOGGING-QUERIES.md
  • docs/AZURE-LOGGING-TROUBLESHOOT.md
  • docs/AZURE-MONITORING-ALERTS.md
  • docs/STRATEGY-ARCHITECTURE.md
  • src/YoloFunk/appsettings.json
  • src/YoloFunk/docs/DEPLOYMENT.md
  • src/YoloFunk/host.json
💤 Files with no reviewable changes (1)
  • src/YoloFunk/docs/DEPLOYMENT.md
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
docs/AZURE-LOGGING-QUERIES.md

26-26: Fenced code blocks should have a language specified (MD040, fenced-code-language)

34-34: Fenced code blocks should have a language specified (MD040, fenced-code-language)

262-262: Fenced code blocks should have a language specified (MD040, fenced-code-language)

🔇 Additional comments (14)
.github/workflows/dotnet.yml (3)

5-7: LGTM!

The permissions are appropriately scoped - contents: read for checkout and pull-requests: write for commenting on Dependabot PRs.


11-12: LGTM!

Good approach to conditionally skip the environment requirement for Dependabot PRs, as they cannot access repository secrets.


25-37: LGTM!

Clean implementation of conditional test execution - full tests with secrets for regular PRs, filtered unit tests for Dependabot. The Category!=Integration filter correctly excludes integration tests.

docs/AZURE-LOGGING-QUERIES.md (1)

1-300: LGTM! Comprehensive logging guide.

Well-structured documentation covering log locations, KQL queries, structured logging patterns, and troubleshooting. The guidance on disabling sampling (line 85) aligns with the host.json changes in this PR.

.azure/function-app.bicep (2)

31-36: LGTM! Clean tag management.

Using union() to merge tags ensures environment-specific tags take precedence while preserving any custom tags passed in. This aligns well with the tag-based cleanup logic in the workflow.
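
As a rough illustration of the precedence behavior described here, the following JavaScript sketch merges two flat tag objects so that the later one wins on duplicate keys. The tag names and values are hypothetical, and the sketch assumes the environment-specific tags are passed last; it mimics the merge semantics only and is not the Bicep code from the PR.

```javascript
// Hypothetical tag sets; the names and values are illustrative only.
const customTags = { Owner: "team-a", Environment: "custom" };
const managedTags = { Environment: "pr-88", ManagedBy: "bicep" };

// Object spread resolves duplicate keys in favor of the later object,
// so managedTags.Environment overrides customTags.Environment.
const resourceTags = { ...customTags, ...managedTags };

console.log(resourceTags);
```

With this ordering, custom tags such as `Owner` survive, while `Environment` always reflects the deployment, which is what the tag-based cleanup relies on.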


173-180: LGTM! Schedule settings now persisted in infrastructure.

Good fix for the environment variable persistence issue. The NCRONTAB expressions are valid Azure Functions format (6 fields: second minute hour day month day-of-week).
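
A quick way to catch a five-field cron expression slipping into these settings is to count fields. The sketch below is a minimal JavaScript check under that assumption; the helper names and the example expressions are hypothetical, and it verifies only the field count, not whether each field is valid.

```javascript
// Count whitespace-separated fields in a schedule expression.
function countCronFields(expression) {
  return expression.trim().split(/\s+/).length;
}

// NCRONTAB as used by Azure Functions timer triggers has six fields:
// second minute hour day month day-of-week.
function hasNcrontabFieldCount(expression) {
  return countCronFields(expression) === 6;
}

// Hypothetical expressions for illustration; not the schedules from the PR.
console.log(hasNcrontabFieldCount("0 */5 * * * *")); // six fields
console.log(hasNcrontabFieldCount("*/5 * * * *"));   // five-field cron
```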

src/YoloFunk/appsettings.json (1)

6-16: LGTM! Well-structured logging configuration.

Adding explicit categories for YoloFunk, YoloTrades, and YoloBroker enables granular log filtering. The ApplicationInsights section ensures consistent log levels when writing to Application Insights.

src/YoloFunk/host.json (2)

5-7: Good fix: Disabling sampling prevents log loss.

Setting isEnabled: false ensures all logs reach Application Insights. This is the correct fix for the "only seeing framework logs" issue mentioned in the documentation.


17-17: Verify timeout aligns with function execution needs.

The 10-minute timeout is the maximum for Consumption plan. Ensure your rebalance operations complete well within this window, or consider Premium plan if longer execution is needed.
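
One way to sanity-check this is to compare measured run durations against the configured timeout with some headroom. The JavaScript sketch below assumes the `hh:mm:ss` form used in `host.json`; the helper names, the sample durations, and the 20% margin are hypothetical.

```javascript
// Convert an "hh:mm:ss" timespan string, as used by functionTimeout, to seconds.
function timespanToSeconds(timespan) {
  const [hours, minutes, seconds] = timespan.split(":").map(Number);
  return hours * 3600 + minutes * 60 + seconds;
}

// Report whether a measured duration leaves the requested safety margin
// (a fraction of the timeout) unused.
function fitsWithinTimeout(durationSeconds, timespan, marginFraction) {
  const budget = timespanToSeconds(timespan) * (1 - marginFraction);
  return durationSeconds <= budget;
}

// Durations and the 20% margin are hypothetical values.
console.log(timespanToSeconds("00:10:00"));
console.log(fitsWithinTimeout(420, "00:10:00", 0.2));
console.log(fitsWithinTimeout(540, "00:10:00", 0.2));
```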

.github/workflows/cleanup-azure-functions.yml (1)

89-105: LGTM! Tag-based cleanup is more reliable.

Using tags.Environment and tags.FunctionApp for storage account discovery is a robust improvement over name-based matching. This aligns with the resourceTags implementation in function-app.bicep.
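
The difference between the two discovery approaches can be sketched in JavaScript as follows. The account names and tag values are hypothetical, and the records only approximate the shape of `az storage account list` output; the workflow itself uses an Azure CLI query rather than this code.

```javascript
// Hypothetical storage account records, roughly shaped like CLI output.
const accounts = [
  { name: "stpr88abc", tags: { Environment: "pr-88", FunctionApp: "yolo-funk-pr-88" } },
  { name: "stpr880xyz", tags: { Environment: "pr-880", FunctionApp: "yolo-funk-pr-880" } },
  { name: "stuntagged", tags: null },
];

// Select only accounts whose tags match both values exactly.
function findByTags(list, environment, functionApp) {
  return list
    .filter((a) => a.tags?.Environment === environment && a.tags?.FunctionApp === functionApp)
    .map((a) => a.name);
}

// Name-prefix matching, shown for contrast.
function findByPrefix(list, prefix) {
  return list.filter((a) => a.name.startsWith(prefix)).map((a) => a.name);
}

console.log(findByTags(accounts, "pr-88", "yolo-funk-pr-88"));
console.log(findByPrefix(accounts, "stpr88"));
```

In this made-up data set the prefix match also picks up the account belonging to another environment, while the exact tag match does not. Untagged accounts are skipped by the tag filter, which is the case the nitpick above suggests logging.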

docs/AZURE-MONITORING-ALERTS.md (1)

1-140: LGTM! Clear and actionable documentation.

Good explanation of the environment variable persistence issue, with concrete steps for setting up email alerts. The troubleshooting section covers common issues effectively.

docs/AZURE-LOGGING-TROUBLESHOOT.md (1)

1-113: LGTM! Excellent quick-reference troubleshooting guide.

This document provides actionable steps for the most common logging issue. The quick reference table (lines 99-106) and common issues checklist (lines 107-113) are particularly useful for rapid diagnosis.

.azure/alert-rules.bicep (1)

79-118: LGTM! Exception alert implementation looks solid.

The scheduled query rule correctly filters exceptions by cloud_RoleName and alerts on any exception count. The 5-minute evaluation frequency with a 15-minute window provides good balance between responsiveness and noise reduction.
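
To see how the frequency and window interact, the JavaScript sketch below lists which evaluations would include a given event. It assumes evaluations run at multiples of the frequency and that each one covers the interval (T - window, T]; the function name and the event time are hypothetical, and real alert timing may be offset.

```javascript
// List the evaluation times (in minutes) whose look-back window contains an
// event, assuming evaluations run at multiples of `frequency` and each one
// covers the interval (T - window, T].
function evaluationsSeeing(eventMinute, frequency, window) {
  const times = [];
  const first = Math.ceil(eventMinute / frequency) * frequency;
  for (let t = first; t - window < eventMinute; t += frequency) {
    times.push(t);
  }
  return times;
}

// A hypothetical exception logged at minute 7, with a 5-minute frequency
// and a 15-minute window.
console.log(evaluationsSeeing(7, 5, 15));
```

Under these assumptions a single exception stays visible to three consecutive evaluations, so the first evaluation after the event already detects it and the condition stays true for roughly the length of the window.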

.github/workflows/deploy-azure-functions.yml (1)

65-70: LGTM! Dependabot skip logic is correct.

Properly prevents deployment failures for Dependabot PRs that lack access to secrets. This is a best practice for workflows requiring authenticated Azure operations.

Comment on lines +54 to +63
```bicep
name: 'FunctionExecutionFailures'
metricName: 'FunctionExecutionCount'
dimensions: [
  {
    name: 'SuccessStatus'
    operator: 'Include'
    values: [
      'False'
    ]
  }
```

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Azure Functions FunctionExecutionCount metric dimensions official documentation

💡 Result:

Short answer: The FunctionExecutionCount metric for Microsoft.Web/sites supports the "Instance" dimension. See Microsoft docs: Supported metrics (Microsoft.Web/sites) table and the provider REST metadata showing dimensions. [1][2][3]

Sources:

  • Supported metrics — Microsoft.Web/sites (Azure Monitor). [1]
  • Monitoring data reference — Azure Functions (shows FunctionExecutionCount with Instance). [2]
  • App Service provider REST metadata (List Operations) — includes FunctionExecutionCount with dimensions: Instance/ServerName. [3]

Correct the metric dimension name for FunctionExecutionCount.

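The dimension name 'SuccessStatus' is not supported for the FunctionExecutionCount metric. According to the Azure Monitor documentation cited above, FunctionExecutionCount supports the 'Instance' or 'ServerName' dimensions and offers no status-based dimension. Use one of these supported dimensions instead, keeping in mind that neither distinguishes failed from successful executions, so the failure condition itself would need a different signal.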

🤖 Prompt for AI Agents
In .azure/alert-rules.bicep around lines 54 to 63, the metric dimension name
'SuccessStatus' is invalid for FunctionExecutionCount; replace it with a
supported dimension such as 'Instance' or 'ServerName' and update the values
array to contain the appropriate instance/server identifiers (or remove values
if you intend to include all). Ensure the operator remains appropriate (e.g.,
'Include') and that the chosen dimension and values match the Function app
telemetry naming.

@moconnell moconnell changed the title from "Various YoloFunk Fixes (2025-12-23)" to "Azure monitoring alerts and logging improvements" Dec 23, 2025