Conversation
Add metrics for monitoring
- Debian buster sources were archived and not reachable anymore
Upgrade debian
WalkthroughAdds Prometheus-based monitoring: metrics, /metrics and /health endpoints, middleware for request/user metrics and basic-auth, instrumentation in Tag/Term/Vocabulary models, moves State enum to a separate module, updates docs/env, Docker image and compose healthcheck, and expands allowed hosts. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
✨ Finishing Touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (4)
evoks/tests/model/test_term.py (1)
4-5: Drop unused imports flagged by Ruff
DataformatandStatearen’t referenced anywhere in this test module.-from vocabularies.models import Vocabulary, Dataformat -from vocabularies.state import State +from vocabularies.models import Vocabularyevoks/tests/model/test_vocabulary.py (1)
5-6: Remove unusedDataformatimportRuff correctly flags this as unused; slimming imports keeps the tests tidy.
-from vocabularies.models import Vocabulary, Dataformat -from vocabularies.state import State +from vocabularies.models import Vocabulary +from vocabularies.state import State.example.env (1)
17-19: LGTM - Metrics configuration variables added correctly.The new environment variables for metrics functionality are properly placed and follow consistent naming conventions. Consider adding inline comments to document expected values for better developer experience.
-METRICS_ENABLED= -METRICS_USER= -METRICS_PASS= +METRICS_ENABLED= # Set to 'true' to enable Prometheus metrics +METRICS_USER= # Username for metrics endpoint basic auth +METRICS_PASS= # Password for metrics endpoint basic authevoks/evoks/middleware.py (1)
66-68: Document security requirements for metrics authentication.The default credentials are weak and should be changed in production. Consider documenting the security requirements.
The implementation should enforce HTTPS for basic auth and document the security considerations:
# In the basic_auth method, consider adding: if not request.is_secure(): # Log warning about insecure basic auth over HTTP passAdditionally, document in comments or README that:
- Default credentials must be changed in production
- HTTPS should be used when metrics are enabled
- Consider implementing rate limiting for the metrics endpoint
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
.example.env(1 hunks).gitignore(1 hunks)Dockerfile(1 hunks)README.md(1 hunks)docker-compose.yml(1 hunks)evoks/Tag/models.py(2 hunks)evoks/Term/models.py(3 hunks)evoks/evoks/middleware.py(1 hunks)evoks/evoks/settings.py(2 hunks)evoks/evoks/urls.py(2 hunks)evoks/evoks/views.py(2 hunks)evoks/tests/model/test_term.py(1 hunks)evoks/tests/model/test_vocabulary.py(1 hunks)evoks/vocabularies/models.py(3 hunks)evoks/vocabularies/state.py(1 hunks)requirements.txt(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (5)
evoks/tests/model/test_vocabulary.py (2)
evoks/vocabularies/models.py (2)
Vocabulary(98-650)Dataformat(29-37)evoks/vocabularies/state.py (1)
State(4-12)
evoks/tests/model/test_term.py (2)
evoks/vocabularies/models.py (2)
Vocabulary(98-650)Dataformat(29-37)evoks/vocabularies/state.py (1)
State(4-12)
evoks/evoks/urls.py (1)
evoks/evoks/views.py (2)
metrics_view(128-133)health_view(136-137)
evoks/Term/models.py (1)
evoks/vocabularies/state.py (1)
State(4-12)
evoks/vocabularies/models.py (1)
evoks/vocabularies/state.py (1)
State(4-12)
🪛 Ruff (0.12.2)
evoks/tests/model/test_vocabulary.py
5-5: vocabularies.models.Dataformat imported but unused
Remove unused import: vocabularies.models.Dataformat
(F401)
evoks/tests/model/test_term.py
4-4: vocabularies.models.Dataformat imported but unused
Remove unused import: vocabularies.models.Dataformat
(F401)
5-5: vocabularies.state.State imported but unused
Remove unused import: vocabularies.state.State
(F401)
evoks/evoks/middleware.py
63-64: Use a single if statement instead of nested if statements
Combine if statements using and
(SIM102)
🔇 Additional comments (25)
requirements.txt (1)
18-18: Pin theprometheus-clientdependency to a version rangeLeaving the dependency unpinned can lead to silent breaking changes when a new major version is released. Consider specifying an upper bound or at least a minimum compatible version.
-prometheus-client +prometheus-client>=0.19.0,<1.0Dockerfile (1)
1-1: Confirm build & runtime compatibility after switching tobookworm
python:3.9-slim-bookwormtracks the latest 3.9 patch and a newer Debian base. This may:
- introduce newer GLibC & OpenSSL versions that break binary wheels,
- require additional system packages (e.g.,
libpq5) for some Python deps,- fail the NodeSource 22.x setup script on Bookworm until officially supported.
Please run the CI image build and smoke-tests to ensure nothing regresses; if stability is critical, consider pinning the digest (
@sha256:…) instead of a moving tag..gitignore (1)
135-138: Good call adding IDE artefacts to.gitignoreIgnoring
.DS_Storeand.ideakeeps the repo clean of local/editor noise. 👍docker-compose.yml (1)
103-103: Excellent improvement to use dedicated health endpoint.Changing the healthcheck from
/loginto/healthis a best practice - health checks should be independent of authentication logic and use a dedicated endpoint designed for monitoring purposes.evoks/evoks/urls.py (1)
3-3: Clean import of new monitoring views.The specific import of
metrics_viewandhealth_viewfollows Django best practices and maintains code clarity.evoks/vocabularies/state.py (1)
1-12: Excellent refactoring to centralize State enum.Moving the
Stateenum to a dedicated module promotes code reuse and follows the DRY principle. The implementation is clean with proper DjangoTextChoicesinheritance and clear documentation.evoks/evoks/settings.py (1)
81-81: LGTM - Docker host support added.Adding
'host.docker.internal'toALLOWED_HOSTSis appropriate for Docker container networking and follows Docker best practices.evoks/evoks/views.py (3)
1-1: Good addition of monitoring imports.The new imports support the Prometheus metrics functionality effectively. The
osimport enables environment variable checks, and the prometheus_client imports provide the necessary utilities for metrics generation.Also applies to: 7-8
127-133: Well-implemented metrics endpoint with proper security considerations.The metrics view correctly uses environment variable gating and returns appropriate responses. The
@csrf_exemptdecorator is appropriate for this monitoring endpoint since it needs to be accessible to external monitoring systems without CSRF tokens.
135-137: Simple and effective health check endpoint.The health view provides a clean, minimal health check that returns HTTP 200 OK, which is exactly what's needed for container health checks and load balancer probes.
evoks/Tag/models.py (3)
3-3: Excellent Prometheus metrics integration.The Counter and Gauge imports are correctly used, and the metrics are well-named with consistent "evoks_" prefixing. This follows Prometheus naming conventions effectively.
Also applies to: 11-12
45-45: Proper counter increment placement.The counter increment is correctly placed after the tag is saved, ensuring accurate tracking of successfully created tags.
50-50: Efficient gauge implementation using lambda.The gauge setup with a lambda function provides real-time counts without storing state. This is an efficient approach for tracking current database records.
README.md (1)
108-124: Comprehensive and well-structured monitoring documentation.The new monitoring section effectively explains the feature's capabilities, privacy considerations, and security measures. The table format for environment variables is clear and follows the established documentation pattern. The explanation of IP hashing for unique access tracking demonstrates good privacy awareness.
evoks/Term/models.py (4)
3-5: Good import organization and state refactoring.The addition of Prometheus client imports and the refactoring to import
Statefromvocabularies.stateimproves code organization by centralizing the state enum definition.
7-9: Well-designed metrics for comprehensive term tracking.The three metrics provide excellent coverage: creation events, total stored terms, and publicly available terms. The naming convention is consistent with other models.
29-29: Correct counter increment placement.The counter increment is properly positioned after the term is saved, ensuring accurate tracking of successful term creation.
138-139: Efficient gauge implementations with appropriate filtering.The lambda functions provide real-time counts efficiently. The public terms gauge correctly filters for
LIVEandREVIEWstates, which aligns with the business logic for published vocabularies.evoks/vocabularies/models.py (4)
13-13: Consistent import organization and state refactoring.The Prometheus client imports and the refactoring to use the centralized
Stateenum fromvocabularies.statemaintain consistency across the codebase and improve maintainability.Also applies to: 21-22
24-26: Comprehensive vocabulary metrics implementation.The three metrics provide excellent coverage for vocabulary tracking: creation events, total stored vocabularies, and publicly available vocabularies. The naming follows the established "evoks_" prefix convention.
130-130: Proper counter increment in create method.The counter increment is correctly placed after the vocabulary is saved and relationships are established, ensuring accurate tracking of successful vocabulary creation.
653-654: Well-implemented gauge functions with correct filtering.The lambda functions provide efficient real-time counts. The public vocabularies gauge correctly filters for
LIVEandREVIEWstates, which accurately represents vocabularies published in SKOSMOS.evoks/evoks/middleware.py (3)
1-19: LGTM - Well-structured imports and metric definitions.The imports are appropriate for the monitoring functionality. The Prometheus metrics follow proper naming conventions with descriptive help text.
72-88: LGTM - Proper dual authentication handling.The implementation correctly separates metrics endpoint authentication (basic auth) from regular application authentication (session-based). The logic flow ensures
/metrics/is protected while maintaining existing behavior for other endpoints.
20-88: Well-implemented monitoring middleware with minor improvements needed.The overall implementation successfully adds Prometheus monitoring capabilities while maintaining security. The middleware follows Django patterns and provides appropriate configuration options.
Key strengths:
- Privacy-conscious IP hashing
- Configurable via environment variables
- Proper separation of metrics and application authentication
- Clean integration with existing middleware stack
The suggested improvements (memory management, exception handling, security documentation) are important for production deployment but don't affect core functionality.
| class MonitoringMiddleware(MiddlewareMixin): | ||
| stored_ips = set() | ||
|
|
||
| class LoginRequiredMiddleware: | ||
| @staticmethod | ||
| def get_client_ip(request): | ||
| """Extracts client IP address from request.""" | ||
| x_forwarded_for = request.META.get('HTTP_X_FORWARDED_FOR') | ||
| if x_forwarded_for: | ||
| ip = x_forwarded_for.split(',')[0] | ||
| else: | ||
| ip = request.META.get('REMOTE_ADDR') | ||
| return ip | ||
|
|
||
| def process_request(self, request): | ||
| metrics_enabled = os.environ.get('METRICS_ENABLED', 'false') | ||
| if metrics_enabled != 'true': | ||
| return | ||
|
|
||
| path = request.path_info.lstrip('/') | ||
| if not any(url == path for url in MONITORING_EXEMPT_URLS): | ||
| users_total_counter.inc() | ||
| ip = self.get_client_ip(request) | ||
|
|
||
| if ip: | ||
| hashed_ip = hashlib.sha256(ip.encode()).hexdigest() | ||
| self.stored_ips.add(hashed_ip) | ||
| users_unique_gauge.set(len(self.stored_ips)) | ||
|
|
There was a problem hiding this comment.
🛠️ Refactor suggestion
Address potential memory growth in stored_ips.
The implementation correctly hashes IPs for privacy and handles proxy headers. However, the class-level stored_ips set will grow unbounded over time, potentially causing memory issues in long-running applications.
Consider implementing periodic cleanup or using a more memory-efficient approach:
class MonitoringMiddleware(MiddlewareMixin):
- stored_ips = set()
+ stored_ips = set()
+ _last_cleanup = 0
+ CLEANUP_INTERVAL = 86400 # 24 hours in seconds
def process_request(self, request):
metrics_enabled = os.environ.get('METRICS_ENABLED', 'false')
if metrics_enabled != 'true':
return
+ # Periodic cleanup to prevent memory growth
+ import time
+ current_time = time.time()
+ if current_time - self._last_cleanup > self.CLEANUP_INTERVAL:
+ self.stored_ips.clear()
+ self._last_cleanup = current_time
path = request.path_info.lstrip('/')
if not any(url == path for url in MONITORING_EXEMPT_URLS):
users_total_counter.inc()
ip = self.get_client_ip(request)
if ip:
hashed_ip = hashlib.sha256(ip.encode()).hexdigest()
self.stored_ips.add(hashed_ip)
users_unique_gauge.set(len(self.stored_ips))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| class MonitoringMiddleware(MiddlewareMixin): | |
| stored_ips = set() | |
| class LoginRequiredMiddleware: | |
| @staticmethod | |
| def get_client_ip(request): | |
| """Extracts client IP address from request.""" | |
| x_forwarded_for = request.META.get('HTTP_X_FORWARDED_FOR') | |
| if x_forwarded_for: | |
| ip = x_forwarded_for.split(',')[0] | |
| else: | |
| ip = request.META.get('REMOTE_ADDR') | |
| return ip | |
| def process_request(self, request): | |
| metrics_enabled = os.environ.get('METRICS_ENABLED', 'false') | |
| if metrics_enabled != 'true': | |
| return | |
| path = request.path_info.lstrip('/') | |
| if not any(url == path for url in MONITORING_EXEMPT_URLS): | |
| users_total_counter.inc() | |
| ip = self.get_client_ip(request) | |
| if ip: | |
| hashed_ip = hashlib.sha256(ip.encode()).hexdigest() | |
| self.stored_ips.add(hashed_ip) | |
| users_unique_gauge.set(len(self.stored_ips)) | |
| class MonitoringMiddleware(MiddlewareMixin): | |
| stored_ips = set() | |
| _last_cleanup = 0 | |
| CLEANUP_INTERVAL = 86400 # 24 hours in seconds | |
| @staticmethod | |
| def get_client_ip(request): | |
| """Extracts client IP address from request.""" | |
| x_forwarded_for = request.META.get('HTTP_X_FORWARDED_FOR') | |
| if x_forwarded_for: | |
| ip = x_forwarded_for.split(',')[0] | |
| else: | |
| ip = request.META.get('REMOTE_ADDR') | |
| return ip | |
| def process_request(self, request): | |
| metrics_enabled = os.environ.get('METRICS_ENABLED', 'false') | |
| if metrics_enabled != 'true': | |
| return | |
| # Periodic cleanup to prevent memory growth | |
| import time | |
| current_time = time.time() | |
| if current_time - self._last_cleanup > self.CLEANUP_INTERVAL: | |
| self.stored_ips.clear() | |
| self._last_cleanup = current_time | |
| path = request.path_info.lstrip('/') | |
| if not any(url == path for url in MONITORING_EXEMPT_URLS): | |
| users_total_counter.inc() | |
| ip = self.get_client_ip(request) | |
| if ip: | |
| hashed_ip = hashlib.sha256(ip.encode()).hexdigest() | |
| self.stored_ips.add(hashed_ip) | |
| users_unique_gauge.set(len(self.stored_ips)) |
🤖 Prompt for AI Agents
In evoks/evoks/middleware.py between lines 20 and 47, the class-level stored_ips
set grows indefinitely, risking memory bloat in long-running processes. To fix
this, implement a mechanism to periodically clear or limit the size of
stored_ips, such as resetting it at intervals or using a data structure with a
maximum capacity. Alternatively, consider persisting unique IP counts externally
or using a time-based cache to remove stale entries, ensuring memory usage
remains controlled.
| def basic_auth(request: HttpRequest): | ||
| if 'HTTP_AUTHORIZATION' in request.META: | ||
| auth = request.META['HTTP_AUTHORIZATION'].split() | ||
| if len(auth) == 2: | ||
| if auth[0].lower() == "basic": | ||
| uname, passwd = base64.b64decode(auth[1]).split(b':') | ||
| req_uname = os.environ.get('METRICS_USER', 'metrics') | ||
| req_passwd = os.environ.get('METRICS_PASS', 'metricsPass') | ||
| if uname.decode() == req_uname and passwd.decode() == req_passwd: | ||
| return True | ||
| return False | ||
|
|
||
| @staticmethod |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Add exception handling and simplify nested conditions.
The basic auth implementation needs error handling for malformed headers and can be simplified per static analysis suggestion.
@staticmethod
def basic_auth(request: HttpRequest):
if 'HTTP_AUTHORIZATION' in request.META:
auth = request.META['HTTP_AUTHORIZATION'].split()
- if len(auth) == 2:
- if auth[0].lower() == "basic":
+ if len(auth) == 2 and auth[0].lower() == "basic":
+ try:
uname, passwd = base64.b64decode(auth[1]).split(b':')
req_uname = os.environ.get('METRICS_USER', 'metrics')
req_passwd = os.environ.get('METRICS_PASS', 'metricsPass')
if uname.decode() == req_uname and passwd.decode() == req_passwd:
return True
+ except (ValueError, UnicodeDecodeError):
+ # Malformed base64 or invalid UTF-8
+ pass
return False📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| @staticmethod | |
| def basic_auth(request: HttpRequest): | |
| if 'HTTP_AUTHORIZATION' in request.META: | |
| auth = request.META['HTTP_AUTHORIZATION'].split() | |
| if len(auth) == 2: | |
| if auth[0].lower() == "basic": | |
| uname, passwd = base64.b64decode(auth[1]).split(b':') | |
| req_uname = os.environ.get('METRICS_USER', 'metrics') | |
| req_passwd = os.environ.get('METRICS_PASS', 'metricsPass') | |
| if uname.decode() == req_uname and passwd.decode() == req_passwd: | |
| return True | |
| return False | |
| @staticmethod | |
| def basic_auth(request: HttpRequest): | |
| if 'HTTP_AUTHORIZATION' in request.META: | |
| auth = request.META['HTTP_AUTHORIZATION'].split() | |
| if len(auth) == 2 and auth[0].lower() == "basic": | |
| try: | |
| uname, passwd = base64.b64decode(auth[1]).split(b':') | |
| req_uname = os.environ.get('METRICS_USER', 'metrics') | |
| req_passwd = os.environ.get('METRICS_PASS', 'metricsPass') | |
| if uname.decode() == req_uname and passwd.decode() == req_passwd: | |
| return True | |
| except (ValueError, UnicodeDecodeError): | |
| # Malformed base64 or invalid UTF-8 | |
| pass | |
| return False |
🧰 Tools
🪛 Ruff (0.12.2)
63-64: Use a single if statement instead of nested if statements
Combine if statements using and
(SIM102)
🤖 Prompt for AI Agents
In evoks/evoks/middleware.py around lines 59 to 71, the basic_auth method lacks
exception handling for malformed or invalid authorization headers, which can
cause runtime errors. Refactor the method to simplify nested if conditions by
combining checks where possible, and add try-except blocks around the base64
decoding and splitting operations to catch and handle errors gracefully,
returning False if any exception occurs.
| 'django.middleware.clickjacking.XFrameOptionsMiddleware', | ||
| 'evoks.middleware.LoginRequiredMiddleware', | ||
| 'evoks.middleware.PartOfVocabularyMiddleware', | ||
| 'evoks.middleware.MonitoringMiddleware' |
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
Verify MonitoringMiddleware implementation.
The middleware is correctly positioned at the end of the middleware stack, which is appropriate for monitoring functionality.
Verify that the MonitoringMiddleware is properly implemented:
🏁 Script executed:
#!/bin/bash
# Description: Verify MonitoringMiddleware exists and is properly implemented
# Expected: Find the middleware class with proper Django middleware methods
ast-grep --pattern 'class MonitoringMiddleware:
$$$'
rg -A 15 "class MonitoringMiddleware" evoks/evoks/middleware.pyLength of output: 667
🏁 Script executed:
#!/bin/bash
# Check for required imports and response hook in MonitoringMiddleware
rg "^import os" evoks/evoks/middleware.py
rg -A 5 "def process_response" evoks/evoks/middleware.pyLength of output: 112
🏁 Script executed:
#!/bin/bash
# Check for MiddlewareMixin import in MonitoringMiddleware
rg -n "^from django.utils.deprecation import MiddlewareMixin" evoks/evoks/middleware.pyLength of output: 144
Add a process_response hook to MonitoringMiddleware
The MonitoringMiddleware class in evoks/evoks/middleware.py extends MiddlewareMixin and defines process_request, but there’s no process_response method. To capture end-to-end metrics (e.g., response status, latency), add a response hook:
• File: evoks/evoks/middleware.py
• Class: MonitoringMiddleware
Suggested diff:
class MonitoringMiddleware(MiddlewareMixin):
stored_ips = set()
@staticmethod
def get_client_ip(request):
...
def process_request(self, request):
...
+
+ def process_response(self, request, response):
+ """Record metrics after view execution, then return the response."""
+ # e.g., track response.status_code, latency, etc.
+ return response📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| 'evoks.middleware.MonitoringMiddleware' | |
| class MonitoringMiddleware(MiddlewareMixin): | |
| stored_ips = set() | |
| @staticmethod | |
| def get_client_ip(request): | |
| ... | |
| def process_request(self, request): | |
| ... | |
| def process_response(self, request, response): | |
| """Record metrics after view execution, then return the response.""" | |
| # e.g., track response.status_code, latency, etc. | |
| return response |
🤖 Prompt for AI Agents
In evoks/evoks/middleware.py at the MonitoringMiddleware class, add a
process_response method to capture response metrics such as status and latency.
Implement process_response(self, request, response) to record these details and
return the response object. This will complement the existing process_request
method and enable end-to-end monitoring.
| path('metrics/', metrics_view), | ||
| path('health/', health_view), |
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
Verify security for metrics endpoint.
The URL patterns are correctly implemented. However, ensure the metrics endpoint is properly secured since it may expose sensitive application data.
Based on the relevant code snippets, the metrics_view checks METRICS_ENABLED environment variable but verify that authentication is properly implemented in the middleware:
🏁 Script executed:
#!/bin/bash
# Description: Verify authentication implementation for metrics endpoint
# Expected: Find authentication logic in middleware
rg -A 10 -B 5 "metrics.*auth|auth.*metrics" evoks/evoks/middleware.py
ast-grep --pattern 'def process_request($_, $_):
$$$
metrics
$$$'Length of output: 145
🏁 Script executed:
#!/bin/bash
# Locate the metrics_view implementation to inspect for authentication or decorators
rg -n -C5 "def metrics_view" -g "*.py" .Length of output: 714
Secure the metrics endpoint
The metrics_view in evoks/evoks/views.py is currently marked @csrf_exempt and has no authentication or access controls, exposing potentially sensitive metrics to any requester. Please lock this down.
• File: evoks/evoks/views.py (around lines 127–133)
• Issue: metrics_view is public and CSRF-exempt
• Recommendation: add an auth decorator or internal-only guard, for example:
- @csrf_exempt
- def metrics_view(request):
+ @login_required # or @staff_member_required
+ @csrf_exempt
+ def metrics_view(request):
metrics_enabled = os.environ.get('METRICS_ENABLED', 'false')
…• Alternatively, enforce IP whitelist or token auth within the view.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In evoks/evoks/urls.py at lines 63-64, the metrics endpoint is currently exposed
without authentication or CSRF protection. To fix this, update
evoks/evoks/views.py around lines 127-133 by removing the @csrf_exempt decorator
from metrics_view and adding an authentication decorator or implement an
internal-only access guard. Alternatively, add logic inside metrics_view to
enforce an IP whitelist or require a token for access to secure the endpoint.
Merge typo fixes into dev
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #278 +/- ##
=======================================
Coverage 92.35% 92.35%
=======================================
Files 50 50
Lines 1845 1845
=======================================
Hits 1704 1704
Misses 141 141 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
README.md (3)
110-112: Tighten wording and fix minor readability nits.
Streamline phrasing and avoid the hard line break.-You can optionally enable monitoring to be able to collect some insights on the number of stored vocabularies and terms, as well as the number of unique and total accesses. These can be collected with [Prometheus](https://prometheus.io/) -and used with your monitoring setup. +You can optionally enable monitoring to collect insights on stored vocabularies and terms, as well as unique and total accesses. These can be scraped with [Prometheus](https://prometheus.io/) and integrated into your monitoring setup.
116-118: Emphasize safe defaults and provide a Prometheus scrape example.
Minor doc hardening; add example config for quicker adoption.The following environment variables are used to configure monitoring. Note that the monitoring endpoint is disabled by default and must be activated by setting `METRICS_ENABLED=true` in your environment variables. + +Example Prometheus scrape configuration: + +```yaml +scrape_configs: + - job_name: evoks + metrics_path: /metrics/ + basic_auth: + username: ${METRICS_USER} + password: ${METRICS_PASS} + static_configs: + - targets: ['localhost:9000'] # adjust to your host:port +``` + +Important: When enabling metrics, change the default credentials and restrict exposure to trusted networks.
119-124: Fix Markdown table formatting (remove empty column) and small grammar.
Current trailing empty column can render oddly in some viewers.-| VAR NAME | Default value | Description | Change recommended | | -|-------------------|--------------------|-----------------------------------------------------------|--------------------|---| -| METRICS_ENABLED | false | Set to `true` to enable the metrics endpoint | only if needed | | -| METRICS_USER | metrics | Username for metrics endpoint (HTTP Basic Authentication) | yes | | -| METRICS_PASS | metricsPass | Password for metrics endpoint (HTTP Basic Authentication) | yes | | +| VAR NAME | Default value | Description | Change recommended | +|-----------------|---------------|-----------------------------------------------------------|--------------------| +| METRICS_ENABLED | false | Set to `true` to enable the metrics endpoint | only if needed | +| METRICS_USER | metrics | Username for metrics endpoint (HTTP Basic Authentication) | yes | +| METRICS_PASS | metricsPass | Password for metrics endpoint (HTTP Basic Authentication) | yes |
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
README.md(1 hunks)
🧰 Additional context used
🪛 LanguageTool
README.md
[grammar] ~110-~110: There might be a mistake here.
Context: ...ith Prometheus and used with your monitoring setup. To...
(QB_NEW_EN)
[grammar] ~116-~116: There might be a mistake here.
Context: ...nitoring endpoint is disabled by default and must be activated by setting `METRIC...
(QB_NEW_EN)
[grammar] ~119-~119: There might be a mistake here.
Context: ... | Change recommended | | |-------------------|-------------------...
(QB_NEW_EN)
[grammar] ~120-~120: There might be a mistake here.
Context: ...--------------|--------------------|---| | METRICS_ENABLED | false ...
(QB_NEW_EN)
[grammar] ~121-~121: There might be a mistake here.
Context: ... | only if needed | | | METRICS_USER | metrics ...
(QB_NEW_EN)
[grammar] ~122-~122: There might be a mistake here.
Context: ...thentication) | yes | | | METRICS_PASS | metricsPass ...
(QB_NEW_EN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test-all
🔇 Additional comments (1)
README.md (1)
108-108: Good addition: clear opt-in monitoring section headline.
Heading is concise and discoverable.
| To detect unique accesses, a hashed version of the users IP will be stored in memory (non-persistent) if monitoring is explicitly activated. Apart from the hashed IP, no sensitive user information is recorded. You can collect metrics yourself from the `/metrics` endpoint. Metrics will never be *sent* anywhere. | ||
| The metrics endpoint is secured with HTTP Basic Authentication, the credentials can be set using environment variables (see the table **below**). | ||
|
|
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
Clarify IP “hashing” approach and require HTTPS for Basic Auth.
Unkeyed hashes of IPs are trivially reversible by dictionary. Recommend keyed HMAC and document HTTPS.
-To detect unique accesses, a hashed version of the users IP will be stored in memory (non-persistent) if monitoring is explicitly activated. Apart from the hashed IP, no sensitive user information is recorded. You can collect metrics yourself from the `/metrics` endpoint. Metrics will never be *sent* anywhere.
-The metrics endpoint is secured with HTTP Basic Authentication, the credentials can be set using environment variables (see the table **below**).
+To detect unique accesses, EVOKS stores in memory (non‑persistent) a keyed hash of the user's IP only when monitoring is enabled. Apart from this hash, no user information is recorded. You can collect metrics from the `/metrics/` endpoint. Metrics are never sent anywhere.
+The metrics endpoint is secured with HTTP Basic Authentication (credentials via environment variables; see the table below). Expose this endpoint only over HTTPS (e.g., behind the provided reverse proxy).To verify the actual implementation uses a keyed HMAC (e.g., HMAC‑SHA256 with a secret), please confirm in evoks/evoks/middleware.py. If not, I can provide a patch.
🏁 Script executed:
#!/bin/bash
# Verify metrics endpoint path and hashing approach
rg -nC2 -S "/metrics" evoks/evoks/urls.py
rg -nC3 -S "HMAC|SHA|hash" evoks/evoks/middleware.pyLength of output: 414
Use HMAC-SHA256 with a secret for IP hashing and mandate HTTPS for the metrics endpoint. Replace the unkeyed hashlib.sha256(ip.encode()).hexdigest() in evoks/evoks/middleware.py (line 44) with a keyed HMAC (e.g. hmac.new(os.environ["METRICS_HASH_SECRET"].encode(), ip.encode(), hashlib.sha256).hexdigest()), and update README.md to document the METRICS_HASH_SECRET env var and explicitly require that /metrics (HTTP Basic Auth) only be exposed over HTTPS.
🤖 Prompt for AI Agents
In README.md around lines 113–115, update the documentation to require that the
/metrics endpoint be exposed only over HTTPS and add a new environment variable
METRICS_HASH_SECRET (describe its purpose and that it must be set) so operators
know a secret is required for IP hashing; in evoks/evoks/middleware.py at line
44 replace the unkeyed SHA256 call with a keyed HMAC-SHA256 using
METRICS_HASH_SECRET (and add clear behavior when the env var is missing, e.g.,
fail startup or disable monitoring with a logged error), and note in the README
that METRICS_HASH_SECRET must be present and the metrics endpoint must be served
via HTTPS with HTTP Basic Auth.
Add possibility for metrics
Upgrade debian version of python image
Summary by CodeRabbit
New Features
Bug Fixes
Chores
.ideato .gitignore; added prometheus-client dependency; allowed additional host for containers.Refactor