Conversation

@snickerjp
Member

What type of PR is this?

  • Feature

Description

Adds the ability to exclude datasource types that don't support (or don't need) schema refresh from the periodic schema refresh process.

Background

Some datasource types (results, python, etc.) don't implement the get_schema method, causing NotSupported exceptions during schema refresh, which generates error logs and metrics.

Error logs before fix:

[WARNING] Failed refreshing schema for the data source: Query Results
Traceback (most recent call last):
  File "/app/redash/tasks/queries/maintenance.py", line 166, in refresh_schema
    ds.get_schema(refresh=True)
  File "/app/redash/query_runner/__init__.py", line 232, in get_schema
    raise NotSupported()
redash.query_runner.NotSupported
[INFO] task=refresh_schema state=failed ds_id=1 runtime=0.00

[WARNING] Failed refreshing schema for the data source: python
Traceback (most recent call last):
  ...
redash.query_runner.NotSupported
[INFO] task=refresh_schema state=failed ds_id=2 runtime=0.00

These datasources have no concept of a schema, so they should be excluded from the start.

Changes

Flow Diagram

Before Fix:

flowchart TD
    Start[refresh_schemas start] --> Loop{Each datasource}
    Loop --> Paused{paused?}
    Paused -->|Yes| SkipPaused[Skip: paused]
    Paused -->|No| Blacklist{blacklist?}
    Blacklist -->|Yes| SkipBlacklist[Skip: blacklist]
    Blacklist -->|No| OrgDisabled{org.is_disabled?}
    OrgDisabled -->|Yes| SkipOrg[Skip: org_disabled]
    OrgDisabled -->|No| Execute[Execute refresh_schema]
    Execute --> Error{NotSupported exception}
    Error -->|results/python| ErrorLog[❌ Error logs]
    Error -->|pg/mysql etc| Success[✅ Success]
    SkipPaused --> Loop
    SkipBlacklist --> Loop
    SkipOrg --> Loop
    ErrorLog --> Loop
    Success --> Loop
    Loop --> End[Complete]

After Fix:

flowchart TD
    Start[refresh_schemas start] --> Loop{Each datasource}
    Loop --> Paused{paused?}
    Paused -->|Yes| SkipPaused[Skip: paused]
    Paused -->|No| Blacklist{blacklist?}
    Blacklist -->|Yes| SkipBlacklist[Skip: blacklist]
    Blacklist -->|No| TypeExcluded{type in EXCLUDED_TYPES?}
    TypeExcluded -->|Yes| SkipType[✅ Skip: type_excluded]
    TypeExcluded -->|No| OrgDisabled{org.is_disabled?}
    OrgDisabled -->|Yes| SkipOrg[Skip: org_disabled]
    OrgDisabled -->|No| Execute[Execute refresh_schema]
    Execute --> Success[✅ Success]
    SkipPaused --> Loop
    SkipBlacklist --> Loop
    SkipType --> Loop
    SkipOrg --> Loop
    Success --> Loop
    Loop --> End[Complete]

Implementation Details

  1. New Setting

    • SCHEMAS_REFRESH_EXCLUDED_TYPES: Set of datasource types to exclude
    • Environment variable: REDASH_SCHEMAS_REFRESH_EXCLUDED_TYPES
    • Default value: "results,python" (two types that definitely cause errors)
  2. Schema Refresh Logic Update

    • Added type exclusion check in refresh_schemas() function
    • Excluded types are logged with reason=type_excluded
    • Maintains consistency with existing exclusion mechanisms (blacklist, paused, org.is_disabled)
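For illustration, the exclusion check described above can be sketched as follows. This is a simplified, standalone sketch, not the actual Redash code: the real logic lives in refresh_schemas() in redash/tasks/queries/maintenance.py and the settings module, and the DataSource/logging wiring is omitted here. The helper name should_refresh_schema is invented for this example.

```python
import os

# Mirrors the new setting: parsed from the environment variable, with the
# PR's default of the two types known to raise NotSupported.
SCHEMAS_REFRESH_EXCLUDED_TYPES = {
    t.strip()
    for t in os.environ.get(
        "REDASH_SCHEMAS_REFRESH_EXCLUDED_TYPES", "results,python"
    ).split(",")
    if t.strip()
}


def should_refresh_schema(ds_type, paused=False, org_disabled=False):
    """Return (True, None) if the datasource should be refreshed,
    otherwise (False, reason) matching the skip reasons in the flow chart."""
    if paused:
        return False, "paused"
    if ds_type in SCHEMAS_REFRESH_EXCLUDED_TYPES:
        return False, "type_excluded"
    if org_disabled:
        return False, "org_disabled"
    return True, None
```

Because the check is a set membership test on the type string, it runs before any query runner is instantiated.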

Benefits

  • Reduces unnecessary error logs and metrics
  • Prevents wasteful endpoint access
  • Improves schema refresh process efficiency

Usage

Default Behavior

Without setting the environment variable, results and python are excluded by default.

Exclude Additional Types (.env file)

REDASH_SCHEMAS_REFRESH_EXCLUDED_TYPES=results,python,json,url

How is this tested?

  • Unit tests (pytest)
  • Manually

Unit Tests

New test:

  • test_skips_excluded_datasource_types: Verifies excluded types are correctly skipped

Existing test compatibility:

  • test_calls_refresh_of_all_data_sources: PASSED
  • test_skips_paused_data_sources: PASSED

Test Execution Results:

3 passed, 21 warnings in 9.07s
✅ test_calls_refresh_of_all_data_sources PASSED
✅ test_skips_excluded_datasource_types PASSED
✅ test_skips_paused_data_sources PASSED
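For illustration, the shape of the new test might look roughly like the sketch below. The real test uses Redash's test fixtures and task machinery, which are not shown; refresh_schemas_simplified is an invented stand-in for the actual refresh_schemas() function.

```python
from unittest.mock import MagicMock


def refresh_schemas_simplified(data_sources, excluded_types, refresh_schema):
    """Simplified stand-in for redash.tasks.queries.maintenance.refresh_schemas:
    enqueue a refresh for every datasource whose type is not excluded."""
    for ds in data_sources:
        if ds.type in excluded_types:
            continue  # logged with reason=type_excluded in the real code
        refresh_schema(ds.id)


def test_skips_excluded_datasource_types():
    excluded = {"results", "python"}
    sources = [
        MagicMock(id=1, type="results"),
        MagicMock(id=2, type="python"),
        MagicMock(id=3, type="pg"),
    ]
    refresh = MagicMock()
    refresh_schemas_simplified(sources, excluded, refresh)
    # Only the pg datasource should be refreshed.
    refresh.assert_called_once_with(3)


test_skips_excluded_datasource_types()
```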

Manual Testing (Verification)

Test Steps:

  1. Create results and python datasources
  2. Execute refresh_schemas()
  3. Check logs

Execution Command:

docker compose exec worker python -c "
from redash import create_app
from redash.tasks.queries.maintenance import refresh_schemas
from redash import models

app = create_app()
with app.app_context():
    print('=== Data sources ===')
    for ds in models.DataSource.query:
        print(f'ID={ds.id} Name={ds.name} Type={ds.type}')
    print()
    print('=== Running refresh_schemas ===')
    refresh_schemas()
"

Execution Logs:

=== Data sources ===
ID=1 Name=Query Results Type=results
ID=2 Name=python Type=python
ID=3 Name=redash Type=pg

=== Running refresh_schemas ===
[INFO] task=refresh_schemas state=start
[INFO] task=refresh_schema state=skip ds_id=1 reason=type_excluded
[INFO] task=refresh_schema state=skip ds_id=2 reason=type_excluded
[INFO] task=refresh_schemas state=finish total_runtime=0.01

Verification Results:

  • results and python correctly skipped (no errors)
  • pg (PostgreSQL) is refreshed normally (its absence from the skip logs is expected)
  • ✅ Error logs and stack traces completely eliminated

Related Tickets & Documents

Fixes #7571

Mobile & Desktop Screenshots/Recordings (if there are UI changes)

N/A (backend-only changes)


Additional Information

Implementation Approach

Initially I attempted to automatically detect whether a runner implements the get_schema method, but abandoned that approach because:

  • hasattr() cannot detect it, since get_schema is defined on BaseQueryRunner
  • Checking whether the method is overridden is complex and hard to maintain
  • Catching the exception at call time has a performance impact
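The hasattr() limitation noted above can be demonstrated in a few lines. The class names below are simplified stand-ins for Redash's real query runners, not the actual implementations:

```python
class NotSupported(Exception):
    pass


class BaseQueryRunner:
    # Base class defines get_schema, so every subclass "has" it.
    def get_schema(self, get_stats=False):
        raise NotSupported()


class Results(BaseQueryRunner):
    pass  # does not override get_schema


class PostgreSQL(BaseQueryRunner):
    def get_schema(self, get_stats=False):
        return [{"name": "users", "columns": ["id"]}]


# hasattr() is True for both, so it cannot serve as a capability check:
assert hasattr(Results(), "get_schema")
assert hasattr(PostgreSQL(), "get_schema")

# Comparing against the base class method does detect the override,
# but it is exactly the kind of fragile introspection described above:
assert Results.get_schema is BaseQueryRunner.get_schema
assert PostgreSQL.get_schema is not BaseQueryRunner.get_schema
```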

Therefore, I adopted an explicit type-name approach, which is:

  • Simple and easy to understand
  • Works reliably
  • Flexible control via environment variables
  • Consistent with other Redash settings (like ENABLED_QUERY_RUNNERS)

Datasource Types That Don't Need Schema Refresh

The following types don't implement the get_schema method and are candidates for exclusion:

  • results - Query Results (references other query results)
  • python - Python execution
  • And potentially many others

Backward Compatibility

  • The default value automatically excludes results and python in existing environments
  • You can revert to the previous behavior (attempt all datasources) by setting the environment variable to an empty string
  • Does not affect existing exclusion mechanisms (blacklist, paused, org.is_disabled)

- Add SCHEMAS_REFRESH_EXCLUDED_TYPES setting with default 'results,python'
- Add type-based exclusion check in refresh_schemas()
- Prevents unnecessary errors for datasources without schema support
@yoshiokatsuneo
Contributor

yoshiokatsuneo commented Nov 14, 2025

Thank you for your PR with the detailed description!

Just a question.

Exception catching approach has performance impact

May I ask what kind of performance impact you are worried about?
I just thought there is also the option of ignoring the NotSupported exception.

@snickerjp
Member Author

snickerjp commented Nov 14, 2025

Thank you for the question!

You're right - the performance impact of exception catching would be minimal in this case. The concern was more about the implementation approach rather than actual performance.

The exception catching approach would look like:

try:
    ds.query_runner.get_schema(get_stats=False)
    refresh_schema.delay(ds.id)
except NotSupported:
    logger.info("skip: no schema support")

However, this approach has a conceptual issue: we'd be calling get_schema() just to check if it's supported, which feels wrong because:

  1. get_schema() is meant to actually retrieve schema, not to check capability
  2. Even with get_stats=False, it might still initialize connections or perform setup
  3. It's semantically unclear - the code looks like it's trying to get schema, but it's actually just checking support

Additionally, when there are many datasources:

  • Exception catching would call get_schema() for every datasource during refresh_schemas() execution (every 30 minutes by default)
  • Some query runners might initialize connections when accessing the query_runner property
  • Python exception handling has overhead (stack unwinding, traceback creation)

With type-based exclusion:

  • Skip check happens before any query runner instantiation
  • O(1) set lookup: ds.type in EXCLUDED_TYPES
  • No method calls, no exceptions, no overhead

That said, the performance difference is likely negligible in practice. The main benefit is code clarity and avoiding unnecessary method calls.

If the maintainers prefer the exception catching approach for better automatic detection, I'm happy to change it. What do you think?

@yoshiokatsuneo
Contributor

yoshiokatsuneo commented Nov 15, 2025

@snickerjp

Thank you very much for your detailed explanation!

However, this approach has a conceptual issue: we'd be calling get_schema() just to check if it's supported, which feels wrong because:
get_schema() is meant to actually retrieve schema, not to check capability

Yes, but my feeling is that if we just ignore the exception, it is not "checking" but just "ignoring".

Even with get_stats=False, it might still initialize connections or perform setup

I think that, at least for the query_results / python data sources you described, calling get_schema() does not initialize any connections.

It's semantically unclear - the code looks like it's trying to get schema, but it's actually just checking support

My feeling is that at the point where we ignore the error, the original issue is already solved.

Exception catching would call get_schema() for every datasource during refresh_schemas() execution (every 30 minutes by default)

Yes, it might be meaningless. (Although the performance impact will be minimal.)

Python exception handling has overhead (stack unwinding, traceback creation)

I think the impact is very small. (Probably less than 0.1 sec?)

What I feel is that attributes of each data source (e.g., whether schema listing is supported) are better encapsulated inside each data source class rather than defined in global variables, if possible.
If we need to detect whether each data source supports get_schema, we could add a method (e.g., "is_get_method_supported"?) to each data source class. (Although that would make the change bigger, and I'm not sure whether it is worth doing when the main issue (error logging) is already solved.)
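For illustration, the encapsulation idea suggested here could look roughly like the sketch below. The class-level flag schema_refresh_enabled is a hypothetical name invented for this example, not an existing Redash attribute, and the classes are simplified stand-ins for real query runners:

```python
class NotSupported(Exception):
    pass


class BaseQueryRunner:
    # Hypothetical capability flag, encapsulated in the class as suggested.
    schema_refresh_enabled = True

    def get_schema(self, get_stats=False):
        raise NotSupported()


class QueryResults(BaseQueryRunner):
    schema_refresh_enabled = False  # no schema concept


class PostgreSQL(BaseQueryRunner):
    def get_schema(self, get_stats=False):
        return [{"name": "users", "columns": ["id", "email"]}]


def runners_to_refresh(runners):
    # The refresh loop would then filter on the flag instead of on a
    # globally configured set of type names.
    return [r for r in runners if r.schema_refresh_enabled]
```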

What do you think?


Development

Successfully merging this pull request may close these issues.

Schema Refresh Fails for Certain Datasource Types
