Address performance issues with _check_disallowed_items overhead #176

Merged
danthedeckie merged 1 commit into danthedeckie:main from meltano:1.0.5-perf
Mar 16, 2026
Conversation

@edgarrmondragon
Contributor

Description

What is this PR doing?

Downstream in meltano/sdk#3565 (comment), we noticed a performance regression in the 1.0.5 release of simpleeval.

The root problem seems to be that

  1. there are too many redundant instance checks for safe primitive types
  2. is_hashable has try/except overhead.

The fix for 1 is to implement a fast path for simple types. The fix for 2 is to replace is_hashable with callable, which achieves the same purpose in this context.
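As a rough illustration of the two changes, here is a minimal sketch. It is not simpleeval's actual code: `SAFE_PRIMITIVES`, `DISALLOWED`, `check_slow` and `check_fast` are hypothetical names, and the `is_hashable` helper is only assumed to have the try/except shape mentioned above.

```python
# Hypothetical stand-ins for illustration; not simpleeval's real internals.
SAFE_PRIMITIVES = frozenset({int, float, str, bool, type(None)})
DISALLOWED = frozenset({eval, exec, open})  # assumed deny-list of functions


def is_hashable(item):
    """Hashability probe written in the try/except style described above."""
    try:
        hash(item)
    except TypeError:
        return False
    return True


def check_slow(item):
    """No fast path: every value, even a plain int, walks the generic checks."""
    if isinstance(item, (list, tuple, set, frozenset)):
        for element in item:
            check_slow(element)
    elif isinstance(item, dict):
        for value in item.values():
            check_slow(value)
    elif is_hashable(item) and item in DISALLOWED:
        raise ValueError(f"disallowed item: {item!r}")


def check_fast(item):
    """Return early for exact primitive types, and use callable() as the guard."""
    if type(item) in SAFE_PRIMITIVES:
        return  # fast path: primitives can never be a disallowed function
    if isinstance(item, (list, tuple, set, frozenset)):
        for element in item:
            check_fast(element)
    elif isinstance(item, dict):
        for value in item.values():
            check_fast(value)
    # Assumes callables are hashable, which holds for functions and classes.
    elif callable(item) and item in DISALLOWED:
        raise ValueError(f"disallowed item: {item!r}")


check_fast([1, 2.0, "three", {"k": None}])  # passes silently
```

In this sketch, `callable` can stand in for `is_hashable` because every entry in the deny-list is a function, so a non-callable value can never match it. The built-in check also avoids the try/except probe on every value.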

Benchmark script


Before

============================================================
_check_disallowed_items  (1,000,000 calls each)
============================================================
Input type               Total    per call
---------------------------------------------
int                    0.157s     156.9ns
float                  0.173s     173.0ns
str                    0.171s     171.5ns
bool                   0.166s     166.4ns
None                   0.170s     169.9ns
list[int]              1.107s    1107.0ns
nested list            2.125s    2125.4ns
dict                   0.851s     850.9ns
callable               0.160s     159.7ns

============================================================
Full expression eval  (50,000 iterations each)
============================================================
Expression                    Total    per call
--------------------------------------------------
integer arithmetic          0.135s      2.69µs
float arithmetic            0.119s      2.38µs
string concat               0.097s      1.95µs
boolean logic               0.082s      1.64µs
comparison chain            0.161s      3.22µs
nested arithmetic           0.166s      3.32µs
ternary expression          0.050s      1.00µs
string methods              0.138s      2.77µs
list literal                0.195s      3.90µs
nested list                 0.426s      8.52µs
dict literal                0.190s      3.81µs

After

============================================================
_check_disallowed_items  (1,000,000 calls each)
============================================================
Input type               Total    per call
---------------------------------------------
int                    0.041s      41.4ns
float                  0.040s      39.5ns
str                    0.040s      39.6ns
bool                   0.039s      38.6ns
None                   0.039s      38.9ns
list[int]              0.264s     264.2ns
nested list            0.614s     614.3ns
dict                   0.270s     269.9ns
callable               0.154s     153.7ns

============================================================
Full expression eval  (50,000 iterations each)
============================================================
Expression                    Total    per call
--------------------------------------------------
integer arithmetic          0.067s      1.34µs
float arithmetic            0.056s      1.13µs
string concat               0.055s      1.10µs
boolean logic               0.041s      0.82µs
comparison chain            0.087s      1.73µs
nested arithmetic           0.087s      1.74µs
ternary expression          0.024s      0.48µs
string methods              0.102s      2.04µs
list literal                0.076s      1.52µs
nested list                 0.165s      3.30µs
dict literal                0.086s      1.72µs
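To compare the two runs more directly, the per-call figures can be turned into speedup ratios. The snippet below does this for a handful of microbenchmark rows, with values copied from the Before and After output above. The dictionary names are illustrative only.

```python
# Per-call timings in nanoseconds, copied from the Before/After tables above.
before_ns = {
    "int": 156.9,
    "list[int]": 1107.0,
    "nested list": 2125.4,
    "dict": 850.9,
    "callable": 159.7,
}
after_ns = {
    "int": 41.4,
    "list[int]": 264.2,
    "nested list": 614.3,
    "dict": 269.9,
    "callable": 153.7,
}

# Speedup = time before / time after; values above 1.0 mean the change is faster.
speedups = {name: before_ns[name] / after_ns[name] for name in before_ns}

for name, ratio in speedups.items():
    print(f"{name:<12} {ratio:.2f}x")
```

By this measure the primitive and container cases come out roughly 3x to 4x faster, while the callable case is close to unchanged, presumably because callables do not take the primitive fast path.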

Script:

"""
Benchmark for simpleeval performance regression introduced in 1.0.5,
specifically the _check_disallowed_items overhead.

Run with: python3 benchmark.py
"""

import timeit
from simpleeval import SimpleEval, EvalWithCompoundTypes

MICRO_N  = 1_000_000   # direct function calls
EVAL_N   =    50_000   # full expression evaluations

# ---------------------------------------------------------------------------
# 1. Microbenchmark: _check_disallowed_items in isolation
# ---------------------------------------------------------------------------
ev = SimpleEval()
check = ev._check_disallowed_items

micro_cases = [
    ("int",          42),
    ("float",        3.14),
    ("str",          "hello"),
    ("bool",         True),
    ("None",         None),
    ("list[int]",    [1, 2, 3, 4, 5]),
    ("nested list",  [[1, 2], [3, 4], [5, 6]]),
    ("dict",         {"a": 1, "b": 2, "c": 3}),
    ("callable",     len),
]

print("=" * 60)
print(f"_check_disallowed_items  ({MICRO_N:,} calls each)")
print("=" * 60)
print(f"{'Input type':<20}  {'Total':>8}  {'per call':>10}")
print("-" * 45)

for name, value in micro_cases:
    t = timeit.timeit(lambda v=value: check(v), number=MICRO_N)
    ns = t / MICRO_N * 1e9
    print(f"{name:<20}  {t:>6.3f}s  {ns:>8.1f}ns")

# ---------------------------------------------------------------------------
# 2. End-to-end expression evaluation
# ---------------------------------------------------------------------------
eval_cases = [
    (SimpleEval,            "integer arithmetic",  "1 + 2 * 3 - 4 // 2"),
    (SimpleEval,            "float arithmetic",    "1.5 * 2.0 + 3.14 / 2.0"),
    (SimpleEval,            "string concat",       "'hello' + ' ' + 'world'"),
    (SimpleEval,            "boolean logic",       "True and not False or True"),
    (SimpleEval,            "comparison chain",    "1 < 2 and 3 >= 3 and 4 != 5"),
    (SimpleEval,            "nested arithmetic",   "(1 + 2) * (3 + 4) * (5 + 6)"),
    (SimpleEval,            "ternary expression",  "42 if True else 0"),
    (SimpleEval,            "string methods",      "'  hello  '.strip().upper()"),
    (EvalWithCompoundTypes, "list literal",        "[1, 2, 3, 4, 5]"),
    (EvalWithCompoundTypes, "nested list",         "[[1, 2], [3, 4], [5, 6]]"),
    (EvalWithCompoundTypes, "dict literal",        "{'a': 1, 'b': 2, 'c': 3}"),
]

print()
print("=" * 60)
print(f"Full expression eval  ({EVAL_N:,} iterations each)")
print("=" * 60)
print(f"{'Expression':<25}  {'Total':>8}  {'per call':>10}")
print("-" * 50)

for evaluator, name, expr in eval_cases:
    e = evaluator()
    parsed = e.parse(expr)
    t = timeit.timeit(lambda: e.eval(expr, previously_parsed=parsed), number=EVAL_N)
    us = t / EVAL_N * 1e6
    print(f"{name:<25}  {t:>6.3f}s  {us:>8.2f}µs")

Pre-approval checklist (for submitter)

Please complete these steps

  • Passes tests
  • New tests for additional features or changed functionality
  • My name and contribution added to contributors list (or if I'd rather opt out, I've said so in the PR)

Signed-off-by: Edgar Ramírez Mondragón <edgarrm358@gmail.com>
@danthedeckie danthedeckie merged commit 050d4b2 into danthedeckie:main Mar 16, 2026
1 check passed
@edgarrmondragon edgarrmondragon deleted the 1.0.5-perf branch March 16, 2026 12:55