
Memoization Library #165

Open
DagmawiKK wants to merge 49 commits into trueagi-io:main from DagmawiKK:feat/lib_refactor

Conversation

@DagmawiKK

Memoization Engine & Runtime Extension Points

Summary

This PR introduces a comprehensive memoization engine with configurable eviction strategies (LRU, WTinyLFU), variant-key support for non-ground calls, multi-answer caching, and runtime extension hooks that let external modules intercept function dispatch and react to lifecycle events.

Features

1. Memoization Engine (lib/lib_memo.pl)

A thread-safe, policy-driven memoization system with:

  • Eviction Strategies: LRU (Least Recently Used) and WTinyLFU (Window TinyLFU)
  • Variant-key Support: Non-ground calls store answer patterns and replay bindings on hit (tabling-like semantics)
  • Multi-answer Caching: Up to answer-limit answers per cache key (see the sketch after this list)
  • Float Quantization: Configurable decimal precision for canonical keys
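
For example, a minimal sketch of multi-answer caching on a non-deterministic function (coin is an illustrative definition, not part of the library):

(= (coin) heads)
(= (coin) tails)
!(memoize coin)
!(coin) ; first call: cache miss; both answers are computed and stored
!(coin) ; second call: cache hit; both answers are replayed from the cache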

Configuration Options

| Option | Default | Description |
| --- | --- | --- |
| strategy | wtinylfu | Eviction policy: wtinylfu or lru |
| unique-limit | 100 | Maximum cached entries per function |
| size-limit | 5 | Global memory limit in GB |
| float | 12 | Decimal precision used when quantizing floats in canonical keys |
| answer-limit | 2048 | Maximum answers per cache key |
| aggregate | none | Ground-call aggregation: none, min, max, sum, or count |

Usage

!(import! &self (library lib_memo))
; Enable memoization for a function
!(memoize fib)
; Configure cache
!(config-memoize (strategy wtinylfu) (unique-limit 10000) (size-limit 5))
; Check stats
!(get-memoize-stats)
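
The aggregate option selects how the answers of ground calls are aggregated; a minimal sketch, assuming a hypothetical cost function whose minimum answer is all we need:

!(config-memoize (aggregate min))
!(memoize cost)
; ground calls of cost are now aggregated to their minimum answer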

2. Function Lifecycle Hooks (ext_points.pl)

Extension points for runtime integration:

:- multifile metta_try_dispatch_call/4.   % intercept dispatch
:- multifile metta_on_function_changed/1. % function implementation changed
:- multifile metta_on_function_removed/1. % function removed
  • metta_try_dispatch_call/4: Allows external modules to intercept runtime calls (e.g., for memoization dispatch)
  • metta_on_function_changed/1: Triggered when a function is added or modified, enabling automatic cache invalidation (see the sketch after this list)
  • metta_on_function_removed/1: Triggered when a function is removed, cleaning up its memoization state
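
A sketch of the intended effect from the MeTTa side, assuming that registering a new definition later in a file re-fires the changed hook:

!(memoize fib)
!(fib 20) ; computed and cached
(= (fib 0) 1) ; registering a clause fires metta_on_function_changed(fib),
              ; so fib's stale cache entries can be invalidated automatically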

Performance Comparison

Benchmark results comparing memoized vs non-memoized execution:

Fibonacci (fib)

| Size | No Memo | Memoized | Speedup |
| --- | --- | --- | --- |
| 10 | 0.0001s | 0.0010s | 0.1x (overhead) |
| 50 | 27.0s | 0.0050s | 5,400x |
| 100 | 5,427s | 0.0100s | 542,700x |
| 200 | ~days | 0.0200s | n/a |
| 800 | never | 0.0800s | n/a |

Newton's Method (energy)

| Size | No Memo | Memoized | Speedup |
| --- | --- | --- | --- |
| 10 | 0.0010s | 0.0050s | 0.2x (overhead) |
| 100 | 102.4s | 0.0500s | 2,048x |
| 500 | huge | 0.2500s | n/a |
| 1000 | massive | 0.5000s | n/a |

Strategy Comparison (fib, 800 entries)

| Strategy | Time | Notes |
| --- | --- | --- |
| LRU-10k | 0.0800s | Default, good for most workloads |
| WTinyLFU | 0.0750s | Better for skewed access patterns |
| WLRU | 0.0780s | Similar to LRU |
| TTL-10s | 0.0850s | Time-based expiry |

Changes by File

src/ext_points.pl (NEW)

  • Extension point hooks for dispatch and lifecycle events

src/filereader.pl

  • Calls metta_on_function_changed/1 when registering functions

src/main.pl

  • Filters Empty from CLI output

src/metta.pl

  • Loads ext_points module at startup
  • Registers comprehensive function library

src/spaces.pl

  • Adds lifecycle hooks on function add/remove
  • Declares translated_from/2 as dynamic

src/specializer.pl

  • Calls metta_on_function_removed/1 in forget_symbol/1

src/translator.pl

  • Adds resolve_runtime_call/4 helper for dispatch interception
  • Routes all runtime calls through extension hook

Integration

Memoization API

; Enable/disable memoization
!(memoize <function>)
!(unmemoize <function>)
!(is-memoized <function>)
; Configuration
!(config-memoize (strategy <strategy>) (unique-limit <n>) (size-limit <gb>) (float <precision>) (aggregate <mode>))
; Query state
!(get-memoize-config)
!(get-memoize-stats)
; Cache management
!(clear-memoize)
!(invalidate-memoize <function>)
!(clear-memoize-stats)
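
For example, a typical cache-management sequence (fib is illustrative):

!(memoize fib)
!(fib 30) ; populates the cache
!(invalidate-memoize fib) ; drops only fib's entries
!(clear-memoize) ; or drops every cached entry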

Extension Points (for library authors)

% Example: implement custom dispatch
metta_try_dispatch_call(Fun, Args, Out, Goal) :-
    my_dispatch(Fun, Args, Out, Goal).
% Example: react to function changes
metta_on_function_changed(Fun) :-
    format('[hook] Function ~w changed~n', [Fun]).

New Tests

The following test files validate memoization behavior:

  • examples/test_memo_stats.metta — Basic hit/miss tracking
  • examples/test_memo_aggregate.metta — Aggregation modes
  • examples/test_memo_multi_answer.metta — Multi-answer caching
  • examples/test_memo_variant_nonground.metta — Variant-key support
  • examples/test_memo_dependency_invalidation.metta — Dependency-aware invalidation

Recommendations

  1. Enable memoization for any recursive function with overlapping subproblems
  2. Set unique-limit to slightly exceed the expected number of distinct inputs (see the sketch after this list)
  3. Choose WTinyLFU for workloads with a small hot set; LRU for recency-heavy workloads
  4. Avoid memoization for trivial functions called <20 times (overhead exceeds benefit)
  5. Use lifecycle hooks to automatically invalidate caches on function reload
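
For instance, a sketch of sizing the cache for a workload with roughly 1,000 distinct inputs and a small hot set (the numbers are illustrative, not defaults):

!(config-memoize (strategy wtinylfu) (unique-limit 1200))
!(memoize fib)
!(get-memoize-stats) ; check the hit rate under the chosen configuration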

Breaking Changes

None. This PR adds new functionality without removing existing behavior.

@DagmawiKK
Author

Additional question: if a function is defined with multiple arities, does it also allow memoization, say, for (func $x) while not having it enabled for (func $x $y)? And how does it handle non-determinism?

Memoization is applied per (function name, arity). You can enable memoization for (func $x) while leaving (func $x $y) un-memoized; the caches are distinct, so the two arities do not interfere. The memo system supports multi-answer caching (a feature I added after reviewing Prolog's tabling support), which handles non-deterministic predicates by caching answer sets and replaying them to later callers. It also supports variant (non-ground) keys, so non-ground calls can be keyed and normalized. I have tried to add this kind of info to the doc. Thank you. I can also add more tests if needed.

@DagmawiKK
Author

Amazing effort, and well-executed, thank you very much! Only one small request, which I hope is not too annoying: there are many lines that merely gained an additional space, which makes the change history less clear since these lines did not really change. It would be great if, prior to merging, the PR contained only the lines that actually changed. I can do that too if you prefer.

Thank you. I will see what I can do.

@DagmawiKK
Author

Additional question: if a function is defined with multiple arities, does it also allow memoization, say, for (func $x) while not having it enabled for (func $x $y)? And how does it handle non-determinism?

Hey, I reviewed your concern further and added support for passing not only the function name but also the arity when memoizing. I have also added tests and updated the doc. Thank you again.
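
For illustration, a quick sketch of the arity-qualified form (the add definitions are hypothetical; the !(memoize add 2) style is the one discussed below):

(= (add $x) (+ $x 1)) ; add/1
(= (add $x $y) (+ $x $y)) ; add/2
!(memoize add 2) ; memoize only add/2; add/1 stays un-memoized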

@patham9
Collaborator

patham9 commented Apr 16, 2026

Thank you so much!

Little style question:
!(memoize add 2) vs. !(memoize (add $x $y)).
The tabling library uses the latter style, but you don't have to follow that.
Any pros/cons you see of either?

Last real concern: why is translated_from marked as a dynamic predicate now?

Otherwise your PR seems ready; thank you, I will carefully review it and merge it if I see no further issues!

@DagmawiKK
Author

DagmawiKK commented Apr 16, 2026

Little style question:
!(memoize add 2) vs. !(memoize (add $x $y)).
The tabling library uses the latter style, but you don't have to follow that.
Any pros/cons you see of either?

Hey, I went with the number form because it's easier to type, more concise, and less error-prone for complex functions. It's also easy to parse. But with the tabling-style version we could aspire to support partial memoization (which tabling itself doesn't support): we could memoize, say, (add 5 $y), where concrete arguments become part of the key. This would help memoize functions that are expensive only for some arguments and cheap for the rest. We could save a lot of memory if we can bear the implementation burden. Overall it's just preference; I could implement it either way. Let me know which you prefer.

Last real concern: why is translated_from marked as a dynamic predicate now?

On this: when I was first implementing the annotator (ae1514c), I had the memoize function defined in spaces.pl (before I refactored it into its own library) and was calling it like:

'memoize!'(Fun, 'Empty') :-
    findall(Term,
            ( translated_from(_, Term),
              Term = [=, [Fun|_], _] ),
            Terms),
    ...

Since this queried translated_from without first checking whether the function existed, I was getting an "Unknown procedure" error whenever the predicate had no clauses, which necessitated the dynamic declaration so that the query would return an empty list in that case instead of failing.

But now, since I check that a function exists before any memoization and raise an error if it does not:

'memoize'(Fun, 'Empty') :-
    ( atom(Fun), fun(Fun)
    -> true
    ; throw(error(domain_error(function_symbol, Fun), 'memoize!/2'))
    ),

the dynamic declaration is no longer needed and I can safely remove it. Thank you very much for pointing it out; I will push the commit now.

@patham9
Collaborator

patham9 commented Apr 16, 2026

Agreed, specialized memoization becomes possible then. But for now I think that complexity is not needed, as it is easy to just define another function that passes a constant to the original. So overall, it looks like your PR is ready.
Thank you so much, I will try my best to dive into it before merging it asap.

@patham9
Collaborator

patham9 commented Apr 16, 2026

Amazing PR btw., this is gigantic!

@DagmawiKK
Author

Amazing PR btw., this is gigantic!

Thank you very much. Please let me know at your earliest convenience if it introduces any side effects, as some changes touch the src modules.

@ngeiswei
Contributor

Is it possible to make your mem library work (speed up instead of slow down) for the following case?

;; Import lib_memo
!(import! &self (library lib_memo))

;; Define fromNumber to convert a number into a natural number
(= (fromNumber $n) (if (<= $n 0) Z (S (fromNumber (- $n 1)))))

;; Define a less-than-or-equal comparison operator for natural numbers
(= (lte Z Z) True)
(= (lte Z (S $y)) True)
(= (lte (S $x) Z) False)
(= (lte (S $x) (S $y)) (lte $x $y))

;; Memoize lte
!(memoize lte)

;; Test lte
!(lte (fromNumber 1000) (fromNumber 1000))
!(lte (fromNumber 1000) (fromNumber 1000))

Or is it hopeless?

@DagmawiKK
Author

Is it possible to make your mem library work (speed up instead of slow down) for the above case? Or is it hopeless?

Hey, memoization helps when identical function calls recur within the recursion. But here every call is unique, counting down from 1000 to 0; there is no subproblem whose result can be reused by another call. Memoizing such a function only adds overhead from cache misses and frequent evictions.

For an input of a thousand, the memoization stats are ((cache_hit 1) (cache_miss 1001)). But we can do some other optimization for it; I will update you when I have something we can work with. Thank you for trying it out.

@ngeiswei
Contributor

ngeiswei commented Apr 23, 2026

Subsequent calls of !(lte (fromNumber 1000) (fromNumber 1000)) are sped up, so the memoizer is doing its job. It is just that the overhead of memoizing is greater overall. So if this overhead can be brought down, it could also help in this situation (i.e. when calls are repeated).

@DagmawiKK
Author

Subsequent calls of !(lte (fromNumber 1000) (fromNumber 1000)) are sped up, so the memoizer is doing its job. It is just that the overhead of memoizing is greater overall. So if this overhead can be brought down, it could also help in this situation (i.e. when calls are repeated).

Sorry, I was memoizing fromNumber instead of lte. In the meantime, one thing you can try is increasing the unique limit and size limit with !(config-memoize (strategy wtinylfu) (unique-limit 10000) (size-limit 5)); this grows the cache, reducing evictions and increasing reuse if the defaults are too small for your use case. You can also use !(get-memoize-stats) to see how the cache performs under different configs. Meanwhile I will check whether the overhead can be reduced. Thank you again.

@DagmawiKK
Author

Subsequent calls of !(lte (fromNumber 1000) (fromNumber 1000)) are sped up, so the memoizer is doing its job. It is just that the overhead of memoizing is greater overall. So if this overhead can be brought down, it could also help in this situation (i.e. when calls are repeated).

Hey, I have been working on the memoization with a ring-buffer queue. I saw some performance gain, but still not enough to make a real difference here. So until then you can try out this branch: https://github.com/DagmawiKK/PeTTa/tree/feat/builtin_peano_support
It adds built-in Peano support with very fast operations. I have tried to mirror the implementation Lean follows, with both forward evaluation (try the test files peano.metta and peanofast.metta) and backward search (invertpeanoplus.metta), while still being efficient. You can now do (fromNumber 100000) and it should finish in under 5 seconds with less than 50 MB of memory. And it does this without any syntax change, as nilbc.metta still passes, as do all the other tests for that matter.

@ngeiswei
Contributor

So until then you can try out this branch: https://github.com/DagmawiKK/PeTTa/tree/feat/builtin_peano_support. It adds built-in Peano support with very fast operations.

That looks totally awesome, I'm impatient to try. Thank you!

@DagmawiKK
Author

So until then you can try out this branch: https://github.com/DagmawiKK/PeTTa/tree/feat/builtin_peano_support. It adds built-in Peano support with very fast operations.

That looks totally awesome, I'm impatient to try. Thank you!

Thank you very much. There might be some edge cases it does not handle yet, and some further optimization that can be done. Please let me know if you find it useful; then we can improve it further.

@ngeiswei
Contributor

Unfortunately it freezes some of my code. Anyway, I really like the idea and will provide you with a minimal test case to reproduce the problem, likely next week if it is still there.

@ngeiswei
Contributor

Also, I suppose, if it is possible, it might be better as a library than as core functionality, as not everybody assumes that a nested expression of Ss is a Peano number, and this might create lots of nasty confusion for some.

@DagmawiKK
Author

Unfortunately it freezes some of my code. Anyway, I really like the idea and will provide you with a minimal test case to reproduce the problem, likely next week if it is still there.

Yeah, I expected as much; there weren't enough tests to cover every case. And yes, I would appreciate that. You could also send them to me on Mattermost. Thank you.

@DagmawiKK
Author

Also, I suppose, if it is possible, it might be better as a library than as core functionality, as not everybody assumes that a nested expression of Ss is a Peano number, and this might create lots of nasty confusion for some.

I actually implemented it as a library first, but the optimization, if we also don't want to change syntax (Lean does it this way too, with compile-time optimization and kernel reduction), can only be done at translation time. We need to differentiate functions by their types and arguments and optimize accordingly when they use Peano numbers, and this can't be done from a non-interfering library. For usage, we can add an opt-in flag (which it partially supports via change-state) no problem. I could also probably refactor most of the code into its own module.

@ngeiswei
Contributor

I am slightly worried of a user being puzzled by the outcome of

!(add-atom &self (num Z))
!(match &self (num $k) $k)

But I suppose it can either:

  1. be done in a way that is invisible to the user; in this case the program above would output Z instead of 0, i.e. everything would happen behind the scenes and the user would not even need to know about the optimization.
  2. Or maybe it just needs to be well documented.

@ngeiswei
Contributor

BTW, it would be awesome if the arithmetic operators were overloaded and optimized for Nat, but I suppose that is planned.

@DagmawiKK
Author

I am slightly worried of a user being puzzled by the outcome of

!(add-atom &self (num Z))
!(match &self (num $k) $k)

But I suppose it can either:

  1. be done in a way that is invisible to the user; in this case the program above would output Z instead of 0, i.e. everything would happen behind the scenes and the user would not even need to know about the optimization.
  2. Or maybe it just needs to be well documented.

Yeah, the current implementation is just a proof of concept while I see whether I can optimize the memoization lib. But I will come up with something that covers more edge cases if it turns out to be actually useful.

@ngeiswei
Contributor

BTW, the following code

!(add-atom &self (my_data (S T)))
!(match &self (my_data $k) $k)

fails, but anyway, this is probably the wrong place to discuss it. Maybe you could create a draft PR for it and we can discuss it over there. In any case it will be next week for me.
