a bunch of iterations from Ben's colab notebooks and ideas#15

Open
charliederr wants to merge 43 commits into trueagi-io:main from dort:experimental/continual_learning

Conversation

@charliederr (Collaborator)

There's a lot here, so I'm happy to refactor whatever's needed and/or split this into several smaller stages.

dort and others added 30 commits April 5, 2026 01:45
Added GNU General Public License version 2 to the LICENSE file, clarifying the terms under which the code is distributed.
…eight level for learning (standard much of the time, SB-based where non-Gaussianity seems extreme)
The full training config had demotion_threshold=0.3 and promotion_threshold=0.7,
which were too high for larger shell configurations (32 neurons). Sinkhorn
transport with larger shells produces more diffuse distributions with lower
off-diagonal transport mass (typically 0.10-0.16).

Changes:
- Lower demotion_threshold from 0.3 to 0.10
- Lower promotion_threshold from 0.7 to 0.25
- Scale shell distance cost by 0.15 for better eps compatibility
- Track column usage history for shell dynamics
- Compute demotions before registering new state (compare against history)

Results: Shell demotions now 20 (was 0) while maintaining 97.76% accuracy.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
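For context on the thresholding this commit describes, here is a minimal sketch of how per-shell off-diagonal transport mass from a Sinkhorn plan might be checked against the lowered thresholds, assuming a dense transport plan and a usage-history list; all names here are illustrative, not the actual fabricpc API.

```python
# Hypothetical sketch of the shell demotion/promotion check described above.
import numpy as np

DEMOTION_THRESHOLD = 0.10   # lowered from 0.3
PROMOTION_THRESHOLD = 0.25  # lowered from 0.7
SHELL_COST_SCALE = 0.15     # scale on shell distance cost, for eps compatibility

def off_diagonal_mass(plan: np.ndarray) -> np.ndarray:
    """Fraction of each column's transport mass that leaves the diagonal.

    With 32-neuron shells the Sinkhorn plan is diffuse, so these values
    typically land around 0.10-0.16, below the old 0.3/0.7 thresholds.
    """
    col_mass = plan.sum(axis=0)
    diag_mass = np.diag(plan)
    return (col_mass - diag_mass) / np.maximum(col_mass, 1e-12)

def classify_shells(plan: np.ndarray, usage_history: list[np.ndarray]):
    """Compute demotions against the *previous* usage entry before
    registering the new state, per the commit's ordering change."""
    mass = off_diagonal_mass(plan)
    prev = usage_history[-1] if usage_history else np.zeros_like(mass)
    demote = (mass < DEMOTION_THRESHOLD) & (prev >= DEMOTION_THRESHOLD)
    promote = mass > PROMOTION_THRESHOLD
    usage_history.append(mass)  # register new state only after the check
    return demote, promote
```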
Acknowledge inspiration from Benjamin Goertzel's PyTorch codebase.
A standalone benchmark package for evaluating continual learning algorithms.
Works with any deep learning framework (PyTorch, JAX, TensorFlow, NumPy).

Features:
- Simple model protocol: predict() and train_on_task()
- Built-in datasets: Split-MNIST, Permuted-MNIST, Split-CIFAR10/100
- Comprehensive metrics: accuracy matrix, forgetting, BWT, FWT
- Statistical tests: paired t-test, Wilcoxon, bootstrap CI, Cohen's d
- Reference baselines: Naive fine-tuning, EWC (pure NumPy)
- Visualization: accuracy heatmaps, forgetting analysis, comparisons
- PROTOCOL.md specification document

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
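Since the commit names the two-method model protocol and the standard continual-learning metrics, here is a sketch of what that contract and one metric (BWT) could look like; the exact signatures in PROTOCOL.md may differ.

```python
# Minimal sketch of the framework-agnostic model protocol (predict() /
# train_on_task()); signatures here are assumptions, not the spec.
from typing import Protocol
import numpy as np

class ContinualModel(Protocol):
    def train_on_task(self, task_id: int, x: np.ndarray, y: np.ndarray) -> None:
        """Fit on one task's data; called once per task, in sequence."""
        ...

    def predict(self, x: np.ndarray) -> np.ndarray:
        """Return predicted labels; used to fill the accuracy matrix
        (accuracy on task i after training through task j)."""
        ...

def backward_transfer(acc: np.ndarray) -> float:
    """Standard BWT from an accuracy matrix acc[j, i]; negative values
    indicate forgetting of earlier tasks."""
    T = acc.shape[0]
    return float(np.mean([acc[T - 1, i] - acc[i, i] for i in range(T - 1)]))
```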
- Update pyproject.toml dependencies: matplotlib -> plotly>=5.0.0
- Convert fabricpc/continual/utils.py plotting functions to plotly
- Convert cl-benchmark visualization/plots.py to plotly
- Update example comment to reflect plotly usage

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
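The shape of the matplotlib-to-plotly conversion, using the accuracy heatmap as an example; this is illustrative, not the code from fabricpc/continual/utils.py or cl-benchmark's visualization/plots.py.

```python
# Illustrative before/after for one converted plotting function.
import plotly.graph_objects as go

def plot_accuracy_heatmap(acc_matrix, task_names):
    # Was roughly: plt.imshow(acc_matrix); plt.colorbar(); plt.savefig(...)
    fig = go.Figure(
        data=go.Heatmap(z=acc_matrix, x=task_names, y=task_names,
                        colorscale="Viridis")
    )
    fig.update_layout(xaxis_title="Evaluated task",
                      yaxis_title="Trained through task")
    return fig  # caller can fig.show() or fig.write_html(...)
```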
Consolidate duplicated Sinkhorn algorithm implementations from transweave.py
and weight_causal.py into a new optimal_transport.py module. This reduces
code duplication (~200 lines removed) and provides a single source of truth
for optimal transport utilities.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
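For reference, the kind of entropic Sinkhorn iteration that optimal_transport.py now centralizes looks like this; a standard sketch, not the module's exact signature.

```python
# Standard entropy-regularized Sinkhorn: alternate marginal scalings of
# the Gibbs kernel K = exp(-cost / eps) until the plan's marginals match.
import numpy as np

def sinkhorn(cost: np.ndarray, a: np.ndarray, b: np.ndarray,
             eps: float = 0.05, n_iters: int = 100) -> np.ndarray:
    """Return a transport plan between marginals a and b for `cost`."""
    K = np.exp(-cost / eps)
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]   # plan P = diag(u) K diag(v)
```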
Replace NumPy with JAX for performance-critical computations:
- optimal_transport.py: Use jax.numpy and lax.fori_loop for Sinkhorn
- weight_causal.py: JAX-accelerate kurtosis, multimodal gap, and correction
- transweave.py: Add lax import for consistency

Maintains NumPy fallback when JAX is unavailable.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
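A sketch of the JAX port described here: the Python loop in the NumPy version above becomes lax.fori_loop so the whole solver can be jit-compiled, with a NumPy fallback when JAX is absent. Names mirror the sketch above, not the module itself.

```python
# JAX-accelerated Sinkhorn with a graceful NumPy fallback (assumed pattern).
try:
    import jax.numpy as jnp
    from jax import jit, lax

    @jit
    def sinkhorn_jax(cost, a, b, eps=0.05, n_iters=100):
        K = jnp.exp(-cost / eps)

        def body(_, u):
            v = b / (K.T @ u)
            return a / (K @ v)

        u = lax.fori_loop(0, n_iters, body, jnp.ones_like(a))
        v = b / (K.T @ u)
        return u[:, None] * K * v[None, :]

    HAVE_JAX = True
except ImportError:
    HAVE_JAX = False  # callers fall back to the pure-NumPy sinkhorn()
```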
Creates CausalLinear, TransWeaveLinear, and CausalTransWeaveLinear nodes
that embed continual learning directly into FabricPC's forward_learning()
method. Uses singleton registries (CausalGradientRegistry, TransWeaveRegistry)
to maintain gradient history while keeping node methods pure/stateless.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
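The singleton-registry pattern this commit describes might look roughly like the following; the real CausalGradientRegistry / TransWeaveRegistry APIs may differ.

```python
# Hypothetical shape of a singleton registry that holds per-node gradient
# history, so node methods like forward_learning() stay pure/stateless:
# they read and write here instead of mutating the node itself.
from collections import defaultdict

class CausalGradientRegistry:
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._history = defaultdict(list)
        return cls._instance

    def record(self, node_id: str, grad) -> None:
        self._history[node_id].append(grad)

    def history(self, node_id: str) -> list:
        return list(self._history[node_id])
```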