fix and feature: clean MVMextractmode by DasVinch · Pull Request #381 · milk-org/milk

DasVinch · 2026-06-02T21:44:14Z

THIS IS A DEMO PR WITH NO INTENTION TO MERGE AS IS

This PR is a re-write of MVMextractModes.c that showcases a number of interesting approaches for future development.
Summary of changes:

Refactor linalgebra/MVMextractModes.c to finish removing unused or irrelevant attributes. Sanitize a few segfaults, remove references to fetching images into milk-data by ImageID. [must go to upstream]
Remove test file that was not actually testing our production code [upstream]
Some fixes to streamDelay.c also got bundled here through a few merges.
Create a separate module linalgebra2, really just a workspace for linalgebra2/MVMextractModes.cpp
Create a test environment adapted to pytest, used in conjunction with pyMilk, under /testing

What I showcase

C++

We can introduce C++ translation units (TUs) in a strongly C environment. That needed ~5 changes to the rest of the codebase (restrict / __restrict, and explicit (char*) casts of string literals).
Functionalization and a moderate amount of object-oriented C++ greatly help the readability of large functions, and help avoid getting lost in long if/else blocks.
Calculations are moved to a secondary file. API contracts are clear: the mask is loaded, the matrix is loaded, and backend::matrixMul() takes data from the float-cast input array and writes the result to the output stream. See mvm_auxiliaries.hpp|cpp
Extracted functions, some of which are purely technical in nature (e.g. the CUDA initialization), can be moved into a library for better reuse.
All data held by a C++ object can be arranged to be deallocated automatically (that's what the unique_ptr is for). Since the cleanup logic is in the destructor, it's much easier to keep track of resources when arguments are optionally used / optionally nullptr.
The rewrite helped to see that the current MVMextract was not abiding by its arguments very well, and not consistently (masking = 1 + axmode = 1, etc.)

TEST SUITE

testing/ is configured as a Python module, for tests. testing/tests is a pytest suite (that contains nothing meaningful but the layout) that can be run with /testing>$ pytest
testing/tests/mains files are not executed by default, but pytest can be used to invoke them and run dev sessions. This is what I did for now, in a single file experiment_testing.
Using parametrization, it's easy to run the same test function many times with different options.
Using fixtures, it's easy to cause changes to the environment, e.g. spoofing MILK_SHM_DIR during the tests.
Using fixtures, it's easy to offer reproducible / reusable setup/teardown code to a number of tests. Here I use it to serve the FPS for the MVM to 72 different executions of the function:

def mvm_correctness_all_params(
    fps_m: FPS,
    axmode: int,
    normalize: bool,
    GPUindex: int,
    masking: bool,
    input_dtype: np.typing.DTypeLike,
):
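As a rough illustration of the two pytest mechanisms described above, here is a minimal sketch. It assumes nothing about the actual files in testing/: the helper spoofed_env, the fixture shm_dir, the test name, and the option lists are all hypothetical, and the option values are simply ones that appear elsewhere in this description (axmode 0 and 1, GPUindex 0 and 99).

```python
import os
from contextlib import contextmanager

import pytest

# Hypothetical option lists, for illustration only; the lists actually
# used in the PR are not shown in this description.
AXMODES = [0, 1]
GPU_INDICES = [0, 99]
MASKING = [False, True]


@contextmanager
def spoofed_env(name, value):
    """Temporarily set an environment variable, restoring it on exit."""
    previous = os.environ.get(name)
    os.environ[name] = value
    try:
        yield value
    finally:
        if previous is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous


@pytest.fixture
def shm_dir(tmp_path):
    """Point MILK_SHM_DIR at a per-test temporary directory."""
    with spoofed_env("MILK_SHM_DIR", str(tmp_path)) as path:
        yield path  # setup happens above, teardown once the test returns


# Stacked parametrize decorators generate the Cartesian product of the
# option lists: here 2 * 2 * 2 = 8 test cases from a single function.
@pytest.mark.parametrize("axmode", AXMODES)
@pytest.mark.parametrize("GPUindex", GPU_INDICES)
@pytest.mark.parametrize("masking", MASKING)
def test_options_are_served(shm_dir, axmode, GPUindex, masking):
    assert os.environ["MILK_SHM_DIR"] == shm_dir
    assert axmode in AXMODES and GPUindex in GPU_INDICES
    assert isinstance(masking, bool)
```

Keeping the environment handling in a plain context manager and wrapping it in a fixture is only one possible design; pytest's built-in monkeypatch fixture (monkeypatch.setenv) would do the same job with less code.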

Finally, this has allowed me to:
1/ check that the new MVM in C++ is correct compared to a very concise Python implementation, for all combinations of axmode, normalization, GPU/CPU/OpenMP/BLAS, input dtype, and masking;
2/ check that the existing MVM is correct at least for axmode 0, BLAS/GPU, masking, normalization;
3/ get some timing printouts:

>$ # LINALG_ORIGINAL CPU
>$ pytest -sx tests/mains/experiment_deploy.py -k 'test_mvm_correctness[False-99-False] and TestsAxmode0' | grep ELA
ELAPSED [MODE EXTRACT] [GPUindex=99  normalize=0     masking=0     input_dtype=float32 ] 1477.60 ms
>$ # LINALG_ORIGINAL GPU
>$ pytest -sx tests/mains/experiment_deploy.py -k 'test_mvm_correctness[False-0-False] and TestsAxmode0' | grep ELA
ELAPSED [MODE EXTRACT] [GPUindex=0   normalize=0     masking=0     input_dtype=float32 ] 507.18 ms
>$ # LINALG_NEW CPU
>$ pytest -sx tests/mains/experiment_deploy.py -k 'test_mvm_correctness[False-99-False] and TestsAxmode0' | grep ELA
ELAPSED [MODE EXTRACT] [GPUindex=99  normalize=0     masking=0     input_dtype=float32 ] 1006.73 ms
>$ # LINALG_NEW GPU
>$ pytest -sx tests/mains/experiment_deploy.py -k 'test_mvm_correctness[False-0-False] and TestsAxmode0' | grep ELA
ELAPSED [MODE EXTRACT] [GPUindex=0   normalize=0     masking=0     input_dtype=float32 ] 499.78 ms
>$ # LINALG_ORIGINAL GPU MASKED
>$ pytest -sx tests/mains/experiment_deploy.py -k 'test_mvm_correctness[True-0-False] and TestsAxmode0' | grep ELA
ELAPSED [MODE EXTRACT] [GPUindex=0   normalize=0     masking=1     input_dtype=float32 ] 292.73 ms
>$ # LINALG_NEW GPU MASKED
>$ pytest -sx tests/mains/experiment_deploy.py -k 'test_mvm_correctness[True-0-False] and TestsAxmode0' | grep ELA
ELAPSED [MODE EXTRACT] [GPUindex=0   normalize=0     masking=1     input_dtype=float32 ] 296.04 ms
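The relative differences between the original and new code can be worked out directly from the printouts above. The snippet below is only arithmetic on the six ELAPSED values as printed (each appears to come from a single run, so small differences should not be over-interpreted); the dictionary labels and the helper relative_change are hypothetical names.

```python
# Elapsed times in ms, copied from the ELAPSED printouts: (original, new).
timings_ms = {
    "CPU": (1477.60, 1006.73),
    "GPU": (507.18, 499.78),
    "GPU masked": (292.73, 296.04),
}


def relative_change(original, new):
    """Fractional change in elapsed time; negative means the new code is faster."""
    return (new - original) / original


for label, (original, new) in timings_ms.items():
    print(f"{label:<11} {100 * relative_change(original, new):+.1f} %")
```

This gives roughly -31.9 % on CPU, -1.5 % on unmasked GPU, and +1.1 % on masked GPU.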

Timings are about 1/3 better on CPU (mostly because masking got factored in). On GPU the difference is within a few %: about 1.5% faster unmasked, about 1% slower masked.
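To make the role of masking concrete, here is a minimal NumPy sketch of a masked, optionally normalized mode extraction, in the spirit of the concise Python reference mentioned above. It is not the reference used in the PR, and several points are assumptions: the function name extract_modes, the (n_modes, n_pix) matrix layout, a boolean mask of active pixels, normalization as division by the sum of the (masked) frame, and masking being applied before normalization. axmode is left out because its semantics are not described here.

```python
import numpy as np


def extract_modes(modes, frame, mask=None, normalize=False):
    """Project one input frame onto a set of modes (illustrative sketch).

    modes: (n_modes, n_pix) matrix; frame: (n_pix,) vector of any dtype.
    mask: optional (n_pix,) boolean array of active pixels (assumed meaning).
    normalize: if True, divide the frame by its sum first (assumed meaning).
    """
    frame = np.asarray(frame, dtype=np.float32)  # float-cast the input
    modes = np.asarray(modes, dtype=np.float32)
    if mask is not None:
        mask = np.asarray(mask, dtype=bool)
        # Only active pixels enter the product, so the matrix shrinks too.
        frame = frame[mask]
        modes = modes[:, mask]
    if normalize:
        frame = frame / frame.sum()
    return modes @ frame


rng = np.random.default_rng(0)
modes = rng.standard_normal((4, 10)).astype(np.float32)
frame = rng.integers(1, 100, size=10).astype(np.uint16)
mask = np.arange(10) % 2 == 0  # keep every other pixel

coefficients = extract_modes(modes, frame, mask=mask)
print(coefficients.shape)  # one coefficient per mode
```

With half the pixels masked out, the product involves half as many multiply-adds per mode, which is consistent with masked runs being faster in the printouts.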

What's not great (in this implementation of the MVM): error checking.

Nits:

Re-added Ctrl+A in fpsCTRL to attach tmux session.
Many string buffers for printing need increases in size to get rid of compiler warnings. I just touched one.

DasVinch and others added 15 commits May 29, 2026 15:50

parking

ff7db7f

streamDelay naive mode + some basic deployment testing

f7c28d5

Refactor to linalgebra2

4584f9a

Round 2: auxiliaries written

75d11f1

linalgebra cleanup

6218ca0

Fix usage of restrict (prepare C++ terrain); TODO: gather definitions of CONST_WORD

e5ac09f

linalgebra2: draft rework of MVMextract -- needs merge with testing branch

4462579

Merge branch 'vd/testing' into vd/clean_mvm

af599ef

Cleanup of linalgebra2

2ebd484

(char*) cast that enable C and C++ compliant compilation

1b0b6f3

pytest for MVM

41ef7a5

.gitignore Intellisense config file

916a2ab

Clean MVM in linalg so that at least some tests with axmode 0 run.

b7e2a1f

CMakeLists.txt that went through the gaps of the gitignore

2944517

chore: auto-format code [skip ci]

4c090fe

DasVinch assigned oguyon Jun 2, 2026

DasVinch added 2 commits June 2, 2026 23:53

Flagging of output.write

f5ce2fc

Implement a CUDA graph approach

b88fdba

DasVinch wants to merge 17 commits into framework-dev from vd/clean_mvm

2 participants
