Now that we have the core logic for an eager backend, a lazy engine, and an optimizer, it's the perfect time to structure this into a proper framework. This involves organizing the code into logical files and planning for future growth.
With this structure, you can now plan for new features. Here is a roadmap:
A. More Operations
- Linear Algebra:
  - Matrix Multiplication (`@`): Fully implement the `MultiplyOp` to call `multiply_eager`.
  - Transpose (`.T`): Add a `TransposeOp`. An interesting optimization is that a transpose can often be "free" by just changing how tiles are read, without writing a new matrix.
  - Element-wise Functions: Add `exp`, `log`, `sqrt`, etc. These are simple to fuse.
- Reductions:
  - `sum()`, `mean()`: These are fundamental. They are also interesting because they change the output shape from a matrix to a vector or scalar.
  - `max()`, `min()`: Implement these along different axes.
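To make the "free" transpose and the shape-changing reductions concrete, here is a minimal sketch. It uses plain nested lists as a stand-in for whatever tile storage the backend actually has, and `iter_tiles`, `read_tile`, and `tiled_sum` are hypothetical names invented for illustration, not part of the existing code.

```python
from typing import Iterator, List, Optional, Tuple

Matrix = List[List[float]]


def iter_tiles(rows: int, cols: int, tile: int) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (r0, r1, c0, c1) bounds for each tile of a rows x cols matrix."""
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            yield r0, min(r0 + tile, rows), c0, min(c0 + tile, cols)


def read_tile(m: Matrix, r0: int, r1: int, c0: int, c1: int,
              transposed: bool = False) -> Matrix:
    """Read one tile. With transposed=True the bounds refer to the logical
    (transposed) matrix, so indices are swapped on read and m.T is never built."""
    if transposed:
        return [[m[c][r] for c in range(c0, c1)] for r in range(r0, r1)]
    return [[m[r][c] for c in range(c0, c1)] for r in range(r0, r1)]


def tiled_sum(m: Matrix, tile: int, axis: Optional[int] = None,
              transposed: bool = False):
    """Sum tile by tile. axis=None gives a scalar, axis=0 gives column sums,
    axis=1 gives row sums, so the output shape differs from the input shape."""
    rows, cols = len(m), len(m[0])
    if transposed:
        rows, cols = cols, rows  # logical shape of the transposed matrix
    if axis is None:
        acc = 0.0
    elif axis == 0:
        acc = [0.0] * cols
    else:
        acc = [0.0] * rows
    for r0, r1, c0, c1 in iter_tiles(rows, cols, tile):
        block = read_tile(m, r0, r1, c0, c1, transposed)
        if axis is None:
            acc += sum(sum(row) for row in block)
        elif axis == 0:
            for row in block:
                for j, value in enumerate(row):
                    acc[c0 + j] += value
        else:
            for i, row in enumerate(block):
                acc[r0 + i] += sum(row)
    return acc


m = [[1, 2, 3], [4, 5, 6]]
total = tiled_sum(m, tile=2)                               # scalar
col_sums = tiled_sum(m, tile=2, axis=0)                    # one value per column
t_col_sums = tiled_sum(m, tile=2, axis=0, transposed=True) # column sums of m.T
```

A `mean()` along the same lines would divide the accumulated sum by the number of elements reduced, and `max()`/`min()` would swap the accumulation step.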
B. A Smarter Optimizer ✨
- Rule-Based Optimizer: Instead of hard-coding the `if isinstance(...)` check inside one operation, create a list of optimization rules. Your optimizer would have a registry like:
  - Rule 1: If the plan matches `Multiply(Add(A, B), Scalar)`, execute with `fused_add_multiply_kernel`.
  - Rule 2: If the plan matches `Transpose(Transpose(A))`, rewrite the plan to just `A`.
- Cost-Based Optimizer: For truly advanced systems, you would implement a cost model. For an operation like `A @ B @ C`, the optimizer would calculate the estimated I/O cost of both `(A @ B) @ C` and `A @ (B @ C)` and choose the cheaper execution path based on the matrix shapes.
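The two optimizer styles above can be sketched together: a small rule registry that rewrites the plan, followed by a cost comparison for a three-matrix chain. Everything here is an assumption made for illustration. The node classes, the `FusedAddMultiply` node (standing in for a plan step that would dispatch to `fused_add_multiply_kernel`), and the cost formula are hypothetical, and the rules are only tried at the root of the plan to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Shape = Tuple[int, int]


@dataclass(frozen=True)
class Leaf:
    name: str
    shape: Shape


@dataclass(frozen=True)
class Scalar:
    value: float


@dataclass(frozen=True)
class Transpose:
    child: object


@dataclass(frozen=True)
class Add:
    left: object
    right: object


@dataclass(frozen=True)
class Multiply:
    left: object
    right: object


@dataclass(frozen=True)
class FusedAddMultiply:  # hypothetical node for the fused kernel
    a: object
    b: object
    scalar: float


def rule_double_transpose(node: object) -> Optional[object]:
    """Transpose(Transpose(A)) -> A."""
    if isinstance(node, Transpose) and isinstance(node.child, Transpose):
        return node.child.child
    return None


def rule_fuse_add_multiply(node: object) -> Optional[object]:
    """Multiply(Add(A, B), Scalar) -> FusedAddMultiply(A, B, scalar)."""
    if (isinstance(node, Multiply) and isinstance(node.left, Add)
            and isinstance(node.right, Scalar)):
        return FusedAddMultiply(node.left.left, node.left.right, node.right.value)
    return None


# The registry: adding an optimization means appending a function here.
RULES: List[Callable[[object], Optional[object]]] = [
    rule_double_transpose,
    rule_fuse_add_multiply,
]


def optimize(node: object) -> object:
    """Apply rules at the root of the plan until none of them fires."""
    changed = True
    while changed:
        changed = False
        for rule in RULES:
            rewritten = rule(node)
            if rewritten is not None:
                node, changed = rewritten, True
                break
    return node


def matmul_cost(a: Shape, b: Shape) -> int:
    """I/O proxy for A @ B: elements read from both inputs plus elements
    written (assumes each operand is read exactly once)."""
    (m, k), (k2, n) = a, b
    assert k == k2, "inner dimensions must match"
    return m * k + k * n + m * n


def cheapest_chain(a: Shape, b: Shape, c: Shape) -> Tuple[str, int]:
    """Compare (A @ B) @ C with A @ (B @ C) under matmul_cost."""
    left = matmul_cost(a, b) + matmul_cost((a[0], b[1]), c)
    right = matmul_cost(b, c) + matmul_cost(a, (b[0], c[1]))
    return ("(A @ B) @ C", left) if left <= right else ("A @ (B @ C)", right)


A, B = Leaf("A", (4, 4)), Leaf("B", (4, 4))
plan = optimize(Transpose(Transpose(Multiply(Add(A, B), Scalar(2.0)))))
choice = cheapest_chain((1000, 10), (10, 1000), (1000, 10))
```

With shapes 1000x10, 10x1000, and 1000x10, the left-first order materializes a 1000x1000 intermediate while the right-first order only produces a 10x10 one, which is why the proxy favors `A @ (B @ C)` here. A real cost model would also account for tile size and how often each operand is re-read.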
C. An Enhanced Execution Engine
- Parallelism: Use Python's `concurrent.futures.ThreadPoolExecutor` to process tiles in parallel and saturate a fast NVMe drive, or `ProcessPoolExecutor` to use multiple CPU cores for computation-heavy tiles.
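As a rough illustration of tile-level parallelism, the sketch below maps a per-tile function over tile bounds with `ThreadPoolExecutor`. The in-memory nested list stands in for on-disk tiles, and `tile_bounds`, `process_tile`, and `run_parallel` are hypothetical names; the per-tile work (doubling each element) is a placeholder for a real read-compute-write step.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Matrix = List[List[float]]
Bounds = Tuple[int, int, int, int]


def tile_bounds(rows: int, cols: int, tile: int) -> List[Bounds]:
    """List (r0, r1, c0, c1) bounds for every tile of a rows x cols matrix."""
    return [
        (r0, min(r0 + tile, rows), c0, min(c0 + tile, cols))
        for r0 in range(0, rows, tile)
        for c0 in range(0, cols, tile)
    ]


def process_tile(m: Matrix, bounds: Bounds) -> Tuple[Bounds, Matrix]:
    """Placeholder per-tile work: read the tile and double every element."""
    r0, r1, c0, c1 = bounds
    block = [[m[r][c] * 2.0 for c in range(c0, c1)] for r in range(r0, r1)]
    return bounds, block


def run_parallel(m: Matrix, tile: int, max_workers: int = 4) -> Matrix:
    """Process all tiles on a thread pool and assemble the output matrix."""
    rows, cols = len(m), len(m[0])
    out = [[0.0] * cols for _ in range(rows)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in submission order; each result carries
        # its own bounds, so the write-back does not depend on that order.
        results = pool.map(lambda b: process_tile(m, b), tile_bounds(rows, cols, tile))
        for (r0, r1, c0, c1), block in results:
            for i, row in enumerate(block):
                out[r0 + i][c0:c1] = row
    return out


doubled = run_parallel([[1, 2, 3], [4, 5, 6]], tile=2)
```

Threads pay off mainly when tiles spend their time waiting on disk reads or in code that releases the GIL. Switching to `ProcessPoolExecutor` for CPU-bound tiles would require a picklable, module-level worker function instead of the lambda, and the tile data would have to be passed or re-read in each process.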
This roadmap transforms the project from a simple script into a conceptual blueprint for a real computational framework. The process of thinking about this architecture is just as valuable as your work on SystemDS, as it deals with the same fundamental design patterns of separating the logical plan from the physical execution.