Conversation
Pull request overview
This PR removes the high-level Workload abstraction and shifts the execution/benchmarking utilities to operate on lower-level inputs (payload MLIR module + schedule modules), adding optional memory-manager support and callback hooks for device/host transfers. It also renames/relocates workload-related code into lighthouse/execution and updates examples accordingly.
Changes:
- Removed `Workload` and the old `workload/runner.py`; introduced `lighthouse/execution/runner.py` with `execute`/`benchmark` taking low-level abstractions.
- Added optional memory-manager integration (`GPUMemoryManager`/`ShardMemoryManager`) and callback support in the runner APIs.
- Updated examples to the new API and moved "workload" examples to "execution".
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| lighthouse/workload/workload.py | Deletes the abstract Workload base class. |
| lighthouse/workload/runner.py | Deletes old workload runner utilities in favor of new execution runner. |
| lighthouse/schedule/xegpu/mlp_schedule.py | Adds a new parameter presence assertion in schedule generation. |
| lighthouse/execution/runner.py | Adds new low-level execution/benchmark runner with optional memory management + callbacks. |
| lighthouse/execution/memory_manager.py | Adjusts ShardMemoryManager.get_input_buffers signature/validation. |
| lighthouse/execution/__init__.py | Updates exports for the new execution package API. |
| examples/xegpu/xegpu_workload.py | Removes example-specific XeGPUWorkload base class. |
| examples/xegpu/tune_matmul_gridsearch.py | Migrates tuning to new benchmarking/execution helpers. |
| examples/xegpu/mlp.py | Migrates MLP example to new execution API and adds local correctness checking + verbosity flag. |
| examples/xegpu/matmul.py | Migrates matmul example to new execution API and adds local execute/check + run_benchmark helpers. |
| examples/xegpu/lit.local.cfg | Removes exclusion for deleted xegpu_workload.py. |
| examples/workload/example_mlir.py | Removes the old workload-based MLIR allocation example. |
| examples/feed-forward-mpi/feed-forward-mpi.py | Migrates distributed FF example to new execution API with callback-based correctness flow. |
| examples/execution/example.py | Updates CPU elementwise example to use new execution runner and performs correctness checking inline. |
| examples/cpu/x86/matmul.py | Migrates x86 matmul example to new execution runner API and adds inline correctness checking. |
Comments suppressed due to low confidence (1)
lighthouse/execution/memory_manager.py:245
`get_input_buffers` now gives required parameters default `None` values and then enforces them via `assert`. This makes the signature misleading, and the checks can be skipped with `python -O`. Prefer keeping `kinds_and_ranks`/`elem_type` required (no defaults) or raising `ValueError` when they are missing.
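A minimal sketch of the suggested fix. The parameter names `kinds_and_ranks`/`elem_type` come from the comment above; the class body here is a hypothetical stand-in for the real `ShardMemoryManager`, shown only to illustrate the validation pattern:

```python
class ShardMemoryManager:
    """Hypothetical stand-in; only the validation pattern is the point."""

    def get_input_buffers(self, kinds_and_ranks, elem_type):
        # Keep the parameters required (no defaults) and raise ValueError
        # instead of asserting, so validation survives `python -O`.
        if kinds_and_ranks is None or elem_type is None:
            raise ValueError("kinds_and_ranks and elem_type are required")
        # Placeholder body: the real method would allocate shard buffers.
        return [(kind, rank, elem_type) for kind, rank in kinds_and_ranks]
```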
rengolin
left a comment
So, I don't mind it being a bunch of helpers for now. As you say, this needs to evolve organically.
Just checking: the idea is that the "lower_payload" would be either the compiler driver or the tuner driver (so removed from here later) and then this would become a class of its own that both drivers would use to run/benchmark stuff, right?
```diff
  # CHECK: Throughput:
  """
- Workload example: Element-wise sum of two (M, N) float32 arrays on CPU.
+ Kernel execution example: Element-wise sum of two (M, N) float32 arrays on CPU.
```
Do we need a slightly different example than the two existing matmul / mlp ones (x86, xe)?
- `Workload` class is removed. Examples continue to use a local class for convenience.
- `execute` and `benchmark` now take low-level abstractions as input.
- Added a `MemoryManager` class type to handle allocations.
- Added a `callback(memory_manager, inputs)` function that can be used to copy device buffers to host, for example.
- `lighthouse/workload` -> `lighthouse/execution`
- `examples/workload` -> `examples/execution`

The execute/benchmark interface is not very nice at the moment, but we can clean it up once we have more concrete use cases, for example, by binding some of the arguments together.
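To make the `callback(memory_manager, inputs)` shape concrete, here is a sketch of a correctness callback for the element-wise sum example. Only the callback signature comes from the PR description; `FakeMemoryManager` and its `to_host` method are hypothetical stand-ins for the real memory managers:

```python
import numpy as np

class FakeMemoryManager:
    """Hypothetical stand-in for GPUMemoryManager/ShardMemoryManager."""

    def to_host(self, buf):
        # A real manager would copy a device buffer back to host memory;
        # here host NumPy arrays pass through unchanged.
        return np.asarray(buf)

def correctness_callback(memory_manager, inputs):
    # Callback shape from the PR description: (memory_manager, inputs).
    # Copy buffers to host and check the output against a NumPy reference.
    a, b, out = (memory_manager.to_host(x) for x in inputs)
    assert np.allclose(out, a + b), "element-wise sum mismatch"
```

The runner would invoke such a callback after execution, keeping device/host transfer logic out of the examples themselves.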