- `mesh_pos`: Full temporal trajectory (no displacement reconstruction needed)
- `thickness`: Per-node features
- `edges`: Pre-computed edge connectivity (no edge rebuilding during training)

**NOTE:** All heavy preprocessing (node filtering, edge building, thickness computation) is done once during curation using PhysicsNeMo-Curator. The reader simply loads pre-computed arrays.
This format is directly compatible with the Zarr reader in this example.
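
For a quick sanity check of a curated store, the arrays can be inspected with the `zarr` Python package; a minimal sketch, assuming a hypothetical run named `data/run_0001.zarr`:

```python
# Inspect a curated Zarr store (the run path is hypothetical).
import zarr

store = zarr.open("data/run_0001.zarr", mode="r")
print(store["mesh_pos"].shape)   # (T, N, 3) temporal node positions
print(store["edges"].shape)      # (E, 2) pre-computed edge connectivity
print(store["thickness"].shape)  # (N,) or (N, K) per-node features
```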
## Training
Training is managed via Hydra configurations located in `conf/`.
### Built‑in VTP reader (PolyData)
A lightweight VTP reader is provided in `vtp_reader.py`. It treats each `.vtp` file in a directory as a separate run and expects point displacements to be stored as vector arrays in `poly.point_data` with names like `displacement_t0.000`, `displacement_t0.005`, … (a more permissive fallback of any `displacement_t*` is also supported). The reader:
- loads the reference coordinates from `poly.points`
- builds absolute positions per timestep as `[t0: coords, t>0: coords + displacement_t]`
- extracts cell connectivity from the PolyData faces and converts it to unique edges
- extracts all point data fields dynamically (e.g., thickness, modulus)
- returns `(srcs, dsts, point_data)` where `point_data` contains `'coords': [T, N, 3]` and all feature arrays

The VTP reader dynamically extracts all non-displacement point data fields from the VTP file and makes them available to the datapipe. If your `.vtp` files include additional per‑point arrays (e.g., thickness or modulus), simply add their names to the `features` list in your datapipe config.
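
The snippet below is a minimal sketch of producing a `.vtp` file in the layout the reader expects, using pyvista with synthetic geometry; the file name and feature values are made up, and only the `displacement_t*` naming convention comes from the description above:

```python
# Write a tiny synthetic .vtp in the layout the VTP reader expects.
import numpy as np
import pyvista as pv

points = np.random.rand(4, 3)                   # N = 4 reference coordinates
faces = np.array([3, 0, 1, 2, 3, 1, 2, 3])      # two triangles in VTK face format
poly = pv.PolyData(points, faces)

poly.point_data["thickness"] = np.full(4, 1.5)  # optional per-point feature
for i, t in enumerate([0.000, 0.005, 0.010]):
    # One 3-component displacement array per timestep.
    poly.point_data[f"displacement_t{t:.3f}"] = 0.01 * i * np.ones((4, 3))

poly.save("run_0001.vtp")                       # hypothetical output name
```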
Example Hydra configuration for the VTP reader:
```yaml
defaults:
- reader: vtp
```
And configure features in `conf/datapipe/point_cloud.yaml` or `conf/datapipe/graph.yaml`:
```yaml
features: [thickness] # or [] for no features
```
### Built‑in Zarr reader
A Zarr reader is provided in `zarr_reader.py`. It reads pre-processed Zarr stores created by PhysicsNeMo-Curator, where all heavy computation (node filtering, edge building, thickness computation) has already been done during the ETL pipeline. The reader:
- loads pre-computed temporal positions directly from `mesh_pos` (no displacement reconstruction)
- loads pre-computed edges (no connectivity-to-edge conversion needed)
- dynamically extracts all point data fields (thickness, etc.) from the Zarr store
- returns `(srcs, dsts, point_data)`, similar to the VTP reader

Data layout expected by the Zarr reader:
- `<DATA_DIR>/*.zarr/` (each `.zarr` directory is treated as one run)
- Each Zarr store must contain:
  - `mesh_pos`: `[T, N, 3]` temporal positions
  - `edges`: `[E, 2]` pre-computed edge connectivity
  - Feature arrays (e.g., `thickness`): `[N]` or `[N, K]` per-node features
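
Curation is normally done by PhysicsNeMo-Curator, but as an illustrative sketch, a store with this layout can be written with the `zarr` package (the sizes and run path below are made up):

```python
# Write a synthetic Zarr store with the layout described above.
import numpy as np
import zarr

T, N, E = 10, 100, 300  # synthetic trajectory length, node count, edge count
zarr.save(
    "data/run_0001.zarr",  # hypothetical run path
    mesh_pos=np.random.rand(T, N, 3).astype(np.float32),
    edges=np.random.randint(0, N, size=(E, 2)),
    thickness=np.random.rand(N).astype(np.float32),
)
```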
365
+
366
+
Example Hydra configuration for the Zarr reader:
```yaml
# conf/reader/zarr.yaml
_target_: zarr_reader.Reader
```
Select it in `conf/config.yaml`:
```yaml
defaults:
- reader: zarr # Options are: vtp, d3plot, zarr
- datapipe: point_cloud # will be overridden by model configs
```

And configure features in `conf/datapipe/graph.yaml`:
```yaml
features: [thickness] # Must match fields stored in Zarr
```
**Recommended workflow:**
1. Use PhysicsNeMo-Curator to preprocess d3plot → VTP or Zarr once
2. Use the corresponding reader for all training/validation
3. Optionally use the d3plot reader for quick prototyping on raw data
### Data layout expected by readers
- d3plot reader (`d3plot_reader.py`):
- VTP reader (`vtp_reader.py`):
  - `<DATA_DIR>/*.vtp` (each `.vtp` is treated as one run)
  - Displacements stored as 3‑component arrays in `point_data` with names like `displacement_t0.000`, `displacement_t0.005`, ... (fallback accepts any `displacement_t*`).
- Zarr reader (`zarr_reader.py`):
  - `<DATA_DIR>/*.zarr/` (each `.zarr` directory is treated as one run)
  - Contains pre-computed `mesh_pos`, `edges`, and feature arrays
### Write your own reader
To write your own reader, implement a Hydra‑instantiable function or class whose call returns a three‑tuple `(srcs, dsts, point_data)`. The first two entries are lists of integer arrays describing edges per run (they can be empty lists if you are not producing a graph), and `point_data` is a list of Python dicts with one dict per run. Each dict must contain `'coords'` as a `[T, N, 3]` array and one array per feature name listed in `conf/datapipe/*.yaml` under `features`. Feature arrays can be `[N]` or `[N, K]` and should use the same node indexing as `'coords'`. For convenience, a simple class reader can accept the Hydra `split` argument (e.g., "train" or "test") and decide whether to save VTP frames, but this is optional.
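
A minimal sketch of such a reader, assuming runs are stored as hypothetical `.npz` files that already contain `coords`, `edges`, and `thickness` arrays (the module name, file layout, and call signature here are illustrative, not this repo's API):

```python
# my_reader.py -- illustrative custom reader; adapt names and layout to your data.
import glob
import os

import numpy as np


class Reader:
    def __init__(self, split: str = "train"):
        # Optional Hydra-provided split ("train" or "test").
        self.split = split

    def __call__(self, data_dir: str):
        srcs, dsts, point_data = [], [], []
        for path in sorted(glob.glob(os.path.join(data_dir, "*.npz"))):
            run = np.load(path)
            edges = run["edges"]                # [E, 2] pre-computed edges for this run
            srcs.append(edges[:, 0].astype(np.int64))
            dsts.append(edges[:, 1].astype(np.int64))
            point_data.append({
                "coords": run["coords"],        # [T, N, 3] absolute positions
                "thickness": run["thickness"],  # matches a name in `features`
            })
        return srcs, dsts, point_data
```

Such a reader could then be exposed to Hydra with a `conf/reader/*.yaml` entry whose `_target_` points at the class, mirroring the Zarr reader configuration above.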