Commit e218f5b

feat: Add Pi0.5 inference support with PaliGemmaWithExpert architecture
Core Model Implementation:
- Add PI0_5_Policy model with 32-dimensional action space support
- Implement discrete state input processing for robotics tasks
- Add PaliGemmaWithExpert backbone with AdaRMSNorm for flow matching
- Support 16-step action prediction with temporal modeling

Inference Configuration:
- Add comprehensive inference configuration templates
- Support both high-level and detailed inference configs
- Include tokenizer and action dimension parameters
- Maintain compatibility with existing Pi0 inference patterns

Technical Details:
- Extended Pi0 architecture for expert-enhanced multimodal reasoning
- Flow matching timestep injection with adaRMS normalization
- Vision-language-action model with discrete state integration

Code Quality:
- Applied Black code formatting
- Fixed trailing whitespace and line endings
- Ensured isort import organization compliance
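
The commit message mentions AdaRMSNorm with flow matching timestep injection but the diff shown here does not include the model code. As a rough illustration only, the sketch below shows one common way an adaptive RMSNorm can be conditioned on a timestep embedding: the conditioning vector is projected to a scale, a shift and a gate, and the scale and shift modulate the normalised activations. The function names, the three-way split and the pure-Python list representation are assumptions made for this example, not the code added in this commit.

```python
import math


def rms_norm(x, eps=1e-6):
    """Plain RMSNorm without a learned gain: x / sqrt(mean(x^2) + eps)."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms for v in x]


def linear(weight, bias, cond):
    """Dense layer; weight has shape [out][in], bias has shape [out]."""
    return [sum(w * c for w, c in zip(row, cond)) + b for row, b in zip(weight, bias)]


def ada_rms_norm(x, cond, weight, bias, eps=1e-6):
    """Adaptive RMSNorm sketch (hypothetical layout).

    cond is a conditioning vector, for example a flow matching timestep
    embedding. It is projected to 3 * len(x) values that are split into
    scale, shift and gate. Returns the modulated activations and the gate,
    which a caller could use to scale a residual branch.
    """
    d = len(x)
    modulation = linear(weight, bias, cond)
    scale, shift, gate = modulation[:d], modulation[d:2 * d], modulation[2 * d:]
    normed = rms_norm(x, eps)
    out = [n * (1.0 + s) + b for n, s, b in zip(normed, scale, shift)]
    return out, gate


# Usage with made-up numbers: 2 features, 2 conditioning dimensions.
features = [3.0, 4.0]
timestep_embedding = [0.5, -0.5]
proj_weight = [[0.1, 0.0], [0.0, 0.1], [0.2, 0.0], [0.0, 0.2], [0.0, 0.0], [0.0, 0.0]]
proj_bias = [0.0] * 6
modulated, gate = ada_rms_norm(features, timestep_embedding, proj_weight, proj_bias)
```

With an all-zero projection the scale and shift vanish and the function reduces to plain RMSNorm, which is a convenient sanity check for any implementation of this pattern.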
1 parent da4935e commit e218f5b

4 files changed: +1060 -1 lines changed

examples/pi0_5/conf/inference.yaml

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+defaults:
+  - _self_
+  - inference: pi0_5
+
+experiment:
+  exp_name: PI05_Inference
+  seed: 42
+  exp_dir: outputs/${experiment.exp_name}
+  ckpt_format: torch
+  task:
+    type: inference
+    backend: pi0_5
+    entrypoint: flagscale/inference/inference_engine.py
+  runner:
+    per_node_task: false
+    no_shared_fs: false
+    rdzv_backend: static
+    hostfile: null
+  cmds:
+    before_start: echo "Starting PI05 Inference"
+  envs:
+    LOGLEVEL: "INFO"
+    CUDA_VISIBLE_DEVICES: "0"
+    OTEL_SDK_DISABLED: true
+
+action: run
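
The `exp_dir` entry above uses `${experiment.exp_name}` interpolation, and the `defaults` list looks like a Hydra-style config, so the path is presumably resolved by the config library at load time. The sketch below only illustrates what that resolution produces. `resolve` and `lookup` are hypothetical stand-in helpers written for this example using the standard library, not the resolver the project actually uses.

```python
import re

# Matches ${some.dotted.path}
_INTERPOLATION = re.compile(r"\$\{([^}]+)\}")


def lookup(root, dotted_path):
    """Walk a nested dict along a dotted path such as 'experiment.exp_name'."""
    node = root
    for part in dotted_path.split("."):
        node = node[part]
    return node


def resolve(value, root):
    """Recursively replace ${a.b} references in strings with values from root."""
    if isinstance(value, dict):
        return {k: resolve(v, root) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve(v, root) for v in value]
    if isinstance(value, str):
        return _INTERPOLATION.sub(
            lambda m: str(resolve(lookup(root, m.group(1)), root)), value
        )
    return value


# Values copied from the experiment section of the config above.
cfg = {
    "experiment": {
        "exp_name": "PI05_Inference",
        "seed": 42,
        "exp_dir": "outputs/${experiment.exp_name}",
    }
}
resolved = resolve(cfg, cfg)
print(resolved["experiment"]["exp_dir"])  # outputs/PI05_Inference
```

Because the lookup result is itself resolved, a value that refers to `exp_dir` (for example a results path built from `${experiment.exp_dir}`) would pick up the experiment name as well.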
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+engine:
+  device: cuda
+  loader: pi0_5
+  model: /share/pi0_5/pi05_base
+  output_format: image
+  results_path: ${experiment.exp_dir}/results
+  stat_path: /share/pi05_dataset/stats.json
+  tokenizer: /share/paligemma-3b-pt-224
+  torch_dtype: float32
+  transformations:
+    LogIOTransformation:
+      log_level: info
+    StateScopeTransformation: {}
+experiment:
+  exp_dir: outputs/${experiment.exp_name}
+  exp_name: pi0_5_inference
+  seed: 42
+generate:
+  action_dim: 32
+  action_steps: 16
+  batch_size: 2
+  discrete_state_input: true
+  images_keys:
+  - observation.images.camera0
+  - observation.images.camera1
+  - observation.images.camera2
+  images_shape:
+  - 3
+  - 480
+  - 640
+  instruction:
+    task:
+    - retro serie of different cars with different colors and shapes, mdjrny-v4 style
+    - retro serie of different cars with different colors and shapes
+    tokenizer_max_length: 200
+  pi05_mode: true
+  state_key: observation.state
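
The `discrete_state_input: true` setting above suggests that the robot state is passed to the model as discrete tokens rather than as a continuous vector. The diff does not show how this is done, so the following is only a sketch of one plausible scheme: each state value, assumed to be normalised to [-1, 1], is mapped to one of 256 uniform bins, and the bin ids are written into the text prompt. The bin count, the clamping of out-of-range values, the prompt layout and both function names are assumptions made for illustration.

```python
from bisect import bisect_right

NUM_BINS = 256  # assumption: not stated anywhere in this commit


def discretize_state(state, num_bins=NUM_BINS):
    """Map each value, assumed normalised to [-1, 1], to an integer bin id.

    Bins are uniform over [-1, 1]; values outside that range are clamped
    to the first or last bin.
    """
    width = 2.0 / num_bins
    left_edges = [-1.0 + i * width for i in range(num_bins)]
    return [
        min(max(bisect_right(left_edges, v) - 1, 0), num_bins - 1) for v in state
    ]


def build_prompt(task, state):
    """Hypothetical prompt layout: task text followed by the state bin ids."""
    bins = " ".join(str(b) for b in discretize_state(state))
    return f"Task: {task.strip().lower()}, State: {bins};\nAction: "


print(discretize_state([-1.0, 0.0, 0.5, 1.0]))  # [0, 128, 192, 255]
print(build_prompt("Pick up the cup", [0.0, 0.5]))
```

Under this scheme a 32-dimensional state adds at least 32 numbers to the prompt, which may be one reason the config reserves a `tokenizer_max_length` of 200, though the commit does not say so.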
