What should we look at to verify convergence of ASE's low-level policy (LLP)?

When training ASE on locomotion datasets, I observe Disc_Agent_Acc staying at ~0.95-0.99 even when the visual rollouts look clean and episode length is at the cap. Is this expected, given that the policy's reward combines the discriminator (style) term with the MI (latent-encoder) term rather than only trying to fool the discriminator?
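To make the question concrete, here is a minimal sketch of how I understand the metric. It assumes Disc_Agent_Acc is the fraction of policy samples that the discriminator scores with a negative logit, and that the matching demo accuracy is the fraction of reference samples with a positive logit. The function name `disc_accuracies` and the logit values are hypothetical, written only for illustration, and are not taken from the ASE code.

```python
def disc_accuracies(agent_logits, demo_logits):
    """Return (agent_acc, demo_acc) under an ASSUMED sign convention:
    logit < 0 means "classified as policy sample",
    logit > 0 means "classified as reference motion"."""
    agent_acc = sum(1 for x in agent_logits if x < 0) / len(agent_logits)
    demo_acc = sum(1 for x in demo_logits if x > 0) / len(demo_logits)
    return agent_acc, demo_acc


# Made-up logits, for illustration only.
agent_acc, demo_acc = disc_accuracies(
    agent_logits=[-2.1, -0.4, 0.3, -1.7],
    demo_logits=[1.5, 0.2, -0.1, 2.4],
)
print(agent_acc, demo_acc)
```

Under this reading, a value near 1.0 only says that the discriminator can still separate policy transitions from reference transitions. It does not by itself say whether the rollouts look good.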

I trained an ASE model on just four motions: walk, run, jog, and jump. The result and log file are attached below. In the visualization, the jump motion is always interrupted by other locomotion skills. Would this count as mode collapse?
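One way I could quantify this is sketched below. It assumes each rollout from a randomly sampled latent has already been labeled with the reference motion it most resembles. That labeling step is hypothetical and not something I know ASE to provide. The helper `skill_coverage` and the example labels are made up for illustration.

```python
import math
from collections import Counter


def skill_coverage(labels, skills):
    """Return per-skill frequencies and the entropy of that distribution,
    normalized to [0, 1] by log(number of skills)."""
    counts = Counter(labels)
    n = len(labels)
    freqs = {s: counts.get(s, 0) / n for s in skills}
    entropy = -sum(p * math.log(p) for p in freqs.values() if p > 0)
    return freqs, entropy / math.log(len(skills))


skills = ["walk", "run", "jog", "jump"]
# Hypothetical labels for 12 rollouts; "jump" never appears as a full motion.
labels = ["walk"] * 5 + ["run"] * 4 + ["jog"] * 3
freqs, norm_entropy = skill_coverage(labels, skills)
print(freqs, norm_entropy)
```

A skill with frequency near zero, together with a normalized entropy well below 1, would point to the latent space not covering that motion.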

Thank you for your help!!!

<img width="350" height="577" alt="Image" src="https://github.com/user-attachments/assets/33f30e4b-5dae-49f9-b1a6-450534e6313b" />

[log.txt](https://github.com/user-attachments/files/27954118/log.txt)


