In ASE training on locomotion datasets, I observe Disc_Agent_Acc staying at ~0.95-0.99 even when the visual rollouts look clean and episode length is at the cap. Is this expected behavior, given that the policy optimizes the MI objective rather than trying to fool the discriminator?
I trained an ASE model on just four motions: walk, run, jog, and jump. The result and log file are attached. In the visualization, the jump motion is always interrupted by the other locomotion skills. Would this count as mode collapse?
Thank you for your help!!!
log.txt