Hi,
I am new to ML but have been researching MuZero. Thanks to your article, I found SIMPLE. My question is: is it possible to create a policy network that takes into account that both players will choose a move? Because the next state depends on both players' choices, the policy network would have to account for the opponent's choice, and there might be a dependency between the two.
I am not sure whether this is supported out of the box or whether there is any research on it.
Thanks