In ram.py:
eps = (self.xp.random.normal(0, 1, size=m.data.shape)).astype(np.float32)
l = m.data + np.sqrt(self.var)*eps
ln_pi = -0.5 * F.sum((l-m)*(l-m), axis=1) / self.var #log(location policy)
According to jlindsey15/RAM#10, the sampled location l must not allow gradient flow. The code above, which computes ln_pi, does not seem to stop the gradient through l, so I think it may have an issue.
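To make the concern concrete, here is a minimal NumPy sketch (not the code from ram.py) that compares the gradient of ln_pi with respect to the mean m in two cases: when the sampled l is held fixed, and when l is allowed to move together with m. The helper names `ln_pi` and `grad_wrt_m`, the value of `var`, and the array shapes are assumptions made only for this illustration, and the gradient is estimated by finite differences rather than by a framework's autograd.

```python
import numpy as np


def ln_pi(l, m, var):
    # Gaussian log-density up to an additive constant, summed over coordinates.
    return -0.5 * np.sum((l - m) ** 2, axis=1) / var


def grad_wrt_m(f, m, h=1e-5):
    # Central finite-difference gradient of sum(f(m)) w.r.t. each entry of m.
    g = np.zeros_like(m)
    for idx in np.ndindex(*m.shape):
        m_plus, m_minus = m.copy(), m.copy()
        m_plus[idx] += h
        m_minus[idx] -= h
        g[idx] = (f(m_plus).sum() - f(m_minus).sum()) / (2 * h)
    return g


rng = np.random.default_rng(0)
var = 0.03                              # assumed variance, illustration only
m = rng.uniform(-1, 1, size=(4, 2))     # assumed batch of 4 two-dimensional means
eps = rng.normal(0, 1, size=m.shape)

# Case 1: l is sampled once and then treated as a constant (gradient stopped).
l_fixed = m + np.sqrt(var) * eps
g_stopped = grad_wrt_m(lambda mu: ln_pi(l_fixed, mu, var), m)

# Case 2: l is recomputed from the mean, so it shifts whenever the mean shifts.
g_flowing = grad_wrt_m(lambda mu: ln_pi(mu + np.sqrt(var) * eps, mu, var), m)

print("max |g_stopped - (l - m) / var|:", np.abs(g_stopped - (l_fixed - m) / var).max())
print("max |g_flowing|:", np.abs(g_flowing).max())
```

With l held fixed, the gradient matches (l - m) / var, which is the signal REINFORCE needs. If the gradient also flows through l, then l - m is just sqrt(var) * eps and does not depend on m, so the gradient vanishes. In the quoted code, l is built from m.data; as far as I understand Chainer, `.data` is the raw array rather than the Variable, which would put it in the first case, but that is worth confirming.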