SE(3)-invariant model that predicts per-residue sidechain torsion distributions from backbone structure and sequence, trained on qFit multiconformer crystal structures.
| Symbol | Shape | Description |
|---|---|---|
| amino acid tokens, |
||
| backbone rotation matrix (to local frame, primary altloc) | ||
|
|
Local frames are constructed from
so that
The final single representation
with
Data. qFit multiconformer PDB files. Named altlocs (A, B,
Loss. Occupancy-weighted negative log-likelihood of the observed altloc
The training loss is this quantity averaged over the valid residue set
where:
- $\mathcal{R} = \lbrace i : n_{\mathrm{obs},i} \geq 2 \text{ and } d_{\mathrm{eff},i} \geq 1 \rbrace$ is the set of residues with at least two observed altlocs and at least one defined $\chi$ angle.
- $o_a$ is the crystallographic occupancy of altloc $a$, normalized so that $\sum_a o_a = 1$.
- $\pi_k$ and $\kappa_{dk}$ are the predicted mixture weight and concentration; $\chi_{dk} \equiv \mu_{kd}$ is the predicted von Mises mean direction of component $k$ on $\chi$ angle $d$; $\chi_{ad}$ is the observed angle of altloc $a$.
- $D$ is the number of defined $\chi$ angles for the residue (i.e. the per-$\chi$ validity mask restricts the product), and $K$ is the number of mixture components.
- $\frac{1}{2\pi I_0(\kappa)} \exp(\kappa \cos \Delta)$ is the von Mises density on the circle, with $I_0$ the modified Bessel function of the first kind, order $0$. The log-normalizer is evaluated stably as $\log I_0(\kappa) = \log\,\mathrm{i0e}(\kappa) + \kappa$ (`torch.special.i0e`), so it stays finite for large $\kappa$.
- The residual $\Delta = \chi_{ad}-\chi_{dk}$ is taken on the circle. For $\chi$ angles with $\pi$-rotational symmetry (ASP $\chi_2$, GLU $\chi_3$, PHE $\chi_2$, TYR $\chi_2$) it is folded to $(-\pi/2, \pi/2]$ via $\tfrac{1}{2}\,\mathrm{atan2}(\sin 2\Delta, \cos 2\Delta)$.
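Read together, the definitions above are consistent with a per-residue loss of the form

$$\ell_i = -\sum_a o_a \log \sum_{k=1}^{K} \pi_k \prod_{d=1}^{D} \frac{\exp\big(\kappa_{dk}\cos(\chi_{ad}-\chi_{dk})\big)}{2\pi I_0(\kappa_{dk})},$$

averaged over $i \in \mathcal{R}$. The following is a minimal PyTorch sketch of that reading for a single residue, not the project's actual implementation. The function names (`fold_residual`, `log_von_mises`, `residue_nll`), the argument names, and the tensor layouts (altlocs `A`, components `K`, angles `D`) are assumptions made for illustration; only `torch.special.i0e` and the folding formula come from the text.

```python
import math
import torch


def fold_residual(delta, symmetric):
    """Wrap residuals onto the circle; halve the period for pi-symmetric angles."""
    wrapped = torch.atan2(torch.sin(delta), torch.cos(delta))
    folded = 0.5 * torch.atan2(torch.sin(2 * delta), torch.cos(2 * delta))
    # `symmetric` is a (D,) bool mask and broadcasts over leading dimensions.
    return torch.where(symmetric, folded, wrapped)


def log_von_mises(delta, kappa):
    """Log density of a von Mises distribution at angular residual `delta`."""
    # log I0(kappa) = log i0e(kappa) + kappa avoids overflow for large kappa.
    log_i0 = torch.log(torch.special.i0e(kappa)) + kappa
    return kappa * torch.cos(delta) - math.log(2 * math.pi) - log_i0


def residue_nll(chi_obs, occ, log_pi, mu, kappa, chi_mask, symmetric):
    """Occupancy-weighted mixture NLL for one residue (assumed layout).

    chi_obs:   (A, D) observed chi angles per altloc, in radians
    occ:       (A,)   occupancies (renormalized here to sum to 1)
    log_pi:    (K,)   log mixture weights
    mu, kappa: (K, D) predicted means and concentrations
    chi_mask:  (D,)   bool, True where the chi angle is defined
    symmetric: (D,)   bool, True for pi-symmetric chi angles
    """
    occ = occ / occ.sum()
    delta = chi_obs[:, None, :] - mu[None, :, :]            # (A, K, D)
    delta = fold_residual(delta, symmetric)
    log_p = log_von_mises(delta, kappa[None, :, :])         # (A, K, D)
    # Undefined chi angles contribute nothing to the product over d.
    log_p = torch.where(chi_mask, log_p, torch.zeros_like(log_p)).sum(-1)
    log_mix = torch.logsumexp(log_pi[None, :] + log_p, dim=-1)  # (A,)
    return -(occ * log_mix).sum()


if __name__ == "__main__":
    # Toy residue: two altlocs, two components, two chi angles (made-up values).
    chi_obs = torch.tensor([[-1.0, 0.3], [3.0, 0.3 + math.pi]])
    occ = torch.tensor([0.6, 0.4])
    log_pi = torch.log(torch.tensor([0.5, 0.5]))
    mu = torch.tensor([[-1.0, 0.3], [3.0, 0.3]])
    kappa = torch.full((2, 2), 5.0)
    chi_mask = torch.tensor([True, True])
    symmetric = torch.tensor([False, True])
    print(residue_nll(chi_obs, occ, log_pi, mu, kappa, chi_mask, symmetric))
```

The mixture is combined with `logsumexp` in log space so that small component likelihoods do not underflow. A batched version would add a residue dimension and average over the residues in $\mathcal{R}$.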
Optimization. AdamW with
| Parameter | Symbol | Value |
|---|---|---|
| single representation | ||
| pair representation | ||
| IPA blocks | ||
| attention heads | ||
| head dim | ||
| query/key points | ||
| value points | ||
| von Mises components | ||
| chi dimensions | ||
| concentration floor | ||
| dropout | ||
| optimizer | --- | AdamW |
| learning rate | ||
| schedule | --- | cosine to |
| batch size | --- |