The branching algorithm here can get quite slow when there are large numbers of walkers. It can be made faster by reusing what we did in RynLib (specifically here). That code had to handle a number of other variants, though, such as branching because a walker's weight had become too high or too low, so the true core is in chop_weights, which starts from a partial sort obtained with argpartition and is then applied iteratively if necessary (it's almost never necessary to do more than one loop).
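To make the suggestion concrete, here is a rough sketch of the partial-sort idea in NumPy. It is not RynLib's actual chop_weights; the function name `branch_walkers`, the parameters `min_weight` and `max_iter`, and the rule of halving a donor's weight are all assumptions made for illustration. Each pass uses a single `np.argpartition` call to find the k lowest-weight and k highest-weight walkers without a full sort, then overwrites the low ones with split copies of the high ones.

```python
import numpy as np

def branch_walkers(coords, weights, min_weight, max_iter=10):
    """Hypothetical sketch: replace walkers whose weight fell below
    min_weight with split copies of the highest-weight walkers."""
    coords = np.array(coords, dtype=float, copy=True)
    weights = np.array(weights, dtype=float, copy=True)
    n = len(weights)
    for _ in range(max_iter):
        n_dead = int(np.count_nonzero(weights < min_weight))
        # Never use a below-threshold walker as a donor.
        k = min(n_dead, n - n_dead)
        if k == 0:
            break
        # Partial sort: the k smallest weights land in the first k slots,
        # the k largest in the last k slots (unordered within each group).
        order = np.argpartition(weights, [k - 1, n - k])
        low, high = order[:k], order[n - k:]
        # Split each donor in two: halve its weight, copy it over a low walker.
        # The weight carried by the removed walkers is discarded.
        weights[high] /= 2
        weights[low] = weights[high]
        coords[low] = coords[high]
        # A halved donor may itself drop below min_weight, hence the loop.
    return coords, weights
```

In this sketch the loop only runs a second time if halving a donor pushes it below the threshold, or if there were more low-weight walkers than donors, which fits the observation that more than one pass is rarely needed.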