Alternative interface to produce many events with varying initial states

This is an idea that arose out of a lunch discussion with @mohller today, who is using Chromo in CRPROPA, and from the discussion around https://github.com/impy-project/chromo/pull/174, and it is also related to #65 

The current Chromo API is designed around the use-case where you generate lots of a events with a fixed inital state (same particles at same ecm). However, there are also use cases where the initial state varies a lot, for example, when you simulate interactions in an air shower:

```py
for beam_energy, beam_particle, target_particle in some_state_stack:
     model = Sibyll23d(FixedTarget(beam_energy, beam_particle, target_particle))  # A
     event = next(model(1)): # B
     # process event, C
```

Running Chromo in this use-case is still efficient if the time spend in the model the generate and process one event (B, C) is large compared to the setup time of the model (A). This likely won't be the case, however, if `beam_energy` is small.

We could technically speed up Chromo for this use-case with an alternative API that pushes this for loop into C or Fortran. It is not a simple change, but it is technically possible.

The call would then look like this:
```py
events = Sibyll23d.process(some_state_stack)
```
where some_state_stack is a numpy array representating a table with the input state beam_energy, pid1, pid2 (we could add a flag to also support inputs in the center-of-mass frame), and events is a awkward array that contains the event data for each input state. We need the awkward array data structure here because we have a variable number of outgoing particles per input particle and we want to keep the association between input and output particles. Awkward arrays are able to represent such a mapping, while normal numpy arrays cannot.

The potential speed-up here is two-fold.
1) We are pushing the for loop in the first code example to fast Fortran code. This means we don't pay a Python call overhead each time that the model setup is changed.
2) We provide the output in form of an efficient data structure that allows us to pass all events at once to next processing step. Awkward arrays are arbitrarily flexible, but specialise to a very simple data structure in this case in which the events are essentially packed into a contiguous table, while a second internal array tracks the indices when one event ends and the next event starts. This data structure is still almost a numpy array and can be processed very efficiently, for example in numba. The simpler alternative of returning a Python list of numpy arrays would be inefficient, because the memory is then not contiguous.

CC @jpivarski

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Alternative interface to produce many events with varying initial states #188

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Alternative interface to produce many events with varying initial states #188

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions