FsGrid & fieldsolver refactoring #1099

cscjlan · 2025-02-17T11:33:52Z

FsGrid

Vlasiator has been updated to use the refactored FsGrid. The change is visible everywhere FsGrid was used. Some notable differences are discussed below.

First, there's only a single FsGrid object and it's called fsgrid. It no longer stores any data. The data that used to be stored by the different grids are now owned by a structure called FsData, one for each ex-grid. The data is not passed around by passing the FsData structures, but rather by passing an std::span. You can get an std::span from FsData with the view() method: auto technicalSpan = technicalData.view();

updateGhostCells now takes in an std::span, so the call is fsgrid.updateGhostCells(technical) instead of technicalGrid.updateGhostCells().

Data is no longer accessed through the grid.get(i, j, k) method, but rather by constructing an FsStencil, and using it to compute a 1D index from a 3D index:

auto stencil = fsgrid.makeStencil(i, j, k);
// Get the center element ('ooo'), i.e. the one corresponding to i, j, k
const auto center = technical[stencil.ooo()];

// The element in the -x direction from the center, i.e. i - 1, j, k
const auto x_left = technical[stencil.moo()];

This has the benefit that some values that are constant throughout the simulation can be computed once and stored in the fsgrid object. A single FsStencil is constructed for each iteration of a loop (i.e. one for each i, j, k). The constant values can be accessed by the FsStencil when it computes the 1D value from the 3D index it was constructed with.
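As a rough illustration of how such a stencil can turn a 3D index into a 1D one, consider the sketch below. It is not the FsStencil implementation: StrideSketch and StencilSketch are hypothetical, the x-fastest memory layout and the ghost layer handling are assumptions, and poo() is assumed by analogy with moo().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical constants that stay fixed for the whole run, i.e. the kind of
// values that can be computed once and stored in the grid object.
struct StrideSketch {
   std::int64_t nx, ny, nz; // local cells per dimension, without ghost cells
   std::int64_t ghost;      // ghost layer width on each side

   std::int64_t strideY() const { return nx + 2 * ghost; }
   std::int64_t strideZ() const { return strideY() * (ny + 2 * ghost); }
};

// Hypothetical stencil: built once per (i, j, k), then queried for indices.
class StencilSketch {
public:
   StencilSketch(const StrideSketch& strides, std::int64_t i, std::int64_t j, std::int64_t k)
       : s(strides), ci(i), cj(j), ck(k) {}

   // Centre cell ('ooo').
   std::size_t ooo() const { return index(0, 0, 0); }
   // One step in -x ('moo').
   std::size_t moo() const { return index(-1, 0, 0); }
   // One step in +x ('poo'); the name is assumed by analogy with moo.
   std::size_t poo() const { return index(1, 0, 0); }

private:
   // Shifts by the ghost width so that ghost cells get valid, non-negative indices.
   std::size_t index(std::int64_t di, std::int64_t dj, std::int64_t dk) const {
      const std::int64_t x = ci + di + s.ghost;
      const std::int64_t y = cj + dj + s.ghost;
      const std::int64_t z = ck + dk + s.ghost;
      assert(x >= 0 && x < s.nx + 2 * s.ghost);
      assert(y >= 0 && y < s.ny + 2 * s.ghost);
      assert(z >= 0 && z < s.nz + 2 * s.ghost);
      return static_cast<std::size_t>(x + y * s.strideY() + z * s.strideZ());
   }

   StrideSketch s;
   std::int64_t ci, cj, ck;
};
```

With this layout, neighbours in x differ by one in the 1D index, while neighbours in y and z differ by the corresponding stride.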

Fieldsolver

Fieldsolver has been refactored to use the updated FsGrid. Some files have undergone heavier refactoring (ldz_electric_field.cpp), some lighter. The underlying reason for all of the refactoring is to make it easier to change the fieldsolver to use the updated FsGrid: with less code repetition there's less work to be done (and thus fewer bugs) when changing the code to use the stencil and the spans. An additional benefit was a speedup, especially in the electric field computation.

derivatives.cpp and ldz_volume.cpp use the newly added fsgrid.parallel_for, which is an architecture/processor-independent way of doing the desired computation. It takes in a lambda, which it calls for each local cell on the fsgrid.


ykempf · 2025-08-12T06:43:47Z

All right, I seem to have botched something(s). However, nominally the last 3 commits should have dealt with:

getting rid of the numerous variants of parallel_for_* loop interfaces
removing RNG inits for perturbed B
removing capturing gridSpacing by using the newly passed coordinates.

ykempf · 2025-08-12T06:46:02Z

Oh yeah forgot to push out the fsgrid commit. 🫠

backgroundfield/backgroundfield.cpp

This reverts commit 8596d7f.

vlasiator.cpp

fieldsolver/ldz_main.cpp

grid.cpp

ioread.cpp

iowrite.cpp

fieldsolver/ldz_magnetic_field.cpp

ykempf · 2025-08-20T08:20:16Z

The rabbit hole of #1161 convinced me that the testpackage diffs are very unlikely to have arisen from this PR. I will thus proceed with the remaining open things here.

(essentially reverting PR fmihpc#1110)

iowrite.cpp

ursg

Incomplete review so far, I'll go through some more later. But functional approval, of course.
I'm not fully happy with the style in a number of places, but those are things that can... settle over time.

ursg · 2025-09-11T10:56:54Z

backgroundfield/backgroundfield.cpp

+   fsgrid.parallel_for([](int timerId) -> phiprof::Timer { return phiprof::Timer{timerId}; },
+   phiprof::initializeTimer("setBackgroundFieldToZero"), technical,
+   [=](const fsgrid::Coordinates &coordinates, const fsgrid::FsStencil& stencil, cuint sysBoundaryFlag, cuint sysBoundaryLayer) {
+      for (size_t i = 0; i < bgb[stencil.ooo()].size(); i++) {
+         bgb[stencil.ooo()][i] = 0.0;


This parallel for has inconsistent indentation with the others above.

ursg · 2025-09-11T11:03:21Z

datareduction/datareducer.cpp

+      if (P::systemWriteAllDROs || lowercase == "fg_b" ||
+          lowercase == "b") { // Bulk magnetic field at Yee-Lattice locations


I'm completely ok with most of your autoreformattings in this PR, but breaking all these lines in the datareducer file actually makes the whole source file harder to read. Is this something you can semi-easily revert by hand, or should I do that?

These were all done automatically by Juhana's software, according to the repo's style file. In things I did manually, I tried to stick to some sort of "what was done so far in the team", but I did not revert Juhana's automatic formatting. I can provide the style file I came up with when trying to improve it. I would really strongly suggest from this side of the bay that pliiis, deploy some CI tool doing all formatting for everyone, or at least a style file that works for all. However I stumbled upon options that changed in recentish versions of clang formatting, and/or git-clang-format was not available on major platforms we use (LUMI), so I think the more robust way would be a CI pre-commit or similar feature holding the ground truth and making everything consistent. At the price of having to create that ground truth file...

As @ykempf mentioned in a comment down the line, many of these could be "reverted" quite easily by increasing the allowed line width.

"They're more like guidelines anyway".

I'm all in favour of allowing for longer lines here, because the readability actually benefits from the better formatting in many cases.

So! Set up some way or other for this to be rolled out automatically, fix the formatting options to your desire, and off we go! Some options allow or disallow e.g. one parameter per line when breaking, but also some options were recent enough not to be available on all platforms, so I suppose having one central reference instance will be easiest to maintain (I tried installing git-clang-format on some machine without having to build the whole of LLVM and failed with that...).

ursg · 2025-09-11T11:07:14Z

datareduction/datareducer.cpp

+            outputReducer->addOperator(new DRO::DataReductionOperatorPopulations<Real>(
+                pop + "/vg_rho", i, offsetof(spatial_cell::Population, RHO), 1));
+            outputReducer->addMetadata(outputReducer->size() - 1, "1/m^3", "$\\mathrm{m}^{-3}$",
+                                       "$n_\\mathrm{" + pop + "}$", "1.0");


Likewise, breaking these unit lines just makes things unnecessarily hard to read.

ursg · 2025-09-11T11:08:46Z

datareduction/datareducer.cpp

+         outputReducer->addOperator(new DRO::DataReductionOperatorBVOLDerivatives("vg_derivatives/vg_dperbxvoldx",
+                                                                                  bvolderivatives::dPERBXVOLdx, 1));
+         outputReducer->addOperator(new DRO::DataReductionOperatorBVOLDerivatives("vg_derivatives/vg_dperbxvoldy",
+                                                                                  bvolderivatives::dPERBXVOLdy, 1));
+         outputReducer->addOperator(new DRO::DataReductionOperatorBVOLDerivatives("vg_derivatives/vg_dperbxvoldz",
+                                                                                  bvolderivatives::dPERBXVOLdz, 1));
+         outputReducer->addOperator(new DRO::DataReductionOperatorBVOLDerivatives("vg_derivatives/vg_dperbyvoldx",
+                                                                                  bvolderivatives::dPERBYVOLdx, 1));
+         outputReducer->addOperator(new DRO::DataReductionOperatorBVOLDerivatives("vg_derivatives/vg_dperbyvoldy",
+                                                                                  bvolderivatives::dPERBYVOLdy, 1));
+         outputReducer->addOperator(new DRO::DataReductionOperatorBVOLDerivatives("vg_derivatives/vg_dperbyvoldz",
+                                                                                  bvolderivatives::dPERBYVOLdz, 1));
+         outputReducer->addOperator(new DRO::DataReductionOperatorBVOLDerivatives("vg_derivatives/vg_dperbzvoldx",
+                                                                                  bvolderivatives::dPERBZVOLdx, 1));
+         outputReducer->addOperator(new DRO::DataReductionOperatorBVOLDerivatives("vg_derivatives/vg_dperbzvoldy",
+                                                                                  bvolderivatives::dPERBZVOLdy, 1));
+         outputReducer->addOperator(new DRO::DataReductionOperatorBVOLDerivatives("vg_derivatives/vg_dperbzvoldz",
+                                                                                  bvolderivatives::dPERBZVOLdz, 1));
+         outputReducer->addMetadata(outputReducer->size() - 9, "T/m", "$\\mathrm{T}\\,\\mathrm{m}^{-1}$",
+                                    "$\\Delta B_{X,\\mathrm{per,vol,vg}} (\\Delta X)^{-1}$", "1.0");
+         outputReducer->addMetadata(outputReducer->size() - 8, "T/m", "$\\mathrm{T}\\,\\mathrm{m}^{-1}$",
+                                    "$\\Delta B_{X,\\mathrm{per,vol,vg}} (\\Delta Y)^{-1}$", "1.0");
+         outputReducer->addMetadata(outputReducer->size() - 7, "T/m", "$\\mathrm{T}\\,\\mathrm{m}^{-1}$",
+                                    "$\\Delta B_{X,\\mathrm{per,vol,vg}} (\\Delta Z)^{-1}$", "1.0");
+         outputReducer->addMetadata(outputReducer->size() - 6, "T/m", "$\\mathrm{T}\\,\\mathrm{m}^{-1}$",
+                                    "$\\Delta B_{Y,\\mathrm{per,vol,vg}} (\\Delta X)^{-1}$", "1.0");
+         outputReducer->addMetadata(outputReducer->size() - 5, "T/m", "$\\mathrm{T}\\,\\mathrm{m}^{-1}$",
+                                    "$\\Delta B_{Y,\\mathrm{per,vol,vg}} (\\Delta Y)^{-1}$", "1.0");
+         outputReducer->addMetadata(outputReducer->size() - 4, "T/m", "$\\mathrm{T}\\,\\mathrm{m}^{-1}$",
+                                    "$\\Delta B_{Y,\\mathrm{per,vol,vg}} (\\Delta Z)^{-1}$", "1.0");
+         outputReducer->addMetadata(outputReducer->size() - 3, "T/m", "$\\mathrm{T}\\,\\mathrm{m}^{-1}$",
+                                    "$\\Delta B_{Z,\\mathrm{per,vol,vg}} (\\Delta X)^{-1}$", "1.0");
+         outputReducer->addMetadata(outputReducer->size() - 2, "T/m", "$\\mathrm{T}\\,\\mathrm{m}^{-1}$",
+                                    "$\\Delta B_{Z,\\mathrm{per,vol,vg}} (\\Delta Y)^{-1}$", "1.0");
+         outputReducer->addMetadata(outputReducer->size() - 1, "T/m", "$\\mathrm{T}\\,\\mathrm{m}^{-1}$",
+                                    "$\\Delta B_{Z,\\mathrm{per,vol,vg}} (\\Delta Z)^{-1}$", "1.0");


Also this is unnecessary, IMHO

ursg · 2025-09-11T11:11:08Z

datareduction/datareducer.cpp

+                   Real theta = acos(grid.nodes[n].x[2] / sqrt(grid.nodes[n].x[0] * grid.nodes[n].x[0] +
+                                                               grid.nodes[n].x[1] * grid.nodes[n].x[1] +
+                                                               grid.nodes[n].x[2] * grid.nodes[n].x[2])); // Latitude


Here's an exception, that I actually appreciate! :)

ursg · 2025-09-11T11:12:15Z

datareduction/datareducer.cpp

+                      Real deltaE =
+                          (particle_energy[p + 1] - particle_energy[p]) * 1e3 * physicalconstants::CHARGE; // dE in J
+                      numberFlux += grid.nodes[i].parameters[ionosphereParameters::RHON] *
+                                    sqrt(1. / (2. * M_PI * physicalconstants::MASS_ELECTRON)) * particle_energy[p] /
+                                    temp_keV / sqrt(temp_keV * 1e3 * physicalconstants::CHARGE) * deltaE *
+                                    exp(-energyparam); // Flux 1/m^2/s


ursg · 2025-09-11T11:27:41Z

fieldsolver/derivatives.cpp

-            dPsq = std::max((pow(myP[0] - otherP[0], 2) + pow(myP[1] - otherP[1], 2) + pow(myP[2] - otherP[2], 2)) / (2 * myRho * maxU), dPsq);
+            dPsq = std::max((pow(myP[0] - otherP[0], 2) + pow(myP[1] - otherP[1], 2) + pow(myP[2] - otherP[2], 2)) /
+                                (2 * myRho * maxU),
+                            dPsq);


Here's another particularly unpretty formatting example.

ursg · 2025-09-11T11:28:35Z

fieldsolver/fs_common.cpp

+      perturbedResult[Rec::a_yy] =
+          HALF * (der_i2j1k1[fsgrids::dperb::dPERBxdyy] + der_i1j1k1[fsgrids::dperb::dPERBxdyy]);
+      perturbedResult[Rec::a_zz] =
+          HALF * (der_i2j1k1[fsgrids::dperb::dPERBxdzz] + der_i1j1k1[fsgrids::dperb::dPERBxdzz]);
+      perturbedResult[Rec::a_yz] =
+          HALF * (der_i2j1k1[fsgrids::dperb::dPERBxdyz] + der_i1j1k1[fsgrids::dperb::dPERBxdyz]);
+      perturbedResult[Rec::a_xyy] = (der_i2j1k1[fsgrids::dperb::dPERBxdyy] - der_i1j1k1[fsgrids::dperb::dPERBxdyy]);
+      perturbedResult[Rec::a_xyz] = (der_i2j1k1[fsgrids::dperb::dPERBxdyz] - der_i1j1k1[fsgrids::dperb::dPERBxdyz]);
+      perturbedResult[Rec::a_xzz] = (der_i2j1k1[fsgrids::dperb::dPERBxdzz] - der_i1j1k1[fsgrids::dperb::dPERBxdzz]);
+
+      perturbedResult[Rec::b_xx] =
+          HALF * (der_i1j2k1[fsgrids::dperb::dPERBydxx] + der_i1j1k1[fsgrids::dperb::dPERBydxx]);
+      perturbedResult[Rec::b_xz] =
+          HALF * (der_i1j2k1[fsgrids::dperb::dPERBydxz] + der_i1j1k1[fsgrids::dperb::dPERBydxz]);
+      perturbedResult[Rec::b_zz] =
+          HALF * (der_i1j2k1[fsgrids::dperb::dPERBydzz] + der_i1j1k1[fsgrids::dperb::dPERBydzz]);
+      perturbedResult[Rec::b_xxy] = (der_i1j2k1[fsgrids::dperb::dPERBydxx] - der_i1j1k1[fsgrids::dperb::dPERBydxx]);
+      perturbedResult[Rec::b_xyz] = (der_i1j2k1[fsgrids::dperb::dPERBydxz] - der_i1j1k1[fsgrids::dperb::dPERBydxz]);
+      perturbedResult[Rec::b_yzz] = (der_i1j2k1[fsgrids::dperb::dPERBydzz] - der_i1j1k1[fsgrids::dperb::dPERBydzz]);
+
+      perturbedResult[Rec::c_xx] =
+          HALF * (der_i1j1k2[fsgrids::dperb::dPERBzdxx] + der_i1j1k1[fsgrids::dperb::dPERBzdxx]);
+      perturbedResult[Rec::c_xy] =
+          HALF * (der_i1j1k2[fsgrids::dperb::dPERBzdxy] + der_i1j1k1[fsgrids::dperb::dPERBzdxy]);
+      perturbedResult[Rec::c_yy] =
+          HALF * (der_i1j1k2[fsgrids::dperb::dPERBzdyy] + der_i1j1k1[fsgrids::dperb::dPERBzdyy]);
+      perturbedResult[Rec::c_xxz] = (der_i1j1k2[fsgrids::dperb::dPERBzdxx] - der_i1j1k1[fsgrids::dperb::dPERBzdxx]);
+      perturbedResult[Rec::c_xyz] = (der_i1j1k2[fsgrids::dperb::dPERBzdxy] - der_i1j1k1[fsgrids::dperb::dPERBzdxy]);
+      perturbedResult[Rec::c_yyz] = (der_i1j1k2[fsgrids::dperb::dPERBzdyy] - der_i1j1k1[fsgrids::dperb::dPERBzdyy]);


And here, only ~50% of the lines have been broken now.

(This is anyway code that probably nobody will ever read through with any intent or detail... so why isn't this just autogenerated anyway?)

This is inherited directly from Arto's hand-crafted first implementation out of the paper, I/we only changed the arrays, namespaces etc. in history but the indices are still as typed by Him in the olden days.

ursg · 2025-09-11T11:32:41Z

fieldsolver/fs_common.h

+bool propagateFields(fsgrids::perbspan perb,
+                     fsgrids::perbspan perbdt2,
+                     fsgrids::efieldspan e,
+                     fsgrids::efieldspan edt2,
+                     fsgrids::ehallspan ehall,
+                     fsgrids::egradpespan egradpe,
+                     fsgrids::egradpespan egradpedt2,
+                     fsgrids::momentsspan moments,
+                     fsgrids::momentsspan momentsdt2,
+                     fsgrids::dperbspan dperb,
+                     fsgrids::dmomentsspan dmoments,
+                     fsgrids::dmomentsspan dmomentsdt2,
+                     fsgrids::bgbspan bgb,
+                     fsgrids::volspan vol,
+                     fsgrids::technicalspan technical, FieldSolverGrid &fsgrid, SysBoundary& sysBoundaries,


Why aren't all of these spans just collected together into one proper "fsgrids" object by now? I mean, it's great that they live in their own namespace, but for all intents and purposes, they might as well be handed back and forth in a single pointer, if they were sharing a common struct.

(Also, the fieldsolver functions could then be methods of that object)

And now I saw further down that FsGridSolverData does exist. So how about passing that one here?

I think so far the idea was to pass the needed spans down the chain, which can transparently point to device or host memory in the end.

IIRC I kept the function signatures on the vlasiator side as close to the original as possible if it wasn't necessary to change them. But definitely one could just pass FieldSolverData here.

I would keep the functionality separate from the FieldSolverData object, i.e. not implement any methods for it. First of all, the FieldSolverData doesn't even own the data, it's just a wrapper over multiple std::spans -- i.e. pointers & lengths. Its original purpose was to simplify the tens of lambda expressions in datareducer.cpp and to make it easier to change the underlying objects, because they can be changed in one place.

Secondly, keeping data and functionality separated was one of the major points of the refactoring on the FsGrid side. This way e.g. the initialization of FsData and FieldSolverGrid can happen separately from each other and there are no partially initialized objects: "on this line the MPI initialization has happened for object A, but the data is garbage, but after these lines also the data has been initialized for object A". With separation, FieldSolverGrid is a fully functioning object after construction and likewise for each FsData (albeit they're full of zeroes).

This separation also simplifies e.g. testing of FsData: when it's implemented to work on the GPU, testing it can be done without any references to MPI/OpenMP, since it neither knows nor cares about those.

I see. Ok then, given what we have learned about memory handling in hybrid architectures, it makes a lot of sense to handle the spans individually and carefully.

ursg · 2025-09-11T11:43:05Z

fieldsolver/ldz_hall.cpp

Hooray, my least favourite source file got even messier now. :)

Show it some love, come on! I assume the hyperresistivity one will not be much different. :)

ykempf · 2025-09-11T12:34:12Z

By my eye, most if not all of the lines you disliked in terms of formatting are due to the max character width setting in the config file.

cscjlan added 30 commits February 13, 2025 15:13

Fix debug check

fcee8c0

clang-format the file to ease merging

595c5fb

Take reference to value, not some weird pointer stuff

d127e2b

Combine identical switch arms

005905a

Combine equal if checks; Remove unused parameter

d016b4e

Change pointer to ref and at to operator[]

7778665

clang-format the file to ease merging

3f32dd8

Make constexpr things constexpr

9e6531b

Get technical only once and use it multiple times

fde0f74

Change constrexpr to constexpr static

8b4ed43

Replace repetition with an array and a lambda

0873029

Change variable name; Cache boolean variable; Move repeting computati…

84d9140

…on to a lambda

Move y and z moments computation to lambda; Fix bug where y component…

1a1e382

… of dmoments::dPed was computed regardless of shouldCalculateMoments flag

Move computation to a lambda

8611c31

Add lambda for computing derivative; Add static arrays for perB indic…

384e174

…es; Cache bool check to a variable

Move computation to a lambda

87f0532

Move second derivative computation to a lambda

7de51ab

Compute moments for all components in a loop

1efe849

Move perB computation to a loop

3b6844b

Reduce repetition with a for loop

d3d92d7

Replace FsGrids with spans

ee9d037

Replace fsgrids with spans

04d4713

Replace fsgrids with spans

d53c703

Move lambda to a function; Change vol derivatives to use spans instea…

232143a

…d of fsgrid

Reduce repetition by using a lambda; Replace fsgrids with spans

dfc48a8

clang-format the file to ease merging

6089f25

Replace get with stencil indexing

d0ce5d2

clang-format the file to ease merging

b7fa0ab

Replace fsgrid getting with stencil based span indexing

2059321

clang-format the file to ease merging

89f6c0f

ykempf added 2 commits August 11, 2025 16:17

Removed random perturbation code in magnetic field as annoying for GP…

9eb17a3

…U porting and physically iffy.

Get gridSpacing inside lambdas instead of capturing.

543d01c

ykempf added 2 commits August 12, 2025 14:38

Removed old comments

baa63b1

Cleaned up deleted options from testpackage cfgs

d6fe47c