Make hdf5_io more flexible, reusable and less verbose #4

@jacg

There are a few aspects to this. Their solutions are probably related.

Reduce DRY violations

Take

typedef struct {
  unsigned int event_id;
  double x;
  double y;
  double z;
  double t;
} hit_t;

This must be accompanied by boilerplate that is 100% deducible from the above:

HF::CompoundType create_hit_type() {
  return {{"event_id", HF::AtomicType<unsigned int>{}},
          {"x", HF::AtomicType<double>{}},
          {"y", HF::AtomicType<double>{}},
          {"z", HF::AtomicType<double>{}},
          {"t", HF::AtomicType<double>{}}};
}
HIGHFIVE_REGISTER_TYPE(hit_t, create_hit_type)

This should be generated automatically by machine rather than typed out by humans.
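
One low-tech way to get there is an X-macro: list the fields once and expand that list twice, once into the struct and once into the member description. The sketch below is only an illustration under stated assumptions. `HIT_FIELDS`, `field_info` and `hit_fields` are hypothetical names, and `field_info` is a plain stand-in so that the example compiles without HighFive. In real code the second expansion would presumably emit `{#name, HF::AtomicType<type>{}},` entries like the ones in `create_hit_type` above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Single source of truth: every field of hit_t is listed exactly once.
// (HIT_FIELDS is a hypothetical name, not part of hdf5_io or HighFive.)
#define HIT_FIELDS(X)        \
  X(unsigned int, event_id)  \
  X(double,       x)         \
  X(double,       y)         \
  X(double,       z)         \
  X(double,       t)

// Expansion 1: the struct itself, identical in layout to the hand-written one.
#define AS_MEMBER(type, name) type name;
typedef struct { HIT_FIELDS(AS_MEMBER) } hit_t;
#undef AS_MEMBER

// Stand-in for one member of a compound type (assumption: used here only so
// the example builds without HighFive).
struct field_info {
  std::string name;  // member name, e.g. "event_id"
  std::string type;  // spelled-out C++ type, e.g. "unsigned int"
  std::size_t size;  // sizeof the member type
};

// Expansion 2: the member description, generated from the same list.
#define AS_INFO(type, name) {#name, #type, sizeof(type)},
inline std::vector<field_info> hit_fields() {
  return { HIT_FIELDS(AS_INFO) };
}
#undef AS_INFO
```

Adding a field then means touching one line of `HIT_FIELDS`. The cost is that the struct definition becomes less readable, so this is a stopgap rather than the C++-friendly interface discussed below.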

Make hdf5_io generic, reusable and C++-friendly.

The interface we should be aiming for looks very roughly like this:

class my_first { public: /*whatever I want*/ private: int foo; double bar; };
class my_other { ... double whatever; T something; };

using my_schema = hdf5_io<"A_Group", {"dataset1", my_first}, {"dataset2", my_other}>;

my_schema my_table;
my_table.open();

my_table.write<"dataset1">(my_first{1, 2.3});
my_table.write<"dataset1">(vector<my_first>{ ... });

my_table.write<"dataset2">(my_other{4.5, T{...}});
my_table.write<"dataset2">(vector<my_other>{ ... });

my_first one_first = my_table.read_one<"dataset1">();
vector<my_first> all_first = my_table.read<"dataset1">();

my_other one_other = my_table.read_one<"dataset2">();
vector<my_other> all_other = my_table.read<"dataset2">();

That is,

  • We should define the functionality once, and make it generic over arbitrary
    groups containing arbitrary datasets.

  • Let the user write and manipulate C++ classes: we should take care, behind the
    scenes, of creation of any C-structs that may be needed in order to talk to
    HDF5. (I'm not 100% convinced that the C++ classes can't be used directly and
    that C-structs are needed, but if they are, the user shouldn't notice.)

The biggest problem will probably be finding a usable syntax for the
compile-time specification (so, probably template arguments) of the schema and a
way to do the necessary metaprogramming without reducing the Bus Number to zero
(and increasing compile times to infinity).
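
To make the syntax question concrete, here is a minimal sketch of one candidate: use small tag types instead of string literals as template arguments. This works without C++20; with C++20 class-type template parameters the `write<"dataset1">` spelling above might become feasible, but that is not attempted here. Everything in the sketch is hypothetical (`hdf5_io_sketch`, the tag structs, `path`), rows are kept in memory rather than written to HDF5, `my_first` and `my_other` are simplified to plain aggregates with `int` standing in for `T`, and `read_one` is assumed to mean "the first row".

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Simplified row types (assumption: public aggregates, int in place of T).
struct my_first { int foo; double bar; };
struct my_other { double whatever; int something; };

// Hypothetical dataset tags: each ties a dataset name to its row type.
struct dataset1 {
  using row_type = my_first;
  static const char* name() { return "dataset1"; }
};
struct dataset2 {
  using row_type = my_other;
  static const char* name() { return "dataset2"; }
};

// Generic over an arbitrary list of datasets; defined once, reused per schema.
template <class... Datasets>
class hdf5_io_sketch {
  // One in-memory column per dataset tag. Wrapping in column<D> keeps the
  // tuple element types unique even if two datasets share a row type.
  template <class D>
  struct column { std::vector<typename D::row_type> rows; };

 public:
  explicit hdf5_io_sketch(std::string group) : group_(std::move(group)) {}

  // Append a single row to dataset D.
  template <class D>
  void write(const typename D::row_type& row) {
    std::get<column<D>>(store_).rows.push_back(row);
  }

  // Append many rows to dataset D.
  template <class D>
  void write(const std::vector<typename D::row_type>& more) {
    auto& rows = std::get<column<D>>(store_).rows;
    rows.insert(rows.end(), more.begin(), more.end());
  }

  // All rows written so far to dataset D.
  template <class D>
  const std::vector<typename D::row_type>& read() const {
    return std::get<column<D>>(store_).rows;
  }

  // First row of dataset D; throws std::out_of_range if D is empty.
  template <class D>
  typename D::row_type read_one() const { return read<D>().at(0); }

  // Where dataset D would live inside the file, e.g. "A_Group/dataset1".
  template <class D>
  std::string path() const { return group_ + "/" + D::name(); }

 private:
  std::string group_;
  std::tuple<column<Datasets>...> store_;
};
```

Usage then reads `hdf5_io_sketch<dataset1, dataset2> table("A_Group"); table.write<dataset1>(my_first{1, 2.3});`. Asking for a dataset that is not part of the schema fails at compile time, and the metaprogramming stays limited to one parameter pack and `std::get` by type, which may help with the Bus Number concern.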
