Make debugging easier #116

@mikapfl

The motivation

As a data curator, I want to run notebooks as IPython notebooks or Python files, using the debugging features of my IDE.

The problem

In the standalone-producer-repo pattern, neither the jupytext notebook in src/ nor the rendered notebooks in dist/ (available after running a book via notebook run) can be run as-is because they won't find their data.

The proposed solution

Stop guessing where the data is. Relative file paths in YAML files should always be resolved relative to the YAML file itself, not relative to some arbitrary other location.
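
To make the proposal concrete, here is a minimal sketch of the resolution rule, using only the standard library. The helper name `resolve_data_path` and the example paths are hypothetical and not part of bookshelf; the point is only that the result depends on the location of the YAML file, never on the current working directory.

```python
from __future__ import annotations

from pathlib import Path


def resolve_data_path(yaml_file: str | Path, data_path: str | Path) -> Path:
    """Resolve a path found in a YAML file against that file's directory.

    Hypothetical helper for illustration; not part of bookshelf.
    """
    data_path = Path(data_path)
    if data_path.is_absolute():
        # Absolute paths are already unambiguous, so leave them alone.
        return data_path
    # Anchor relative paths at the directory containing the YAML file,
    # independent of wherever the notebook happens to be executed.
    return (Path(yaml_file).resolve().parent / data_path).resolve()


if __name__ == "__main__":
    # Hypothetical layout: the YAML file sits next to a data/ directory.
    print(resolve_data_path("src/example.yaml", "data/input.csv"))
```

With a rule like this, the jupytext notebook in src/ and the rendered notebook in dist/ would find the same file, as long as both know where the YAML file is.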

Alternatives

Provide a notebook_dir parameter for download_file so that I can explicitly tell bookshelf to stop guessing by providing notebook_dir=".".
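
A rough sketch of how such an override could behave, assuming a stand-in function rather than the real download_file (whose actual signature is not shown here). The names `locate_input` and `_guess_notebook_dir` are hypothetical, and the guessing step is reduced to a placeholder.

```python
from __future__ import annotations

from pathlib import Path


def _guess_notebook_dir() -> Path:
    # Placeholder for whatever heuristic is used today (assumption:
    # here it simply falls back to the current working directory).
    return Path.cwd()


def locate_input(filename: str, notebook_dir: str | Path | None = None) -> Path:
    """Find an input file, skipping all guessing if notebook_dir is given."""
    if notebook_dir is not None:
        # The caller said explicitly where the notebook lives.
        base = Path(notebook_dir)
    else:
        base = _guess_notebook_dir()
    return (base / filename).resolve()


if __name__ == "__main__":
    # Explicit override: look next to the notebook being run right now.
    print(locate_input("input.csv", notebook_dir="."))
```

The explicit argument always wins, so running the notebook from an IDE debugger only requires passing notebook_dir=".".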

Additional context

Rant: papermill was a mistake, and bookshelf is therefore built on mistakes

Maybe I'm overreacting, but I think this points to the wider problem of "too much complexity", which I always suspect whenever people run IPython notebooks non-interactively. IPython was developed for data visualization and for running things interactively, and it is marginally better than plain Python at that. As soon as you want to run something in a pipeline, IMHO you should just turn it into Python functions which can be passed around, debugged, sliced and diced. I spent the better part of a day debugging and setting up something which effectively reads a CSV and writes out a slightly mutated CSV, together with some metadata which is read straight from a YAML file.

Labels: feature (Related to a (new) feature)