Description
The motivation
As a data curator, I want to run notebooks as IPython notebooks or Python files, using the debugging features of my IDE.
The problem
In the standalone-producer-repo pattern, neither the jupytext notebook in src/ nor the rendered notebooks in dist/ (available after running a book via notebook run) can be run as-is because they won't find their data.
The proposed solution
Stop guessing so much about where the data is. Relative file paths in YAML files should always be resolved relative to the YAML file itself, not relative to some arbitrary other location.
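As a minimal sketch of what "relative to the YAML file" would mean (the function name is mine, not an existing bookshelf API):

```python
from pathlib import Path

def resolve_data_path(yaml_path, relative_path):
    """Resolve a data path referenced in a YAML file against the
    directory containing that YAML file, not the current working
    directory or any other guessed location."""
    return Path(yaml_path).parent / relative_path

# e.g. a YAML file at /books/my-book/book.yaml referencing data/input.csv
# would always point at /books/my-book/data/input.csv, regardless of
# where the notebook is executed from.
```

With this rule, both the jupytext notebook in src/ and the rendered notebook in dist/ would find the data the same way, because resolution no longer depends on the caller's working directory.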
Alternatives
Provide a notebook_dir parameter for download_file so that I can explicitly tell bookshelf to stop guessing by passing notebook_dir=".".
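A hypothetical sketch of how such a parameter might behave (this is not the current download_file signature, just an illustration of the proposed escape hatch):

```python
import os
from pathlib import Path

def download_file(name, notebook_dir=None):
    """Sketch of the proposed parameter: when notebook_dir is given,
    resolve the file against it explicitly and skip any guessing;
    otherwise fall back to some guessed base (here, the CWD, standing
    in for whatever heuristic the library currently uses)."""
    if notebook_dir is not None:
        base = Path(notebook_dir)
    else:
        base = Path(os.getcwd())  # placeholder for the guessing logic
    return base / name

# Usage: download_file("input.csv", notebook_dir=".") pins resolution
# to the notebook's own directory, making the notebook runnable as-is
# from an IDE or as a plain Python file.
```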
Additional context
Rant: papermill was a mistake, and bookshelf is therefore built on mistakes
Maybe I'm overreacting, but I think this points to the wider problem of "too much complexity", which I always suspect whenever people run IPython notebooks non-interactively. IPython was developed for data visualisation and interactive work, and it is marginally better than plain Python at that. As soon as you want to run something in a pipeline, IMHO you should just turn it into Python functions which can be passed around, debugged, sliced, and diced. I spent the better part of a day debugging and setting up something which effectively reads a CSV and writes out a slightly mutated CSV, together with some metadata read straight from a YAML file.