Requires: jupyter, pandas, and matplotlib
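A quick way to confirm the requirements are present is sketched below. It assumes each package is importable under the same name as listed above; the helper name missing_packages is hypothetical and not part of this repository.

```python
import importlib.util

# Package names taken from the requirements line above.
# Assumption: each one is importable under the same name.
REQUIRED = ("jupyter", "pandas", "matplotlib")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be located as importable modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are available.")
```

Any package reported as missing would need to be installed before opening the notebooks.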
The data/ directory can be provided on request.
When using SWAN, choose Analytix under the Spark cluster option when asked to set up a configuration.
To create the data/ directory, run the runall.sh script.
Before running the script, you may need to initialize a Kerberos ticket for Hadoop with the command kinit afsusername, replacing afsusername with your username.
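The two steps can also be driven from Python, for example from a notebook cell. This is only a sketch: it assumes runall.sh sits in the current working directory and can be run with bash, and the helper names setup_commands and run_setup are hypothetical.

```python
import subprocess


def setup_commands(username):
    """Build the two commands: get a Kerberos ticket, then run the script.

    Assumption: runall.sh is in the current directory and runs under bash.
    """
    return [["kinit", username], ["bash", "runall.sh"]]


def run_setup(username):
    """Run each command in order, stopping at the first failure.

    kinit prompts for a password on the terminal, so this needs an
    interactive session.
    """
    for command in setup_commands(username):
        subprocess.run(command, check=True)


# Example (replace afsusername with your username):
# run_setup("afsusername")
```

If a valid ticket already exists, the kinit step can be skipped and runall.sh run directly.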