Six sigma rule generator is a PySpark tool that generates six sigma rules for DataFrame columns.
Background: https://www.isixsigma.com/tools-templates/control-charts/a-guide-to-control-charts/
The rule generator expects the target DataFrame to have a timestamp column.
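For intuition, here is a minimal plain-Python sketch of the control-chart idea behind a six sigma rule: take the mean of a column as the center line, place control limits three standard deviations either side, and flag values that fall outside them. This is not the tool's implementation. The helper names `control_limits` and `flag_out_of_control`, the `n_sigma` parameter, and the choice of population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def control_limits(values, n_sigma=3.0):
    """Return (lower, center, upper) control limits for a sequence of numbers.

    Hypothetical helper: center line is the mean, limits are +/- n_sigma
    population standard deviations around it.
    """
    center = mean(values)
    sigma = pstdev(values)
    return center - n_sigma * sigma, center, center + n_sigma * sigma


def flag_out_of_control(rows, n_sigma=3.0):
    """Flag values outside the control limits.

    rows: iterable of (timestamp, value) pairs.
    Returns a list of (timestamp, value, is_out_of_control), ordered by timestamp.
    """
    ordered = sorted(rows, key=lambda r: r[0])
    lower, _, upper = control_limits([v for _, v in ordered], n_sigma)
    return [(ts, v, not (lower <= v <= upper)) for ts, v in ordered]


if __name__ == "__main__":
    # 19 stable readings followed by one spike
    sample = [(i, 10.0) for i in range(19)] + [(19, 100.0)]
    for ts, value, out in flag_out_of_control(sample):
        if out:
            print(f"t={ts}: value {value} is outside the control limits")
```

The rows are sorted by timestamp because control charts are read in time order, which is presumably why a timestamp column is required. In practice the limits are often computed from a stable baseline period rather than from the data being checked.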
pip install -e .
python setup.py bdist
- Navigate to Clusters/[your cluster]/Libraries page:
- Click Install New button
- Select Python Egg from Library Type tab
- Drag & drop the generated .egg file from the cloned repository's dist directory to the window
- Click Install button
from wilson import SixSigma
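# Read the header row so columns such as 'timestamp' and 'target_column_1' are addressable by name
df = spark.read.csv('example.csv', header=True, inferSchema=True)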
sixsigma = SixSigma(timecol='timestamp')
df = sixsigma.apply(df, ['target_column_1'])
df.show()