To tune thresholds, we need to see how confident the classifier was when it decided whether each statement was on topic.
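
As a rough illustration of why the confidence values matter, here is a minimal sketch of sweeping candidate thresholds once a score is available for each statement. The `Prediction` record, its field names, the `sweep_thresholds` helper, and the sample data are all hypothetical; the actual classifier output may look different.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Hypothetical record: one classified statement with its score."""
    statement: str
    confidence: float  # assumed: classifier's on-topic score in [0, 1]
    on_topic: bool     # assumed: human label used for tuning


def sweep_thresholds(predictions, thresholds):
    """Return a (threshold, precision, recall) tuple per candidate threshold."""
    results = []
    for t in thresholds:
        # A statement is predicted on topic when its score reaches the threshold.
        tp = sum(1 for p in predictions if p.confidence >= t and p.on_topic)
        fp = sum(1 for p in predictions if p.confidence >= t and not p.on_topic)
        fn = sum(1 for p in predictions if p.confidence < t and p.on_topic)
        # Convention chosen here: report 1.0 when the denominator is empty.
        precision = tp / (tp + fp) if (tp + fp) else 1.0
        recall = tp / (tp + fn) if (tp + fn) else 1.0
        results.append((t, precision, recall))
    return results


# Made-up data, for illustration only.
sample = [
    Prediction("a", 0.95, True),
    Prediction("b", 0.70, True),
    Prediction("c", 0.60, False),
    Prediction("d", 0.20, False),
]

for threshold, precision, recall in sweep_thresholds(sample, [0.5, 0.65, 0.9]):
    print(f"threshold={threshold:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Without the per-statement confidence, only the final on-topic / off-topic decision is visible, and a sweep like this is not possible.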