The dataset contains Wikipedia comments which have been labeled by human raters for toxic behavior.

TheSarang/Toxic-comments-classification

Toxic comments classification

This dataset consists of Wikipedia comments that have been labeled by human raters for toxic behavior. The types of toxicity are:

  • toxic
  • severe_toxic
  • obscene
  • threat
  • insult
  • identity_hate
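
As a rough illustration of how these labels might be handled in code, the sketch below treats each comment as a row of six binary flags and lists the flags that are set. It assumes the column names match the label names above and that several flags can be set on the same comment; the helper `active_labels` and the example row are hypothetical and not taken from the dataset.

```python
# Label names as listed above; assumed to match the CSV column names.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def active_labels(row):
    """Return the names of the labels whose flag is 1 in a row of binary flags."""
    return [name for name in LABELS if int(row.get(name, 0)) == 1]


# Made-up row for illustration only; values are strings, as a CSV reader would yield.
example_row = {
    "toxic": "1",
    "severe_toxic": "0",
    "obscene": "1",
    "threat": "0",
    "insult": "1",
    "identity_hate": "0",
}

print(active_labels(example_row))
```

Keeping the label names in one list makes it easy to reuse the same order when building feature matrices or submission columns.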

File descriptions

  • train.csv: the training set, containing comments with their binary labels.
  • test.csv: the test set; you must predict the toxicity probabilities for these comments. To deter hand labeling, the test set contains some comments that are not included in scoring.
  • sample_submission.csv: a sample submission file in the correct format.
  • test_labels.csv: labels for the test data; a value of -1 indicates that the comment was not used for scoring.
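
Since a value of -1 in test_labels.csv marks comments that were not scored, an evaluation script would probably want to drop those rows first. The following is a minimal sketch using only the standard library. It assumes the file has an `id` column followed by the six label columns; the helper `scored_rows` and the inline sample data are hypothetical stand-ins for the real file.

```python
import csv
import io

# Assumed label column names, matching the toxicity types listed earlier.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def scored_rows(lines):
    """Keep only the rows in which no label column holds the -1 marker."""
    reader = csv.DictReader(lines)
    return [row for row in reader if all(row[name] != "-1" for name in LABELS)]


# Made-up sample standing in for test_labels.csv; not real data.
SAMPLE = """id,toxic,severe_toxic,obscene,threat,insult,identity_hate
a1,-1,-1,-1,-1,-1,-1
b2,0,0,0,0,0,0
c3,1,0,1,0,0,0
"""

rows = scored_rows(io.StringIO(SAMPLE))
print([row["id"] for row in rows])
```

With the real file, the same function could be called on an open file handle, for example `scored_rows(open("test_labels.csv", newline=""))`.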
