
Use 1/0 labels for binary classification instead of 1/-1 #9

@benmccann


The loss function this library uses for binary classification is a hinge-style loss that assumes labels of +1 or -1:

case 1 =>
  1 - Math.signum(pred * label)

However, predictions are passed through a sigmoid, so they fall in the range 0 to 1:

case 1 =>
  1.0 / (1.0 + Math.exp(-pred))

The 1/0 convention implied by the predictions should be preferred over the 1/-1 labels the loss function expects, because spark.mllib represents the negative label as 0 rather than -1, for consistency with multiclass labeling.
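
To make the mismatch concrete, here is a minimal sketch that wraps the two quoted expressions in standalone functions. The object name `LabelMismatch` and the function names `signLoss` and `sigmoid` are hypothetical and are not taken from the library. It assumes `pred` is a raw score and that labels arrive as 0.0 for the negative class.

```scala
object LabelMismatch {
  // Sign-based loss as quoted in this issue; it assumes label is +1.0 or -1.0.
  def signLoss(pred: Double, label: Double): Double =
    1 - Math.signum(pred * label)

  // Sigmoid prediction as quoted in this issue; the output lies in (0, 1).
  def sigmoid(pred: Double): Double =
    1.0 / (1.0 + Math.exp(-pred))

  def main(args: Array[String]): Unit = {
    val score = -3.0 // raw score that strongly favours the negative class
    println(signLoss(score, -1.0)) // negative example encoded as -1
    println(signLoss(score, 0.0))  // same example encoded as 0
    println(signLoss(3.0, 0.0))    // opposite score, negative encoded as 0
  }
}
```

With a label of 0 the product `pred * label` is always zero, so the loss is the same constant whether the score favours the right class or the wrong one. Negative examples encoded as 0 therefore cannot be told apart by this loss.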

The loss function should be changed to accept 1/0 labels, in line with Spark's convention.
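
One possible shape for such a change is sketched below, assuming the loss receives a raw score and a label of 0.0 or 1.0. All names here (`ZeroOneLabelLoss`, `log1pExp`, `logLoss`, `signLoss`) are hypothetical. This is neither the library's code nor a copy of Spark's implementation. It shows two options: a logistic loss that matches the sigmoid prediction, and the existing sign-based loss with the label remapped to -1/+1 first.

```scala
object ZeroOneLabelLoss {
  // Numerically stable log(1 + exp(x)).
  def log1pExp(x: Double): Double =
    if (x > 0) x + Math.log1p(Math.exp(-x)) else Math.log1p(Math.exp(x))

  // Logistic (log) loss for a raw score and a label in {0.0, 1.0}.
  // label 1: -log(sigmoid(score)); label 0: -log(1 - sigmoid(score)).
  def logLoss(score: Double, label: Double): Double =
    if (label > 0) log1pExp(-score) else log1pExp(score)

  // Alternative: keep the sign-based loss, but map 0/1 to -1/+1 first.
  def signLoss(score: Double, label: Double): Double = {
    val signed = 2 * label - 1
    1 - Math.signum(score * signed)
  }
}
```

The first option pairs naturally with the sigmoid used for predictions. The second is the smaller change if the sign-based loss has to stay.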
