Add kp-detect-v23 submission #157
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Eval run succeeded! Link to run: link

Here are the results of the submission(s):

Kareem Elsamadicy (Independent Researcher)
Release date: 2026-05-23

I've committed detailed results of this detector's performance on the test set to this PR.

On the RAID dataset as a whole (aggregated across all generation models, domains, decoding strategies, repetition penalties, and adversarial attacks), it achieved an AUROC of 93.37 and a TPR of 89.76% at FPR=5% and 79.46% at FPR=1%.

If all looks well, a maintainer will come by soon to merge this PR and your entry/entries will appear on the leaderboard. If you need to make any changes, feel free to push new commits to this PR.

Thanks for submitting to RAID!
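
For readers unfamiliar with the "TPR at FPR" numbers above, here is a minimal sketch of how such a figure can be computed from raw detector scores. It is not RAID's evaluation code; the function name `tpr_at_fpr`, the tie-handling rule, and the toy scores are assumptions made for illustration only.

```python
def tpr_at_fpr(human_scores, machine_scores, target_fpr):
    """Fraction of machine texts flagged when at most `target_fpr`
    of human texts are allowed to be flagged (hypothetical helper)."""
    # Rank human-text scores from most to least "machine-like".
    ranked = sorted(human_scores, reverse=True)
    # Number of human texts we may misclassify at this FPR budget.
    allowed = int(target_fpr * len(ranked))
    if allowed >= len(ranked):
        threshold = float("-inf")  # budget covers every human text
    else:
        # Flag only scores strictly above this value, so at most
        # `allowed` human texts end up as false positives.
        threshold = ranked[allowed]
    flagged = sum(score > threshold for score in machine_scores)
    return flagged / len(machine_scores)


# Toy example with made-up scores (higher = more likely machine-written).
human = [0.1, 0.3, 0.6, 0.9]
machine = [0.95, 0.7, 0.5, 0.2]
print(tpr_at_fpr(human, machine, target_fpr=0.25))
```

The threshold is fixed using human-written texts only, and the reported TPR is the share of machine-generated texts scoring above it.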
Submission
Detector: kp-detect-v23
Author: Kareem Elsamadicy (Independent Researcher)
Contact: kelsamadicy@gmail.com
An ensemble of a transformer-based semantic classifier (DeBERTa-v3-base, 768-D mean-pooled embeddings + logistic regression) and an attack-feature gradient-boosting detector (41-D engineered features, 5-seed ensemble, global temperature calibration). Scores are routed by predicted attack type: the semantic classifier is weighted more heavily for paraphrase attacks, and the feature-based detector is used alone for character-manipulation attacks.
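
The routing rule described above could be sketched roughly as follows. This is an illustrative sketch, not the submission's code: the weight values, the attack-type labels, the default blend for other attack types, and the `route_score` name are all assumptions, since the description does not give them.

```python
# Hypothetical weights on the semantic classifier per predicted attack type.
# The submission states only the direction of the weighting, not the values.
SEMANTIC_WEIGHT = {
    "paraphrase": 0.8,  # assumed: semantic classifier weighted more heavily
    "character": 0.0,   # feature-based detector used alone
}
DEFAULT_SEMANTIC_WEIGHT = 0.5  # assumed equal blend for all other cases


def route_score(attack_type, semantic_prob, feature_prob):
    """Blend the two detectors' probabilities by predicted attack type."""
    w = SEMANTIC_WEIGHT.get(attack_type, DEFAULT_SEMANTIC_WEIGHT)
    return w * semantic_prob + (1.0 - w) * feature_prob


# Toy usage with made-up probabilities from the two detectors.
print(route_score("character", semantic_prob=0.9, feature_prob=0.2))
print(route_score("paraphrase", semantic_prob=0.9, feature_prob=0.2))
```

Setting the semantic weight to zero for character-manipulation attacks reproduces the "feature-based detector used alone" case, while any weight above 0.5 captures "weighted more heavily" for paraphrases.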