
1. Recommended Choices for MCP Scan

  • GLM4.5
  • DeepSeek-V3.1
  • Kimi-K2-Instruct
  • Qwen3-Coder-480B
  • Hunyuan-Turbos

2. Recommended Choices for Jailbreak Evaluation Models

When working with a custom dataset, choosing an appropriate safety evaluation model can significantly improve the accuracy of automated assessments. Consider two dimensions when choosing a model: language and scenario.

Language

  • Chinese Recommendation:

    • qwen3-max (best performance)
    • qwen3-235b-a22b-2507 (cost-effective choice)
  • English Recommendation:

    • claude-opus-4.1 (best performance)
    • claude-sonnet-4 (very good performance)
    • gemini-2.0-flash (cost-effective choice)
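
If you switch evaluation models per dataset, the language recommendations above can be kept in a small lookup table. The following is only a sketch: the names `JUDGE_MODELS` and `pick_judge_model`, the `"zh"`/`"en"` keys, and the tier labels are hypothetical and not part of any tool mentioned here. Only the model names come from the list above.

```python
# Hypothetical lookup table; only the model names are taken from the
# recommendations above. Keys and tier labels are illustrative assumptions.
JUDGE_MODELS = {
    "zh": {
        "best": "qwen3-max",
        "budget": "qwen3-235b-a22b-2507",
    },
    "en": {
        "best": "claude-opus-4.1",
        "strong": "claude-sonnet-4",
        "budget": "gemini-2.0-flash",
    },
}


def pick_judge_model(language: str, tier: str = "best") -> str:
    """Return the recommended evaluation model for a dataset language and tier."""
    try:
        return JUDGE_MODELS[language][tier]
    except KeyError:
        # Fail loudly rather than silently falling back to another model.
        raise ValueError(
            f"no recommendation for language={language!r}, tier={tier!r}"
        ) from None


if __name__ == "__main__":
    print(pick_judge_model("zh"))
    print(pick_judge_model("en", "budget"))
```

Raising an error for an unknown language or tier, instead of returning a default, keeps a misconfigured run from being scored by a model that was never recommended for that dataset.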

Scenario

  • Politically sensitive content testing:
    Do not

Answer selected by zonashi