Popular repositories
- chronos-forecasting Public
Chronos: Pretrained Models for Probabilistic Time Series Forecasting
Repositories
Showing 10 of 386 repositories
- SWE-PolyBench Public
SWE-PolyBench: A multi-language benchmark for repository level evaluation of coding agents
- carbon-assessment-with-ml Public
CaML: Carbon Footprinting of Household Products with Zero-Shot Semantic Text Similarity
- TurboFuzzLLM Public
TurboFuzzLLM: Turbocharging Mutation-based Fuzzing for Effectively Jailbreaking Large Language Models in Practice
- MigrationBench Public
- wraval Public
WRAVAL helps in evaluating LLMs for writing assistant tasks like summarization, professional tone, witty tone, etc.
- SDFeedback Public
- MEMERAG Public
MEMERAG: A Multilingual End-to-End Meta-Evaluation Benchmark for Retrieval Augmented Generation