    Repositories list

    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      10k15018 Updated Sep 23, 2025
    • 0102 Updated Sep 23, 2025
    • A safetensors extension to efficiently store sparse quantized tensors on disk
      Python
      31162320 Updated Sep 23, 2025
    • research

      Public
      Repository to enable research flows
      Python
      0101 Updated Sep 22, 2025
    • axolotl

      Public
      Go ahead and axolotl questions
      Python
      1.2k005 Updated Sep 21, 2025
    • pytorch

      Public
      Tensors and Dynamic neural networks in Python with strong GPU acceleration
      Python
      25k001 Updated Sep 17, 2025
    • Common mixins, registries, and utilities with native support for Pydantic used across popular repos such as GuideLLM and Speculators
      0000 Updated Sep 17, 2025
    • 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
      Python
      30k100 Updated Sep 12, 2025
    • Neural Magic GHA
      Python
      0003 Updated Sep 3, 2025
    • Perplexity GPU Kernels
      C++
      59000 Updated Aug 29, 2025
    • DeepEP

      Public
      DeepEP: an efficient expert-parallel communication library
      Cuda
      934000 Updated Aug 29, 2025
    • DeepGEMM

      Public
      DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
      Cuda
      697000 Updated Aug 29, 2025
    • proving grounds for GitHub to JIRA ... yay!
      0000 Updated Aug 27, 2025
    • lmms-eval

      Public
      Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.
      Python
      378009 Updated Aug 6, 2025
    • FlashInfer: Kernel Library for LLM Serving
      Cuda
      514000 Updated Jul 18, 2025
    • Arena-Hard-Auto: An automatic LLM benchmark.
      Python
      127001 Updated Jul 16, 2025
    • Python
      0000 Updated Jul 11, 2025
    • LMCache

      Public
      Redis for LLMs
      Python
      600101 Updated Jun 18, 2025
    • Fast and memory-efficient exact attention
      Python
      2k500 Updated Jun 11, 2025
    • Pytest plugin used by the Release Engineering team
      Python
      0000 Updated Jun 9, 2025
    • A framework for few-shot evaluation of language models.
      Python
      2.7k401 Updated Jun 9, 2025
    • yolov5

      Public archive
      YOLOv5 in PyTorch > ONNX > CoreML > TFLite
      Python
      17k1900 Updated Jun 4, 2025
    • yolov3

      Public archive
      YOLOv3 in PyTorch > ONNX > CoreML > TFLite
      Python
      3.5k300 Updated Jun 4, 2025
    • transformers

      Public archive
      🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0.
      Python
      30k900 Updated Jun 4, 2025
    • llm-d

      Public
      llm-d is a Kubernetes-native high-performance distributed LLM inference framework
      Makefile
      164000 Updated Jun 3, 2025
    • deepsparse

      Public archive
      Sparsity-aware deep learning inference runtime for CPUs
      Python
      1903.2k10 Updated Jun 2, 2025
    • sparsify

      Public archive
      ML model optimization product to accelerate inference.
      Python
      3032610 Updated Jun 2, 2025
    • sparseml

      Public archive
      Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
      Python
      1572.1k10 Updated Jun 2, 2025
    • docs

      Public archive
      Top-level directory for documentation and general content
      MDX
      712000 Updated Jun 2, 2025
    • sparsezoo

      Public archive
      Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes
      Python
      2739210 Updated Jun 2, 2025