zhentaoyu

🎯 Focusing

6 followers · 17 following

intel
Shanghai
densecollections.top

Pinned

intel/neural-speed Public archive

An innovative library for efficient LLM inference via low-bit quantization

C++ · 350 stars · 39 forks

intel/intel-extension-for-transformers Public archive

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Python · 2.2k stars · 216 forks

ggml-org/llama.cpp Public

LLM inference in C/C++

C++ · 90.5k stars · 13.8k forks

leejet/stable-diffusion.cpp Public

Diffusion model (SD, Flux, Wan, Qwen Image, ...) inference in pure C/C++

C++ · 4.6k stars · 448 forks

vllm-fork Public

Forked from HabanaAI/vllm-fork

A high-throughput and memory-efficient inference and serving engine for LLMs

Python

intel/neural-compressor Public

SOTA low-bit LLM quantization (INT8/FP8/MXFP8/INT4/MXFP4/NVFP4) & sparsity; leading model compression techniques on PyTorch, TensorFlow, and ONNX Runtime

Python · 2.5k stars · 284 forks