Use the BAAI/bge-small-zh-v1.5 model locally through an OpenAI-compatible /v1/embeddings endpoint powered by FastAPI.
```bash
python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
```

The service defaults to the Hugging Face China mirror (https://hf-mirror.com). Override it before the first run if you prefer a different mirror.
```bash
export HF_ENDPOINT=https://hf-mirror.com        # or your preferred mirror URL
export EMBEDDING_CACHE_DIR=$(pwd)/model_cache   # persistent local cache
```

You can pre-download the model once (optional but recommended):
```bash
python - <<'PY'
import os
from sentence_transformers import SentenceTransformer

# The quoted heredoc ('PY') suppresses shell expansion, so read the cache
# directory from the environment instead of a literal "$EMBEDDING_CACHE_DIR".
SentenceTransformer(
    "BAAI/bge-small-zh-v1.5",
    cache_folder=os.environ.get("EMBEDDING_CACHE_DIR"),
    device="cpu",
)
PY
```

Then start the server:

```bash
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
```

FastAPI will serve interactive docs at http://localhost:8000/docs.
```bash
curl -X POST "http://localhost:8000/v1/embeddings" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bge-small-zh-v1.5",
    "input": ["今天天气很好", "自然语言处理"],
    "user": "demo-user"
  }'
```

Response excerpt:
```json
{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 0, "embedding": [...]},
    {"object": "embedding", "index": 1, "embedding": [...]}
  ],
  "model": "bge-small-zh-v1.5",
  "usage": {"prompt_tokens": 9, "total_tokens": 9}
}
```

Configuration is driven by environment variables:

- `EMBEDDING_MODEL_NAME`: switch to a different SentenceTransformer checkpoint.
- `EMBEDDING_DEVICE`: set to `cuda`, `mps`, etc. Defaults to CPU.
- `EMBEDDING_BATCH_SIZE`: control the batch size for `encode()`.
- `EMBEDDING_CACHE_DIR`: persistent model/cache directory (also reused for the Hugging Face cache when provided).
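Because the endpoint is OpenAI-compatible, the curl request above can also be issued from Python. A minimal sketch using only the standard library (the helper names are illustrative, and it assumes the server is running on localhost:8000):

```python
import json
from urllib.request import Request, urlopen


def build_payload(texts, model="bge-small-zh-v1.5", user="demo-user"):
    """Build an OpenAI-style /v1/embeddings request body."""
    return {"model": model, "input": texts, "user": user}


def fetch_embeddings(texts, base_url="http://localhost:8000"):
    """POST to the local service and return embeddings ordered by index."""
    req = Request(
        f"{base_url}/v1/embeddings",
        data=json.dumps(build_payload(texts)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        body = json.load(resp)
    # The response carries one item per input; sort by "index" to be safe.
    return [item["embedding"]
            for item in sorted(body["data"], key=lambda d: d["index"])]
```

With the server up, `fetch_embeddings(["今天天气很好", "自然语言处理"])` returns one vector per input string.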
⚠️ Token usage in the response is a simple heuristic (character-count based). Integrate your own tokenizer if you require exact counts.
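To make the warning above concrete, a character-count heuristic might look like the sketch below. This is illustrative only; it is not necessarily the exact rule the service implements.

```python
def approximate_token_usage(texts):
    """Approximate token usage as the total character count across inputs.

    A real tokenizer (e.g. the one belonging to the underlying model) will
    generally produce different numbers, which is why the counts in the
    response should be treated as rough estimates.
    """
    prompt_tokens = sum(len(text) for text in texts)
    return {"prompt_tokens": prompt_tokens, "total_tokens": prompt_tokens}
```

For embeddings there is no completion, so `total_tokens` simply mirrors `prompt_tokens`.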