A FastAPI application for transforming questions and SPARQL queries over Wikidata by replacing entities with semantically similar alternatives.
- Multi-language Support: Handles questions in English, German, French, Russian, and Ukrainian
- Complexity Levels: Generates variations at different difficulty levels (easy, normal, hard, random)
- LLM Integration: Uses an LLM to rephrase questions after entity replacement
- PageRank Scoring: Ranks substitutes based on entity popularity
- REST API: Exposes /transform endpoint for batch processing
- Knowledge Graph: Uses (only) the Wikidata Knowledge Graph to find semantically similar entities
- MongoDB Caching: Caches SPARQL results for performance
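The entity replacement step at the heart of the tool can be illustrated with a small sketch. This is not DynBench's actual implementation: the function name `replace_entity` is hypothetical, and the substitute ID `Q42` is an arbitrary placeholder standing in for whatever entity the similarity search would return.

```python
import re


def replace_entity(query: str, old_qid: str, new_qid: str) -> str:
    """Replace every wd:<old_qid> in a SPARQL query with wd:<new_qid>."""
    # The trailing \b keeps wd:Q1 from matching inside wd:Q14452.
    pattern = re.compile(r"\bwd:" + re.escape(old_qid) + r"\b")
    return pattern.sub("wd:" + new_qid, query)


query = "SELECT ?answer WHERE { wd:Q14452 wdt:P17 ?answer }"
# Q42 is a placeholder, not a semantically meaningful substitute.
print(replace_entity(query, "Q14452", "Q42"))
```

After the query is rewritten, the question text still mentions the old entity, which is why a rephrasing step is needed afterwards.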
Required in .env:
- MongoDB credentials: MONGO_HOST, MONGO_USER, MONGO_PASS
- LLM service credentials: LLM_URL, KEY (empty string for local instance)
- Wikidata SPARQL endpoint: WIKIDATA_ENDPOINT, WIKIDATA_AGENT (agent string is empty for the basic endpoint)
A sample .env file is provided in the root directory. Copy it to .env and fill in the missing values.
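A quick way to check a configuration before starting the service is sketched below. It assumes the values from .env have already been loaded into the process environment (how DynBench itself loads them is not specified here), and the helper name `missing_settings` is hypothetical.

```python
import os

# Variable names as listed above.
REQUIRED = [
    "MONGO_HOST", "MONGO_USER", "MONGO_PASS",
    "LLM_URL", "KEY",
    "WIKIDATA_ENDPOINT", "WIKIDATA_AGENT",
]
# These two may legitimately be empty strings (local LLM, basic endpoint).
MAY_BE_EMPTY = {"KEY", "WIKIDATA_AGENT"}


def missing_settings(env):
    """Return names that are unset, or empty where a value is required."""
    missing = []
    for name in REQUIRED:
        value = env.get(name)
        if value is None or (value == "" and name not in MAY_BE_EMPTY):
            missing.append(name)
    return missing


problems = missing_settings(os.environ)
print("Missing settings:", problems if problems else "none")
```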
DynBench uses the PageRank of Wikidata URIs (provided on the https://danker.s3.amazonaws.com/index.html page) and the NLTK library. The PageRank file is 19 MB in size.
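Ranking substitute candidates by PageRank might look roughly like the sketch below. It assumes the PageRank file is tab-separated with one `Q-ID<TAB>score` pair per line; the function names and the sample scores are made up for illustration, and the way DynBench maps scores to complexity levels is not shown.

```python
import csv
import io


def load_pagerank(lines):
    """Parse tab-separated 'QID<TAB>score' rows into a dict."""
    scores = {}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) >= 2:
            scores[row[0]] = float(row[1])
    return scores


def rank_candidates(candidates, scores):
    """Sort candidate Q-IDs by descending PageRank; unknown IDs score 0."""
    return sorted(candidates, key=lambda qid: scores.get(qid, 0.0), reverse=True)


# Made-up scores standing in for the real PageRank file.
sample = io.StringIO("Q1\t10.5\nQ2\t300.0\nQ3\t42.0\n")
scores = load_pagerank(sample)
print(rank_candidates(["Q1", "Q2", "Q3", "Q4"], scores))
# prints ['Q2', 'Q3', 'Q1', 'Q4']
```

With the real file, `load_pagerank` would be called on an open file handle instead of the in-memory sample.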
Requirements are defined in requirements.txt. Install them with:

    pip install -r requirements.txt

To run DynBench from the command line:

    python3 dynbench.py \
      --query "SELECT ?answer WHERE { wd:Q14452 wdt:P17 ?answer }" \
      --question "Which country does the famous Easter island belong to?" \
      --language en \
      --complexity normal \
      --model "mistral-small:latest"

To run it with Docker:

    docker run \
      --add-host=host.docker.internal:host-gateway \
      -e MONGO_HOST="mongodb://host.docker.internal:27017" \
      -e LLM_URL="http://host.docker.internal:11434/api/generate" \
      -p 8000:8000 \
      dynbench

Thereafter, the API will be available at http://localhost:8000.
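Once the service is running, a batch request to the /transform endpoint could be prepared along these lines. The request schema is not documented here, so the payload below is an assumption: its field names simply mirror the command-line flags, and the helper `build_transform_request` is hypothetical. Check the interactive documentation that FastAPI serves at /docs by default for the actual schema.

```python
import json
import urllib.request


def build_transform_request(base_url, items):
    """Build (but do not send) a JSON POST request for /transform."""
    body = json.dumps(items).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/transform",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed payload shape: one object per question, keys named after the CLI flags.
items = [{
    "query": "SELECT ?answer WHERE { wd:Q14452 wdt:P17 ?answer }",
    "question": "Which country does the famous Easter island belong to?",
    "language": "en",
    "complexity": "normal",
    "model": "mistral-small:latest",
}]
request = build_transform_request("http://localhost:8000", items)
print(request.full_url, request.get_method())
# To send it: urllib.request.urlopen(request)
```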
We are happy to receive your contributions. Please create a pull request or an issue. As this tool is published under the MIT license, feel free to fork it and use it in your own projects.