dferhod/open_gdb

Open GDB

An implementation of a graph database with user and repository management, based on RDF4J.

Architecture:

The service consists of:

  1. nginx: A reverse proxy that handles all incoming traffic
  2. authproxy: A Django-based authentication proxy for RDF4J that handles user and repository management
  3. RDF4J: Serves as the triplestore backend
  4. outproxy: An outgoing proxy that blocks all outgoing requests into the local network. RDF4J sends its outgoing traffic through this proxy, which prevents users from reaching every RDF4J repository via SPARQL SERVICE clauses that target localhost.
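
The kind of rule the outproxy enforces can be illustrated with a short Python sketch. This is not the project's implementation: the function name `is_local_target` and the exact set of blocked address ranges are assumptions made for illustration. It classifies a destination IP address as local (to be refused) or public (to be let through).

```python
import ipaddress


def is_local_target(address: str) -> bool:
    """Return True if the IP address points into a local or private network."""
    ip = ipaddress.ip_address(address)
    # Loopback, private ranges (such as 10.0.0.0/8 or 192.168.0.0/16) and
    # link-local addresses are all treated as "local" in this sketch.
    return ip.is_loopback or ip.is_private or ip.is_link_local


if __name__ == "__main__":
    for candidate in ("127.0.0.1", "172.18.0.5", "8.8.8.8"):
        verdict = "blocked" if is_local_target(candidate) else "allowed"
        print(candidate, verdict)
```

A real proxy would have to apply such a check to the address each hostname resolves to, since names like localhost or a container name resolve to local addresses.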

Deployment

  1. Rename .env.sample to .env and set the variables in it to your liking.
  2. Deploy with docker compose up --build -d

Usage:

Django Admin: Once the docker compose stack is deployed, the Django admin interface is available at localhost/admin. You can log in with the credentials you set in the .env file.

If you want to debug RDF4J, use docker-compose.debug.yml, which runs a second RDF4J instance on port 9999. This instance only serves the RDF4J workbench (access it via localhost:9999/rdf4j-workbench).

Some context: The RDF4J container serves both the workbench (/rdf4j-workbench) and the actual server (/rdf4j-server). Whenever the workbench has to talk to the server, it does so via HTTP over the local network. Unfortunately, it is not possible to set the outgoing proxy for the server only, so requests from the workbench are also caught and denied by the outgoing proxy. A temporary workaround is to run a second RDF4J instance that does not use the outgoing proxy. To see what is going on in the actual RDF4J server, set the RDF4J Server URL to http://rdf4j:8080/rdf4j-server in the workbench interface of the second RDF4J instance.

APIs:

  • RDF4J routes are available (Graph Store, Transactions, and Protocol are not implemented yet)
  • GraphDB routes for repository and user management are also available
    • User management: /rest/security/users
    • Repository management: /rest/repositories/
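
As a rough sketch, these routes might be called from Python using only the standard library. The host `https://ts.my-domain.com`, the credentials, the helper name `basic_auth_request`, and the assumption that a plain GET lists the resources (as in GraphDB's REST API) are placeholders and guesses, not confirmed behaviour of open_gdb. The helper only builds the request; sending it is left to the caller.

```python
import base64
import urllib.request

BASE_URL = "https://ts.my-domain.com"  # placeholder host


def basic_auth_request(path: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request for `path` that carries HTTP Basic credentials."""
    credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(BASE_URL + path, method="GET")
    request.add_header("Authorization", f"Basic {credentials}")
    return request


users_request = basic_auth_request("/rest/security/users", "MY_USERNAME", "MY_PASSWORD")
repos_request = basic_auth_request("/rest/repositories/", "MY_USERNAME", "MY_PASSWORD")

# To actually send a request (requires a running deployment):
# with urllib.request.urlopen(repos_request) as response:
#     print(response.read().decode())
```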

Token authentication:

DISCLAIMER: Only use the token authentication if you are deploying open_gdb over HTTPS!

Since Django uses strong password hashing functions by default, authenticating users on every API request takes a lot of time (it slows down requests by a factor of 30-50). This is especially noticeable when you send many API requests at once that are all authenticated via BasicAuth.

For this reason, the authproxy offers token authentication as an option. The endpoint for obtaining a token is /api-token-auth/. Send a POST request to this endpoint containing the credentials encoded in JSON format.
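
The cost difference can be illustrated with the standard library alone. Django's default hasher is PBKDF2 with SHA-256; the iteration count below, the in-memory token set, and the function names are assumptions chosen for illustration and do not reflect the authproxy's actual code. The sketch contrasts verifying a password (a deliberately slow key derivation on every request) with checking a token (a single lookup).

```python
import hashlib
import hmac
import os
import secrets
import time

ITERATIONS = 200_000  # illustrative; Django's real default depends on its version


def derive(password: str, salt: bytes) -> bytes:
    """Derive a key from the password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)


def check_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Re-run the slow derivation and compare in constant time."""
    return hmac.compare_digest(derive(password, salt), stored)


def check_token(token: str, valid_tokens: set) -> bool:
    """A token check is a plain membership test."""
    return token in valid_tokens


salt = os.urandom(16)
stored = derive("MY_PASSWORD", salt)
valid_tokens = {secrets.token_hex(20)}
token = next(iter(valid_tokens))

start = time.perf_counter()
password_ok = check_password("MY_PASSWORD", salt, stored)
password_seconds = time.perf_counter() - start

start = time.perf_counter()
token_ok = check_token(token, valid_tokens)
token_seconds = time.perf_counter() - start

print(f"password check: {password_seconds:.4f}s, token check: {token_seconds:.6f}s")
```

The absolute timings depend on the machine; the point is only that the password check repeats the full key derivation each time, while the token check does not.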

curl example

$ curl -X POST https://ts.my-domain.com/api-token-auth/ -d username=MY_USERNAME -d password=MY_PASSWORD
{"token": "SOME_TOKEN"}
# Now you can access the API like this
$ curl -X GET https://ts.my-domain.com/repositories -H 'Authorization: Token SOME_TOKEN'

Python example

import requests

# Request a token with the account credentials
data = {
    "username": "MY_USERNAME",
    "password": "MY_PASSWORD"
}
token_response = requests.post("https://ts.my-domain.com/api-token-auth/", data=data)
token_response.raise_for_status()  # raise an exception on an HTTP error status
token = token_response.json()["token"]

# Now you can access the API like this
headers = {"Authorization": f"Token {token}"}
api_response = requests.get("https://ts.my-domain.com/repositories", headers=headers)
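
The description above mentions JSON-encoded credentials, while both examples send form data. A JSON variant might look like the following standard-library sketch. It assumes the endpoint accepts an `application/json` body, which has not been verified here, and the helper names `token_request` and `authorized_request` are hypothetical. The requests are only built, not sent.

```python
import json
import urllib.request

BASE_URL = "https://ts.my-domain.com"  # placeholder host from the examples above


def token_request(username: str, password: str) -> urllib.request.Request:
    """Build the POST request that asks /api-token-auth/ for a token."""
    body = json.dumps({"username": username, "password": password}).encode()
    return urllib.request.Request(
        BASE_URL + "/api-token-auth/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def authorized_request(path: str, token: str) -> urllib.request.Request:
    """Build a GET request that authenticates with a previously obtained token."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": f"Token {token}"},
        method="GET",
    )


login = token_request("MY_USERNAME", "MY_PASSWORD")
listing = authorized_request("/repositories", "SOME_TOKEN")
```

Either request can then be passed to `urllib.request.urlopen` against a running deployment.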

Changing service names

If you need to change the service/container names of the docker-compose project, you have to adjust the following:

  • rdf4j:
    • Adjust depends_on and the RDF4J_HOSTNAME env variable in nginx and authproxy services
    • Adjust depends_on in outproxy
  • nginx: Can just be renamed
  • authproxy:
    • Adjust depends_on and the AUTHPROXY_HOSTNAME env variable in nginx service
  • outproxy:
    • Adjust the -Dhttp.proxyHost and -Dhttps.proxyHost flags in the JAVA_OPTS in rdf4j service
    • Adjust depends_on in nginx service
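
After a rename it is easy to miss one of these references. The following sketch shows one way to spot leftover depends_on entries. The `services` dictionary is a hand-written approximation of the dependencies listed above, not the project's actual docker-compose file, and the function name and the new service name `triplestore` are hypothetical. It does not check environment variables or JAVA_OPTS.

```python
def find_dangling_dependencies(services: dict) -> list:
    """Return (service, dependency) pairs whose dependency names no service."""
    dangling = []
    for name, config in services.items():
        # depends_on may be a list of names or a mapping of name -> options;
        # iterating over either yields the service names.
        for dependency in config.get("depends_on", []):
            if dependency not in services:
                dangling.append((name, dependency))
    return dangling


# Example: rdf4j was renamed to "triplestore", but the references were not updated.
services = {
    "nginx": {"depends_on": ["rdf4j", "authproxy", "outproxy"]},
    "authproxy": {"depends_on": ["rdf4j"]},
    "outproxy": {"depends_on": ["rdf4j"]},
    "triplestore": {},
}

for service, dependency in find_dangling_dependencies(services):
    print(f"{service} still depends on unknown service {dependency!r}")
```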

Future development:

  • Ideally, we want to make this a drop-in replacement for the GraphDB server that also works with the standalone GraphDB Workbench
