CSCfi/metadata-submitter


SD Submit API

SD Submit API supports sensitive data submissions.

The SD Submit UI is implemented in the metadata-submitter-frontend repository.

SD Submit API integrates with the following external services:

```mermaid
flowchart LR
    subgraph SD Submit
        UI(SD Submit UI) --> API(SD Submit API)
    end
    subgraph External services
        direction LR
        API -->|All database operations| Postgres(PostgreSQL DB)
        API -->|User project listing| LDAP(LDAP Service)
        API -->|EC2 credential handling| Keystone(Keystone) --> OS(Object storage)
        API -->|Bucket/object operations| Allas(S3 API) --> OS
        API -->|DOI creation for Bigpicture| DataCiteAPI(DataCite API) --> DataCite
        API -->|DOI creation for SD| PID(PID API) --> DataCite(DataCite)
        API -->|Discovery metadata mapping| Metax(Metax API) --> Fairdata-Etsin(FairData Etsin)
        API -->|Dataset access registration| REMS(REMS API) --> Apply(SD Apply/REMS)
        API -->|Bigpicture file ingestion| Admin(Admin API) --> SDA(NEIC SDA)
        API -->|Bigpicture metadata| Beacon(Imaging Beacon API) --> BP-Discovery(Bigpicture Discovery)
    end
```

💻 Development


Prerequisites

Git LFS is required to check out the metadata_backend/conf/taxonomy_files/names.json file. Without Git LFS, this file can also be generated from the NCBI taxonomy using the following command:

scripts/taxonomy/generate_name_taxonomy.sh

Initialise the project for development and testing

Clone the repository and go to the project directory:

git clone https://github.com/CSCfi/metadata-submitter.git
cd metadata-submitter

Install uv, tox and pre-commit tools for development:

curl -LsSf https://astral.sh/uv/install.sh | sh
uv tool install tox --with tox-uv
uv tool install pre-commit --with pre-commit-uv
pre-commit install

The project is managed by uv (a Python package and project manager), which creates a virtual Python environment in the .venv directory using the Python version defined in the .python-version file. uv installs the dependencies pinned in the uv.lock file, which are declared in the pyproject.toml file.

Whenever you start a local development session, it is a good idea to run the following commands:

uv self update  # update uv
uv sync --dev  # sync all python dependencies
source .venv/bin/activate  # activate the uv venv

Configure environment variables

Copy the contents of the .env.example file to a .env file and edit it as needed:

cp .env.example .env
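Editing the copied .env usually means setting a handful of variables by hand. As a minimal sketch (the file contents and the REDIRECT_URL value below are illustrative assumptions, not the repository's real .env.example), a single variable can also be overridden in place:

```shell
# Sketch: copy the example env file and override one variable in the copy.
# The stand-in contents and values here are assumptions; the real variable
# names live in the repository's .env.example.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the repository's .env.example.
printf 'REDIRECT_URL=\nLOG_LEVEL=INFO\n' > .env.example

cp .env.example .env

# Override one variable in place, leaving every other line untouched.
sed -i.bak 's|^REDIRECT_URL=.*|REDIRECT_URL=http://localhost:3000|' .env

grep '^REDIRECT_URL=' .env
```

The `-i.bak` form of sed works on both GNU and BSD sed and leaves a `.env.bak` backup behind.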

Additionally, secrets for live services can be inserted into the .env file automatically with:

export VAULT_ADDR=  # Define URL address for a Vault instance
make get_env  # This will prompt a login in the web browser

Run the web service and database locally

Launch both server and database with Docker by running: docker compose up --build (add -d flag to the command to run containers in the background).

The server can then be found at http://localhost:5430.

If you also need the graphical UI for developing the API, check out the metadata-submitter-frontend repository and follow its development instructions. You will then also need to set the REDIRECT_URL environment variable to the UI address (e.g. add REDIRECT_URL=http://localhost:3000 to the .env file) and relaunch the development environment as described above.

Alternatively, a more convenient way to develop the SD Submit API is in a Python virtual environment using a Procfile, as described below.

Developing with Python virtual environment

Use uv to create and activate the virtual environment for development and testing as instructed above. Then follow these instructions:

# Optional: update references for metax integration
scripts/metax_mappings/fetch_refs.sh

# Optional: update taxonomy names for taxonomy search endpoint
# However, this is a NECESSARY step if you have not installed Git LFS
scripts/taxonomy/generate_name_taxonomy.sh

Then copy the .env file and set up the environment variables. The example file uses hostnames for development with the Docker network (via docker compose), so you will have to change those hostnames to localhost.

cp .env.example .env  # Make any changes you need to the file
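The hostname change can also be scripted. The sketch below uses made-up service names (postgres, ldap) as stand-ins for whatever docker compose hostnames actually appear in the repository's .env.example:

```shell
# Sketch: rewrite Docker-network hostnames to localhost in a copied .env.
# The service names used here (postgres, ldap) are assumptions; check the
# repository's .env.example for the hostnames it actually uses.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for .env.example with compose-network hostnames.
printf 'POSTGRES_HOST=postgres\nLDAP_URL=ldap://ldap:389\n' > .env.example
cp .env.example .env

# Swap each compose service hostname for localhost.
sed -i.bak -e 's|=postgres$|=localhost|' -e 's|//ldap:|//localhost:|' .env

cat .env
```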

Secrets used for testing against live services are fetched from Vault and added to the .env file with the following:

export VAULT_ADDR=  # Add correct URL here
make get_env  # This will prompt a login in the web browser

Finally, start the servers with code reloading enabled, so that any code change restarts the servers automatically:

honcho start

The development server should now be accessible at http://localhost:5430. If it does not work right away, check the settings in your .env file, and restart the servers manually after any changes to .env.
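honcho reads its process list from a Procfile at the repository root. As a hypothetical sketch of the format (the process name and command below are illustrative, not the repository's actual Procfile):

```
# Each line is "name: command"; honcho starts every listed process
# and interleaves their output in one terminal.
web: python -m metadata_backend
```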

OpenAPI Specification docs with Swagger

Swagger UI for viewing the API specs is included in the production Docker image. During development, you can enable it by executing: bash scripts/swagger/generate.sh.

Restart the server, and the Swagger docs will be available at http://localhost:5430/swagger.

Swagger docs requirements:

  • bash
  • Python 3.14+
  • PyYAML (installed via the development dependencies)
  • realpath (default Linux terminal command)

Keeping Python requirements up to date

The project's Python package dependencies are kept up to date automatically with Renovate.

Dependencies are added to and removed from the project using the uv commands, or by editing the pyproject.toml file directly. In the latter case, run uv sync or uv sync --dev to update the uv.lock file.
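For the uv route, the commands look like this (package names are placeholders; uv add and uv remove update both pyproject.toml and uv.lock):

```shell
uv add aiohttp        # add a runtime dependency
uv add --dev pytest   # add a development-only dependency
uv remove aiohttp     # remove a dependency
uv sync --dev         # re-lock after editing pyproject.toml by hand
```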

๐Ÿ› ๏ธ Contributing


Development team members should check the internal contributing guidelines for GitLab.

If you are not part of CSC and our development team, your help is nevertheless very welcome. Please see the contributing guidelines for GitHub.

🧪 Testing


For local testing, see the instructions above for installing uv and creating a virtual environment.

Install all testing dependencies in the virtual environment by running the following commands:

uv pip install ".[verify,test,docs]"
source .venv/bin/activate  # activate uv virtual env

A pre-commit hook executes all linting and unit tests before each commit. All tests are also run in the GitLab CI/CD pipeline for every merge request.

Linting and unit tests

The majority of the automated tests (unit tests, code style checks, etc.) can be run with tox with the following command:

tox -p auto

Integration tests

Integration tests are run separately with pytest after the containerized testing environment has been set up.

Integration tests can be run using mock services:

docker compose --env-file tests/integration/.env.example --profile dev up --build -d
pytest tests/integration

Or using external services configured with secrets:

make get_env
docker compose --env-file tests/integration/.env --profile dev up --build -d
pytest tests/integration

🚀 Deployment


The production version can be built and run with the following Docker commands:

docker build --no-cache -f dockerfiles/Dockerfile -t cscfi/metadata-submitter .
docker run -p 5430:5430 cscfi/metadata-submitter

The frontend is built and added as static files to the backend deployment with this method.

Helm charts for a Kubernetes cluster deployment will also be available soon™️.

📜 License


The metadata submission interface is released under the MIT license; see LICENSE.

About

Metadata Submission Interface for SDA
