Commit ebcdd26

Merge pull request #31 from samhorsfield96/docker
Add docker to CI
2 parents ab252d6 + 5778152 commit ebcdd26

9 files changed, +127 -28 lines changed

.github/workflows/docker_push.yml

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
1+
name: Build and push Docker image
2+
3+
on:
4+
push:
5+
branches:
6+
- 'master'
7+
- 'docker'
8+
tags:
9+
- 'v*'
10+
pull_request:
11+
branches:
12+
- 'master'
13+
create:
14+
tags:
15+
- v*
16+
17+
jobs:
18+
docker-upload:
19+
runs-on: ubuntu-latest
20+
steps:
21+
- name: Checkout code
22+
uses: actions/checkout@v2
23+
- name: Docker meta
24+
id: meta
25+
uses: docker/metadata-action@v4
26+
with:
27+
images: samhorsfield96/ggcaller
28+
- name: Set up QEMU
29+
uses: docker/setup-qemu-action@v1
30+
- name: Set up Docker Buildx
31+
uses: docker/setup-buildx-action@v1
32+
- name: Login to DockerHub
33+
uses: docker/login-action@v3
34+
with:
35+
username: ${{ secrets.DOCKER_REGISTRY_USERNAME }}
36+
password: ${{ secrets.DOCKER_REGISTRY_PASSWORD }}
37+
- name: Build and push
38+
id: docker_build
39+
uses: docker/build-push-action@v3
40+
with:
41+
push: ${{ github.event_name != 'pull_request' }}
42+
tags: ${{ steps.meta.outputs.tags }}
43+
labels: ${{ steps.meta.outputs.labels }}
44+
file: docker/Dockerfile
45+
provenance: false
46+
- name: Image digest
47+
run: echo ${{ steps.docker_build.outputs.digest }}

docker/Dockerfile

Lines changed: 21 additions & 5 deletions
@@ -4,11 +4,27 @@ USER root
44

55
# create a project directory inside user home
66
ARG MAMBA_DOCKERFILE_ACTIVATE=1
7-
COPY . /src
8-
WORKDIR /src
97

8+
# create a project directory inside user home
9+
# (this isn't used with a clone running snakemake)
10+
ENV PROJECT_DIR $HOME/app
11+
RUN mkdir $PROJECT_DIR
12+
# copy the code in
13+
COPY . $PROJECT_DIR
14+
WORKDIR $PROJECT_DIR
15+
16+
# build conda env
17+
ENV ENV_PREFIX $PROJECT_DIR/env
1018
COPY --chown=$user:$user docker/environment_docker.yml /tmp/environment_docker.yml
19+
20+
COPY --chown=$user:$user docker/entrypoint.sh /usr/local/bin/
21+
RUN chmod u+x /usr/local/bin/entrypoint.sh
22+
1123
RUN micromamba install -y -n base -f /tmp/environment_docker.yml && \
12-
micromamba clean --all --yes && python -m pip install --no-deps --ignore-installed . \
13-
&& PATH=$PATH:/opt/conda/bin
14-
WORKDIR /workdir
24+
micromamba clean --all --yes && \
25+
python -m pip install --no-deps --ignore-installed . && \
26+
mkdir ggc_db && \
27+
ggcaller --balrog-db ggc_db && \
28+
PATH=$PATH:/opt/conda/bin
29+
30+
ENTRYPOINT [ "/usr/local/bin/entrypoint.sh" ]

docker/entrypoint.sh

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,4 @@
1+
#!/bin/bash --login
2+
set -e
3+
4+
exec "$@"

docs/installation.rst

Lines changed: 5 additions & 3 deletions
@@ -17,11 +17,11 @@ First, install `Docker <https://docs.docker.com/get-docker/>`_ for your OS. If r
1717

1818
To use the latest image, run::
1919

20-
docker pull samhorsfield96/ggcaller:latest
20+
docker pull samhorsfield96/ggcaller:master
2121

2222
To run ggCaller from the Docker Hub image, run::
2323

24-
cd test && docker run --rm -it -v $(pwd):/workdir samhorsfield96/ggcaller:latest ggcaller --refs pneumo_CL_group2.txt
24+
cd test && docker run --rm -it -v $(pwd):/workdir -v $(pwd):/data samhorsfield96/ggcaller:master ggcaller --balrog-db /app/ggc_db --refs /workdir/pneumo_CL_group2_docker.txt --out /workdir/ggc_out
2525

2626
You can also build the image yourself. First download and switch to the ggCaller repository::
2727

@@ -33,7 +33,9 @@ Finally, build with Docker. This should take between 5-10 minutes to fully insta
3333

3434
To run ggCaller from a local Docker build, run::
3535

36-
cd test && docker run --rm -it -v $(pwd):/workdir ggc_env:latest ggcaller --refs pneumo_CL_group2.txt
36+
cd test && docker run --rm -it -v $(pwd):/workdir -v $(pwd):/data ggc_env:latest ggcaller --balrog-db /app/ggc_db --refs /workdir/pneumo_CL_group2_docker.txt --out /workdir/ggc_out
37+
38+
Please ensure you keep ``--balrog-db /app/ggc_db`` and ``/workdir`` paths as specified above.
3739

3840
Installing with singularity
3941
-----------------------------------

docs/quickstart.rst

Lines changed: 8 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -15,13 +15,17 @@ The easiest way to get up and running is using Docker. To get up and running, pu
1515
Preparing the data
1616
------------------
1717

18-
Place all of you samples to analyse in the same directory. Then navigate inside and run::
18+
Place all of your samples to be analysed in the same directory. Then navigate inside and run::
1919

2020
ls -d -1 $PWD/*.fasta > input.txt
2121

2222
If using Docker, instead navigate to the directory containing the fasta files and run the below command, to ensure file paths are relative (the docker version will not work with absolute paths)::
2323

24-
ls -d -1 *.fasta > input.txt
24+
ls -d -1 *.fasta > input_docker.txt
25+
26+
Then, append the prefix ``/data/`` to each line to enable ggCaller to find the files::
27+
28+
sed -i -e 's|^|/data/|' input_docker.txt
2529

2630
Running ggCaller
2731
------------------
@@ -39,9 +43,9 @@ To run ggCaller with just reads::
3943

4044
ggcaller --reads input.txt --out output_path
4145

42-
If using Docker, run with the below command. You must ensure all paths are relative, including in ``input.txt``::
46+
If using Docker, run with the below command, ensuring you keep ``--balrog-db /app/ggc_db`` and ``/workdir`` paths as specified below. Replace ``path to files`` with the absolute path to the directory of files in ``input_docker.txt``::
4347

44-
docker run --rm -it -v $(pwd):/workdir samhorsfield96/ggcaller:latest ggcaller --refs input.txt --out output_path
48+
docker run --rm -it -v $(pwd):/workdir -v <path to files>:/data samhorsfield96/ggcaller:master ggcaller --balrog-db /app/ggc_db --refs /workdir/input_docker.txt --out /workdir/output_path
4549

4650
.. important::
4751
We haven't extensively tested calling genes within
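
The two shell steps in the Docker instructions above (list the FASTA files, then prefix each name with ``/data/``) can be sketched in Python as follows. This is a minimal illustration and not part of ggCaller: the helper name ``write_docker_input`` and its parameters are hypothetical, and it assumes the sample directory is the one mounted at ``/data`` inside the container.

```python
from pathlib import Path


def write_docker_input(sample_dir, out_name="input_docker.txt",
                       mount_point="/data", pattern="*.fasta"):
    """Hypothetical helper: write container-side paths for every FASTA file.

    Each file name found in sample_dir is prefixed with mount_point, which
    mirrors `ls -d -1 *.fasta` followed by `sed -i -e 's|^|/data/|'`.
    """
    sample_dir = Path(sample_dir)
    # Keep bare file names only, sorted so the output is reproducible.
    names = sorted(p.name for p in sample_dir.glob(pattern) if p.is_file())
    lines = [f"{mount_point.rstrip('/')}/{name}" for name in names]
    # One path per line, written next to the samples.
    (sample_dir / out_name).write_text("".join(line + "\n" for line in lines))
    return lines
```

Because the list is rebuilt from the directory contents on every call, rerunning the helper does not add a second ``/data/`` prefix, which rerunning the ``sed`` command on an existing ``input_docker.txt`` would do.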

ggCaller/__init__.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,4 +2,4 @@
22

33
'''ggCaller: a gene caller for Bifrost graphs'''
44

5-
__version__ = '1.3.4'
5+
__version__ = '1.3.5'

ggCaller/__main__.py

Lines changed: 13 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -326,6 +326,12 @@ def get_options():
326326

327327
# Other options
328328
Misc = parser.add_argument_group('Misc. options')
329+
Misc.add_argument("--balrog-db",
330+
default=None,
331+
dest="balrog_db",
332+
help="Path to save BALROG and default annotation databases. If not specified, will download "
333+
"automatically on first run. "
334+
"[Default = None]")
329335
Misc.add_argument("--quiet",
330336
dest="verbose",
331337
help="suppress additional output"
@@ -420,20 +426,24 @@ def main():
420426
options.reads is not None) and (options.query is None):
421427
graph_tuple = graph.build(options.refs, options.kmer, stop_codons_for, stop_codons_rev, start_codons_for,
422428
start_codons_rev, options.threads, False, options.no_write_graph, options.reads, ref_set)
429+
elif options.balrog_db is not None:
430+
db_dir = download_db(options.balrog_db)
431+
sys.exit(0)
423432
else:
424433
print("Error: incorrect number of input files specified. Please only specify the below combinations:\n"
425434
"- Bifrost GFA and Bifrost colours file (with/without list of reference files)\n"
426435
"- Bifrost GFA, Bifrost colours file and list of query sequences\n"
427436
"- List of reference files\n"
428437
"- List of read files\n"
429-
"- A list of reference files and a list of read files.")
438+
"- A list of reference files and a list of read files.\n"
439+
"- A path to download the balrog gene model files.")
430440
sys.exit(1)
431441

432442
# unpack ORF pair into overlap dictionary and list for gene scoring
433443
input_colours, nb_colours, overlap, ref_list = graph_tuple
434444

435445
# download balrog and annotation files
436-
db_dir = download_db()
446+
db_dir = download_db(options.balrog_db)
437447

438448
# set rest of panaroo arguments
439449
options = set_default_args(options, nb_colours)
@@ -486,7 +496,7 @@ def main():
486496
# load models if required
487497
if not options.no_filter:
488498
print("Loading gene models...")
489-
ORF_model_file, TIS_model_file = load_balrog_models()
499+
ORF_model_file, TIS_model_file = load_balrog_models(db_dir)
490500

491501
else:
492502
ORF_model_file, TIS_model_file = "NA", "NA"
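
The branch order this hunk adds to ``main()`` can be sketched as below. It is a simplified illustration under stated assumptions: only ``--refs``, ``--reads`` and ``--balrog-db`` are modelled (the real parser also accepts graph, colours and query inputs), and ``parse_args`` and ``choose_mode`` are hypothetical names that do not exist in ggCaller.

```python
import argparse


def parse_args(argv):
    """Hypothetical, cut-down parser with only three of ggCaller's options."""
    parser = argparse.ArgumentParser(prog="ggcaller-sketch")
    parser.add_argument("--refs", default=None)
    parser.add_argument("--reads", default=None)
    parser.add_argument("--balrog-db", default=None, dest="balrog_db")
    return parser.parse_args(argv)


def choose_mode(options):
    """Return the branch a simplified main() would take for these options."""
    if options.refs is not None or options.reads is not None:
        # Normal run: build the graph, then use options.balrog_db (if set)
        # as the database location.
        return "run"
    elif options.balrog_db is not None:
        # No input files, only a database path: download and exit with 0.
        return "download_only"
    else:
        # No valid combination of inputs: print the error and exit with 1.
        return "error"
```

For example, ``choose_mode(parse_args(["--balrog-db", "ggc_db"]))`` takes the download-only branch, which is how the Dockerfile's ``ggcaller --balrog-db ggc_db`` step fetches the databases at image build time.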

models/__main__.py

Lines changed: 23 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -4,28 +4,39 @@
44

55
""" Get directories for model and seengenes """
66
module_dir = os.path.dirname(os.path.realpath(__file__))
7-
zipped_db_dir = db_dir = os.path.join(module_dir, "ggCallerdb.tar.bz2")
8-
db_dir = os.path.join(module_dir, "ggCallerdb")
9-
balrog_model_dir = os.path.join(db_dir, "balrog_models")
7+
module_zipped_db_dir = os.path.join(module_dir, "ggCallerdb.tar.bz2")
8+
module_db_dir = os.path.join(module_dir, "ggCallerdb")
9+
module_balrog_model_dir = os.path.join(module_db_dir, "balrog_models")
1010

11-
def download_db():
12-
if not os.path.exists(zipped_db_dir):
11+
def download_db(download_db=None):
12+
if download_db is None:
13+
zipped_db_path = module_zipped_db_dir
14+
db_path = module_db_dir
15+
output_dir = module_dir
16+
else:
17+
zipped_db_path = os.path.join(download_db, "ggCallerdb.tar.bz2")
18+
db_path = os.path.join(download_db, "ggCallerdb")
19+
output_dir = download_db
20+
21+
if not os.path.exists(zipped_db_path):
1322
print("Downloading databases...")
1423
url = "https://ftp.ebi.ac.uk/pub/databases/pp_dbs/ggCallerdb.tar.bz2"
15-
filename = wget.download(url, out=module_dir)
24+
filename = wget.download(url, out=output_dir)
1625
print("")
17-
if not os.path.exists(db_dir):
18-
tar = tarfile.open(db_dir + ".tar.bz2", mode="r:bz2")
19-
tar.extractall(module_dir)
26+
if not os.path.exists(db_path):
27+
tar = tarfile.open(zipped_db_path, mode="r:bz2")
28+
tar.extractall(output_dir)
2029
tar.close()
2130

22-
return db_dir
31+
return db_path
2332

24-
def load_balrog_models():
33+
def load_balrog_models(db_path):
34+
balrog_model_dir = os.path.join(db_path, "balrog_models")
35+
2536
# check if directory exists. If not, unzip file
2637
if not os.path.exists(balrog_model_dir):
2738
tar = tarfile.open(balrog_model_dir + ".tar.gz", mode="r:gz")
28-
tar.extractall(db_dir)
39+
tar.extractall(db_path)
2940
tar.close()
3041

3142
geneTCN = os.path.join(balrog_model_dir, "geneTCN_jit.pt")
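
The path handling that ``download_db`` gains in this hunk can be isolated as a small pure function, which makes the two cases easy to check without any network access. This is a sketch rather than the committed code: ``resolve_db_paths`` and ``balrog_models_path`` are hypothetical names, and the ``module_dir`` default is a placeholder for the package directory.

```python
import os

DB_NAME = "ggCallerdb"


def resolve_db_paths(download_dir=None, module_dir="/path/to/models"):
    """Return (zipped_db_path, db_path, output_dir) for the database.

    With no download_dir, everything lives inside the package directory;
    otherwise the archive and the extracted folder go under download_dir.
    """
    output_dir = module_dir if download_dir is None else download_dir
    zipped_db_path = os.path.join(output_dir, DB_NAME + ".tar.bz2")
    db_path = os.path.join(output_dir, DB_NAME)
    return zipped_db_path, db_path, output_dir


def balrog_models_path(db_path):
    """Model directory derived from db_path, as load_balrog_models(db_path) does."""
    return os.path.join(db_path, "balrog_models")
```

Deriving the model directory from the returned ``db_path`` instead of a module-level constant is what lets the Docker image load models from ``/app/ggc_db`` rather than from the installed package.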

test/pneumo_CL_group2_docker.txt

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
/data/CR931658_Streptococcus_pneumoniae_strain_559_66_serotype_12a.fa
2+
/data/CR931659_Streptococcus_pneumoniae_strain_Gambia_1_81_serotype_12b.fa
3+
/data/CR931660_Streptococcus_pneumoniae_strain_6312_serotype_12f.fa
4+
/data/CR931717_Streptococcus_pneumoniae_strain_Hammer_serotype_44.fa
5+
/data/CR931719_Streptococcus_pneumoniae_strain_Eddy_nr._73_serotype_46.fa
