Commit 2550a59

Update rclone setup for Docker (#803)
* Update rclone setup for Docker
* Added explanation of placeholders
* Update README
* Added note that MinIO is the default bucket to use
1 parent 30f432d commit 2550a59

6 files changed, +67 -59 lines changed
.devcontainer/Dockerfile

Lines changed: 0 additions & 4 deletions
@@ -39,10 +39,6 @@ RUN poetry config virtualenvs.create true && \
 poetry config virtualenvs.in-project true
 # Clean up
 RUN rm -rf /var/lib/apt/lists/*
-# Set up the MinIO/Backblaze bucket
-RUN mkdir -p ~/M
-RUN mkdir -p ~/B
-RUN mkdir -p ~/.config/rclone
 # Set environment variables
 ENV CLEARML_API_HOST="https://api.sil.hosted.allegro.ai"
 ENV EFLOMAL_PATH=/workspaces/silnlp/.venv/lib/python3.10/site-packages/eflomal/bin

.devcontainer/devcontainer.json

Lines changed: 1 addition & 1 deletion
@@ -81,7 +81,7 @@
 ]
 }
 },
-"postStartCommand": "poetry install && sh /workspaces/silnlp/.devcontainer/update_hosts.sh && sh /workspaces/silnlp/minio_bucket_setup.sh"
+"postStartCommand": "poetry install && sh /workspaces/silnlp/.devcontainer/update_hosts.sh && sh /workspaces/silnlp/rclone_setup.sh minio"
 // Uncomment to connect as root instead. More info: https://aka.ms/dev-containers-non-root.
 // "remoteUser": "root"
 }
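
The new `postStartCommand` runs `rclone_setup.sh` as a separate `sh` process, and the script sets `SIL_NLP_DATA_PATH` with `export`. As a general shell rule, a variable exported inside a child process is not visible to the shell that launched it, whereas sourcing the file does change the current shell. The sketch below demonstrates that rule with a throwaway stand-in script (the temporary file and the `after_run`/`after_source` names are illustrative only, not part of the commit):

```shell
#!/bin/bash
# Stand-in for a setup script that exports a variable (hypothetical file).
demo="$(mktemp)"
cat > "$demo" <<'EOF'
export SIL_NLP_DATA_PATH="/root/M"
EOF

unset SIL_NLP_DATA_PATH

# Run as a child process: the export stays inside the child.
bash "$demo"
after_run="${SIL_NLP_DATA_PATH:-}"

# Source into the current shell: the export takes effect here.
. "$demo"
after_source="${SIL_NLP_DATA_PATH:-}"

echo "after running:  '$after_run'"
echo "after sourcing: '$after_source'"
rm -f "$demo"
```

Whether this matters for SILNLP depends on how the code locates the mounted bucket, which this commit does not show.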

README.md

Lines changed: 42 additions & 38 deletions
@@ -23,6 +23,24 @@ These are the main requirements for the SILNLP code to run on a local machine. S
 | Environment variables | To tell SILNLP where to find the data, etc. |

 ## Environment Setup
+Create file for environment variables
+
+Create a text file with the following content and edit as necessary:
+```
+CLEARML_API_HOST="https://api.sil.hosted.allegro.ai"
+CLEARML_API_ACCESS_KEY=xxxxxxx
+CLEARML_API_SECRET_KEY=xxxxxxx
+MINIO_ENDPOINT_URL=https://truenas.psonet.languagetechnology.org:9000
+MINIO_ACCESS_KEY=xxxxxxxxx
+MINIO_SECRET_KEY=xxxxxxx
+B2_ENDPOINT_URL=https://s3.us-east-005.backblazeb2.com
+B2_KEY_ID=xxxxxxxx
+B2_APPLICATION_KEY=xxxxxxxx
+```
+* Include SIL_NLP_DATA_PATH="/silnlp" if you are not using MinIO or B2 and will be storing files locally.
+* If you do not intend to use SILNLP with ClearML, MinIO, and/or B2, you can leave out the respective variables. If you need to generate ClearML credentials, see [ClearML setup](clear_ml_setup.md).
+* Note that this does not give you direct access to a MinIO or B2 bucket from within the Docker container, it only allows you to run scripts referencing files in the bucket.
+
 ### Option 1: Docker
 1. If using a local GPU, install the corresponding [NVIDIA driver](https://www.nvidia.com/download/index.aspx)

@@ -49,43 +67,46 @@ These are the main requirements for the SILNLP code to run on a local machine. S

 If you're using a local GPU, then in a terminal, run:
 ```
-docker create -it --gpus all --name silnlp ghcr.io/sillsdev/silnlp:latest
+docker create -it --gpus all --name silnlp --device /dev/fuse --cap-add SYS_ADMIN --env-file path/to/env-vars-file --security-opt apparmor=path/to/docker-apparmor ghcr.io/sillsdev/silnlp:latest
 ```
 Otherwise, run:
 ```
-docker create -it --name silnlp ghcr.io/sillsdev/silnlp:latest
+docker create -it --name silnlp --device /dev/fuse --cap-add SYS_ADMIN --env-file path/to/env-vars-file --security-opt apparmor=path/to/docker-apparmor ghcr.io/sillsdev/silnlp:latest
 ```
-A docker container should be created. You should be able to see a container named 'silnlp' on the Containers page of Docker Desktop.
+If you do not intend to use SILNLP with ClearML, MinIO, and/or B2, you can exclude the --device, --cap-add, and --security-opt flags.

-5. Create file for environment variables
+You will need to replace the placeholders "path/to/env-vars-file" and "path/to/docker-apparmor" with the respective real paths.
+* The env-vars-file is the file you created at the beginning of the Environment Setup section.
+* The docker-apparmor file is in the silnlp repo. You do not need to clone the entire repo, just download the docker-apparmor file.

-Create a text file with the following content and edit as necessary:
-```
-CLEARML_API_HOST="https://api.sil.hosted.allegro.ai"
-CLEARML_API_ACCESS_KEY=xxxxxxx
-CLEARML_API_SECRET_KEY=xxxxxxx
-MINIO_ENDPOINT_URL=https://truenas.psonet.languagetechnology.org:9000
-MINIO_ACCESS_KEY=xxxxxxxxx
-MINIO_SECRET_KEY=xxxxxxx
-B2_ENDPOINT_URL=https://s3.us-east-005.backblazeb2.com
-B2_KEY_ID=xxxxxxxx
-B2_APPLICATION_KEY=xxxxxxxx
-```
-* Include SIL_NLP_DATA_PATH="/silnlp" if you are not using MinIO or B2 and will be storing files locally.
-* If you do not intend to use SILNLP with ClearML, MinIO, and/or B2, you can leave out the respective variables. If you need to generate ClearML credentials, see [ClearML setup](clear_ml_setup.md).
-* Note that this does not give you direct access to a MinIO or B2 bucket from within the Docker container, it only allows you to run scripts referencing files in the bucket.
+A docker container should be created. You should be able to see a container named 'silnlp' on the Containers page of Docker Desktop.

 6. Start container

 In a terminal, run:
 ```
 docker start silnlp
-docker exec -it --env-file path/to/env_vars_file silnlp bash
+docker exec -it silnlp bash
 ```

 * After this step, the terminal should change to say `root@xxxxx:~/silnlp#`, where `xxxxx` is a string of letters and numbers, instead of your current working directory. This is the command line for the docker container, and you're able to run SILNLP scripts from here.
 * To leave the container, run `exit`, and to stop it, run `docker stop silnlp`. It can be started again by repeating step 6. Stopping the container will not erase any changes made in the container environment, but removing it will.

+7. (Optional) Mount the rclone bucket
+
+While in the /root/silnlp directory (the default on startup), run the following command:
+
+If you are using rclone to mount MinIO (This is the default option):
+```
+./rclone_setup.sh minio
+```
+If you are using rclone to mount Backblaze (Only used as a backup option):
+```
+./rclone_setup.sh backblaze
+```
+
+This will mount the specified bucket within the docker container.
+
 ### Option 2: Conda
 1. If using a local GPU, install the corresponding [NVIDIA driver](https://www.nvidia.com/download/index.aspx)

@@ -132,24 +153,7 @@ These are the main requirements for the SILNLP code to run on a local machine. S
 poetry install
 ```

-10. If using ClearML, MinIO, and/or B2, set the following environment variables:
-```
-CLEARML_API_HOST="https://api.sil.hosted.allegro.ai"
-CLEARML_API_ACCESS_KEY=xxxxxxx
-CLEARML_API_SECRET_KEY=xxxxxxx
-MINIO_ENDPOINT_URL=https://truenas.psonet.languagetechnology.org:9000
-MINIO_ACCESS_KEY=xxxxxxxxx
-MINIO_SECRET_KEY=xxxxxxx
-B2_ENDPOINT_URL=https://s3.us-east-005.backblazeb2.com
-B2_KEY_ID=xxxxxxxx
-B2_APPLICATION_KEY=xxxxxxxx
-```
-* Include SIL_NLP_DATA_PATH="/silnlp" if you are not using MinIO or B2 and will be storing files locally.
-* If you need to generate ClearML credentials, see [ClearML setup](clear_ml_setup.md).
-* Note that this does not give you direct access to a MinIO or B2 bucket from within the Docker container, it only allows you to run scripts referencing files in the bucket.
-* For instructions on how to permanently set up environment variables for your operating system, see the corresponding section under the Development Environment Setup header below.
-
-11. If using MinIO or B2, you will need to set up rclone:
+10. If using MinIO or B2, you will need to set up rclone:
 * Mount the bucket to your filesystem following the instructions under [Install and Configure Rclone](https://github.com/sillsdev/silnlp/blob/master/bucket_setup.md#install-and-configure-rclone).

 ## Development Environment Setup
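
Since the README now asks for the environment-variables file to be created before `docker create --env-file` is run, it may help to confirm that the file assigns the variables you expect. The following is a minimal sketch only: `check_env_file` is a hypothetical helper, not part of the silnlp repo, and the temporary file stands in for your real env-vars file.

```shell
#!/bin/bash
# check_env_file FILE NAME...
# Hypothetical helper: prints each NAME that FILE does not assign and
# returns 1 if any were missing, 0 otherwise.
check_env_file() {
    local file="$1"
    shift
    local status=0
    local name
    for name in "$@"; do
        if ! grep -q "^${name}=" "$file"; then
            echo "missing: $name"
            status=1
        fi
    done
    return "$status"
}

# Demo against a throwaway file with placeholder values.
env_file="$(mktemp)"
cat > "$env_file" <<'EOF'
MINIO_ENDPOINT_URL=https://truenas.psonet.languagetechnology.org:9000
MINIO_ACCESS_KEY=xxxxxxxxx
EOF

if check_env_file "$env_file" MINIO_ENDPOINT_URL MINIO_ACCESS_KEY MINIO_SECRET_KEY; then
    echo "env file looks complete"
else
    echo "env file is incomplete"
fi
rm -f "$env_file"
```

The check only looks for `NAME=` at the start of a line; it does not validate the values themselves.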

backblaze_bucket_setup.sh

Lines changed: 0 additions & 8 deletions
This file was deleted.

minio_bucket_setup.sh

Lines changed: 0 additions & 8 deletions
This file was deleted.

rclone_setup.sh

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+#!/bin/bash
+apt-get install --no-install-recommends -y fuse3 rclone
+mkdir -p /root/.config/rclone
+cp scripts/rclone/rclone.conf /root/.config/rclone/
+BUCKET_TYPE=$1
+if [ "$BUCKET_TYPE" = "minio" ]; then
+    export SIL_NLP_DATA_PATH="/root/M"
+    mkdir -p /root/M
+    sed -i -e "s#access_key_id = x*#access_key_id = $MINIO_ACCESS_KEY#" /root/.config/rclone/rclone.conf
+    sed -i -e "s#secret_access_key = x*#secret_access_key = $MINIO_SECRET_KEY#" /root/.config/rclone/rclone.conf
+
+    echo "Mounting MinIO bucket..."
+    rclone mount --daemon --log-file=rclone_log.txt --log-level=DEBUG --vfs-cache-mode full --use-server-modtime miniosilnlp:nlp-research ~/M
+    echo "Done"
+elif [ "$BUCKET_TYPE" = "backblaze" ]; then
+    export SIL_NLP_DATA_PATH="/root/B"
+    mkdir -p /root/B
+    sed -i -e "s#account = x*#account = $B2_KEY_ID#" /root/.config/rclone/rclone.conf
+    sed -i -e "s#key = x*#key = $B2_APPLICATION_KEY#" /root/.config/rclone/rclone.conf
+
+    echo "Mounting Backblaze bucket..."
+    rclone mount --daemon --log-file=rclone_log.txt --log-level=DEBUG --vfs-cache-mode full --use-server-modtime b2silnlp:silnlp ~/B
+    echo "Done"
+fi
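
The `sed` lines in `rclone_setup.sh` fill credentials into a copied `rclone.conf` by replacing runs of `x` placeholders. The sketch below reproduces that substitution on a temporary file so it can be tried without touching a real config. The template contents and the `demo-access`/`demo-secret` values are assumptions for illustration; the actual `scripts/rclone/rclone.conf` is not shown in this commit. GNU `sed` is assumed, as in the Linux container.

```shell
#!/bin/bash
# Hypothetical template; the real scripts/rclone/rclone.conf may differ.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
[miniosilnlp]
type = s3
access_key_id = xxxxxxxx
secret_access_key = xxxxxxxx
EOF

# Placeholder credentials for illustration only.
MINIO_ACCESS_KEY="demo-access"
MINIO_SECRET_KEY="demo-secret"

# Same substitutions as in rclone_setup.sh, pointed at the temp file.
# '#' is the sed delimiter; 'x*' matches the run of placeholder x's.
sed -i -e "s#access_key_id = x*#access_key_id = $MINIO_ACCESS_KEY#" "$conf"
sed -i -e "s#secret_access_key = x*#secret_access_key = $MINIO_SECRET_KEY#" "$conf"

result="$(cat "$conf")"
echo "$result"
rm -f "$conf"
```

Because the key is interpolated directly into the `sed` expression, a value containing `#` or `&` would need escaping before this substitution behaves as intended.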
