Bacpop-190 Move databases to mrcdata #44
EmmaLRussell
left a comment
Clever stuff! Works great. Suggested a little extra comment, just to make it obvious what's going on.
Maybe this script should be renamed download_databases, as it's plural now.
So there's a GPS and a GBS db on mrcdata at the moment - GBS one is strep?
You probably want to update .gitignore to just ignore the whole storage folder - currently ignoring GPS files only.
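One way to make that .gitignore change repeatable is sketched below. The helper name `ensure_ignored` and the `storage/` pattern are assumptions for illustration; the actual folder path in the repo may differ.

```shell
# Hypothetical helper: add a pattern to a .gitignore file only if it is
# not already listed, so running it twice does not duplicate the entry.
ensure_ignored() {
  local pattern=$1 file=${2:-.gitignore}
  touch "$file"
  # -x matches whole lines, -F treats the pattern as a fixed string.
  grep -qxF -- "$pattern" "$file" || printf '%s\n' "$pattern" >> "$file"
}

# Demo in a throwaway directory rather than the real repository.
DEMO_DIR=$(mktemp -d)
ensure_ignored "storage/" "$DEMO_DIR/.gitignore"
ensure_ignored "storage/" "$DEMO_DIR/.gitignore"   # second call is a no-op
cat "$DEMO_DIR/.gitignore"
```

Ignoring the folder as a whole means newly added databases (such as the GBS one) are covered without listing each file.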
scripts/download_db
Outdated
| # Define color codes | ||
| GREEN='\e[32m' | ||
| YELLOW='\e[33m' | ||
| RED='\e[31m' |
This never gets used!
scripts/download_db
Outdated
| wget --progress=dot:giga $URL -O $DBBZ2 | ||
| # Unpack the file and place it in the storage directory | ||
| echo -e "${GREEN}Unpacking $FILE to $DEST${NC}" | ||
| mkdir -p $DEST |
You shouldn't need to do this in a loop though I guess it doesn't hurt. Could do it before the while, if there are any files.
I just had to have it here because we don't want to run this if the DB is already downloaded.
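The trade-off discussed here can be sketched as follows: check per database whether it is already present, and only create the destination when something will actually be unpacked. This is a minimal sketch, not the script's real logic; `needs_download`, the archive names `db_a.tar.gz` and `db_b.tar.gz`, and the "directory exists means downloaded" rule are all assumptions.

```shell
# Hypothetical helper: succeeds (exit 0) when the database for an archive
# has not been unpacked yet. An existing directory counts as "downloaded".
needs_download() {
  local dbs_dest=$1 file=$2
  local dest_dir
  dest_dir="$dbs_dest/$(basename "$file" .tar.gz)"
  [ ! -d "$dest_dir" ]
}

# Demo against a throwaway directory; no network access is needed.
DBS_DEST=$(mktemp -d)
mkdir -p "$DBS_DEST/db_a"   # pretend db_a is already unpacked

for FILE in db_a.tar.gz db_b.tar.gz; do
  if needs_download "$DBS_DEST" "$FILE"; then
    mkdir -p "$DBS_DEST"    # only reached when a download would happen
    echo "would download $FILE"
  else
    echo "skipping $FILE (already present)"
  fi
done
```

Because `mkdir -p` is idempotent, keeping it inside the loop costs little, which matches the reviewer's "doesn't hurt" remark.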
| mkdir -p $DEST | ||
| (cd $DEST && tar -xf $DBBZ2 && rm -f $DBBZ2) | ||
| # Fetch the HTML content of the URL |
| # Fetch the HTML content of the URL | |
| # We'll download all tar.gz files named in the directory listing page at BASE_URL | |
| # Fetch the HTML content of the URL |
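What the suggested comment describes, picking the tar.gz names out of a directory listing page, can be sketched roughly like this. The `list_archives` function and the sample listing are assumptions for illustration; the real script's parsing may differ, and a real run would fetch the page from BASE_URL (for example with `wget -qO-`) instead of using a hard-coded string.

```shell
# Hypothetical helper: read HTML on stdin and print every href value
# that ends in .tar.gz, one per line.
list_archives() {
  grep -oE 'href="[^"]+"' |
    sed -E 's/^href="//; s/"$//' |
    grep -E '\.tar\.gz$'
}

# Made-up listing content standing in for the page at BASE_URL.
LISTING='<a href="../">../</a>
<a href="db_a.tar.gz">db_a.tar.gz</a>
<a href="notes.txt">notes.txt</a>
<a href="db_b.tar.gz">db_b.tar.gz</a>'

# Loop over each archive name, deriving the per-database directory name.
printf '%s\n' "$LISTING" | list_archives | while read -r FILE; do
  echo "found $FILE -> $(basename "$FILE" .tar.gz)"
done
```

Filtering on the `.tar.gz` suffix means other links in the listing, such as the parent-directory entry, are ignored.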
scripts/download_db
Outdated
| grep -E '\.tar\.gz$' | \ | ||
| # Loop over each file URL and download it | ||
| while read -r FILE; do | ||
| DEST_DIR=$DEST/$(basename $FILE .tar.gz) |
I know this predates these changes, but I find the DEST/DEST_DIR distinction a bit confusing. Could we just rename $DEST to $STORAGE?
I have updated it to DBS_DEST..., since the storage location and the DBs location are separate, so renaming to STORAGE may cause confusion.
| docker volume create $VOLUME | ||
| docker run --rm -v $VOLUME:/beebop/storage $TAG_SHA \ | ||
| ./scripts/download_db --small storage | ||
| ./scripts/download_databases --small storage |
Don't need the small flag anymore, I guess!
The databases have now been moved to mrcdata, and the download_db script now downloads all databases in the beebop location.