Commit f37fbe8

Merge pull request #115 from pepkit/dev
Release 0.12.2
2 parents cb1c4e4 + 3ad1dda commit f37fbe8

31 files changed: +282 -6160 lines changed

.github/workflows/run-codecov.yml

Lines changed: 21 additions & 4 deletions
@@ -1,21 +1,38 @@
 name: Run codecov

 on:
+  push:
+    branches: [dev]
   pull_request:
-    branches: [master, dev]
+    branches: [master]

 jobs:
   pytest:
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
-        python-version: [3.9]
+        python-version: [3.11]
         os: [ubuntu-latest]

     steps:
       - uses: actions/checkout@v2
+
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v2
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Install test dependencies
+        run: if [ -f requirements/requirements-test.txt ]; then pip install -r requirements/requirements-test.txt; fi
+
+      - name: Install package
+        run: python -m pip install .
+
+      - name: Run pytest tests
+        run: pytest tests --cov=./ --cov-report=xml
+
       - name: Upload coverage to Codecov
-        uses: codecov/codecov-action@v2
+        uses: codecov/codecov-action@v3
         with:
           file: ./coverage.xml
-          name: py-${{ matrix.python-version }}-${{ matrix.os }}
+          name: py-${{ matrix.python-version }}-${{ matrix.os }}

.github/workflows/run-pytest.yml

Lines changed: 1 addition & 4 deletions
@@ -11,7 +11,7 @@ jobs:
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
-        python-version: ["3.8", "3.9", "3.10"]
+        python-version: ["3.8", "3.11"]
         os: [ubuntu-latest]

     steps:
@@ -22,9 +22,6 @@ jobs:
         with:
           python-version: ${{ matrix.python-version }}

-      - name: Install dev dependencies
-        run: if [ -f requirements/requirements-dev.txt ]; then pip install -r requirements/requirements-dev.txt; fi
-
       - name: Install test dependencies
         run: if [ -f requirements/requirements-test.txt ]; then pip install -r requirements/requirements-test.txt; fi

.gitignore

Lines changed: 2 additions & 1 deletion
@@ -3,6 +3,7 @@

 # Python
 *.pyc
+build/

 # ignore test results
 tests/test/*
@@ -94,4 +95,4 @@ docs_jupyter/*
 .env/
 env/
 .venv/
-venv/
+venv/

docs/changelog.md

Lines changed: 4 additions & 0 deletions
@@ -1,5 +1,9 @@
 # Changelog

+## [0.12.2] -- 2023-04-25
+- Added `max-prefetch-size` argument. #113
+- Improved code and logger structure.
+
 ## [0.12.0] -- 2023-03-27
 - Added functionality that saves gse metadata to config file
 - Fixed description in initialization of pepy object
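
The changelog entry above introduces `max-prefetch-size`, which the updated help text describes as a value passed to the `prefetch` program's `--max-size` option. A minimal sketch of that kind of pass-through follows. It is not geofetch's actual code: the reduced parser, the `prefetch_command` helper, the `SRR0000001` accession, and the `50G` value are all assumptions made for illustration.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical, reduced parser: only the options needed for this illustration.
    parser = argparse.ArgumentParser(prog="geofetch-sketch")
    parser.add_argument("-i", "--input", required=True)
    parser.add_argument("--max-prefetch-size", default=None)
    return parser


def prefetch_command(accession: str, max_size: Optional[str]) -> List[str]:
    # Assumption: the value is forwarded unchanged to prefetch's --max-size option.
    cmd = ["prefetch", accession]
    if max_size is not None:
        cmd += ["--max-size", max_size]
    return cmd


# argparse maps "--max-prefetch-size" to the attribute "max_prefetch_size".
args = build_parser().parse_args(["-i", "GSE67303", "--max-prefetch-size", "50G"])
print(prefetch_command("SRR0000001", args.max_prefetch_size))
```

When the option is omitted, the sketch leaves `--max-size` off the command entirely, so `prefetch` would fall back to its own default limit.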

docs/gse_finder.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -17,7 +17,7 @@ from geofetch import Finder
1717
gse_obj = Finder()
1818

1919
# Optionally: provide filter string and max number of retrieve elements
20-
gse_obj = Finder(filter="((bed) OR narrow peak) AND Homo sapiens[Organism]", retmax=10)
20+
gse_obj = Finder(filters="((bed) OR narrow peak) AND Homo sapiens[Organism]", retmax=10)
2121
```
2222

2323
1) Get list of all GSE in GEO

docs/usage.md

Lines changed: 86 additions & 67 deletions
@@ -1,108 +1,127 @@
-# Usage reference
+# <img src="./img/geofetch_logo.svg" class="img-header"> usage reference
+
+`geofetch` command-line usage instructions:

-geofetch command-line usage instructions:

-`geofetch -V`
-```console
-geofetch 0.11.0
-```

 `geofetch --help`
-```console
-usage: geofetch [-h] [-V] -i INPUT [-n NAME] [-m METADATA_ROOT] [-u METADATA_FOLDER]
-                [--just-metadata] [-r] [--config-template CONFIG_TEMPLATE]
-                [--pipeline-samples PIPELINE_SAMPLES] [--pipeline-project PIPELINE_PROJECT]
-                [--disable-progressbar] [-k SKIP] [--acc-anno] [--discard-soft]
-                [--const-limit-project CONST_LIMIT_PROJECT]
-                [--const-limit-discard CONST_LIMIT_DISCARD]
-                [--attr-limit-truncate ATTR_LIMIT_TRUNCATE] [--add-dotfile] [-p]
-                [--data-source {all,samples,series}] [--filter FILTER]
-                [--filter-size FILTER_SIZE] [-g GEO_FOLDER] [-x] [-b BAM_FOLDER]
-                [-f FQ_FOLDER] [--use-key-subset] [--silent] [--verbosity V] [--logdev]
+```{console}
+usage: geofetch [<args>]
+
+The example how to use geofetch (to download GSE573030 just metadata):
+    geofetch -i GSE67303 -m <folder> --just-metadata
+
+To download all processed data of GSE57303:
+    geofetch -i GSE67303 --processed --geo-folder <folder> -m <folder>

 Automatic GEO and SRA data downloader

-optional arguments:
+options:
   -h, --help            show this help message and exit
   -V, --version         show program's version number and exit
   -i INPUT, --input INPUT
-                        required: a GEO (GSE) accession, or a file with a list of GSE
-                        numbers
+                        required: a GEO (GSE) accession, or a file with a list
+                        of GSE numbers
   -n NAME, --name NAME  Specify a project name. Defaults to GSE number
   -m METADATA_ROOT, --metadata-root METADATA_ROOT
-                        Specify a parent folder location to store metadata. The project name
-                        will be added as a subfolder [Default: $SRAMETA:]
+                        Specify a parent folder location to store metadata.
+                        The project name will be added as a subfolder
+                        [Default: $SRAMETA:]
   -u METADATA_FOLDER, --metadata-folder METADATA_FOLDER
-                        Specify an absolute folder location to store metadata. No subfolder
-                        will be added. Overrides value of --metadata-root [Default: Not used
-                        (--metadata-root is used by default)]
-  --just-metadata       If set, don't actually run downloads, just create metadata
+                        Specify an absolute folder location to store metadata.
+                        No subfolder will be added. Overrides value of
+                        --metadata-root.
+  --just-metadata       If set, don't actually run downloads, just create
+                        metadata
   -r, --refresh-metadata
                         If set, re-download metadata even if it exists.
   --config-template CONFIG_TEMPLATE
                         Project config yaml file template.
   --pipeline-samples PIPELINE_SAMPLES
-                        Optional: Specify one or more filepaths to SAMPLES pipeline
-                        interface yaml files. These will be added to the project config file
-                        to make it immediately compatible with looper. [Default: null]
+                        Optional: Specify one or more filepaths to SAMPLES
+                        pipeline interface yaml files. These will be added to
+                        the project config file to make it immediately
+                        compatible with looper. [Default: null]
   --pipeline-project PIPELINE_PROJECT
-                        Optional: Specify one or more filepaths to PROJECT pipeline
-                        interface yaml files. These will be added to the project config file
-                        to make it immediately compatible with looper. [Default: null]
+                        Optional: Specify one or more filepaths to PROJECT
+                        pipeline interface yaml files. These will be added to
+                        the project config file to make it immediately
+                        compatible with looper. [Default: null]
   --disable-progressbar
                         Optional: Disable progressbar
   -k SKIP, --skip SKIP  Skip some accessions. [Default: no skip].
-  --acc-anno            Optional: Produce annotation sheets for each accession. Project
-                        combined PEP for the whole project won't be produced.
-  --discard-soft        Optional: After creation of PEP files, all soft and additional files
+  --acc-anno            Optional: Produce annotation sheets for each
+                        accession. Project combined PEP for the whole project
+                        won't be produced.
+  --discard-soft        Optional: After creation of PEP files, all .soft files
                         will be deleted
   --const-limit-project CONST_LIMIT_PROJECT
-                        Optional: Limit of the number of the constant sample characters that
-                        should not be in project yaml. [Default: 50]
+                        Optional: Limit of the number of the constant sample
+                        characters that should not be in project yaml.
+                        [Default: 50]
   --const-limit-discard CONST_LIMIT_DISCARD
-                        Optional: Limit of the number of the constant sample characters that
-                        should not be discarded [Default: 250]
+                        Optional: Limit of the number of the constant sample
+                        characters that should not be discarded [Default: 250]
   --attr-limit-truncate ATTR_LIMIT_TRUNCATE
-                        Optional: Limit of the number of sample characters.Any attribute
-                        with more than X characters will truncate to the first X, where X is
-                        a number of characters [Default: 500]
-  --add-dotfile         Optional: Add .pep.yaml file that points .yaml PEP file
+                        Optional: Limit of the number of sample characters.Any
+                        attribute with more than X characters will truncate to
+                        the first X, where X is a number of characters
+                        [Default: 500]
+  --add-dotfile         Optional: Add .pep.yaml file that points .yaml PEP
+                        file
+  --max-soft-size MAX_SOFT_SIZE
+                        Optional: Max size of soft file. [Default: 1GB].
+                        Supported input formats : 12B, 12KB, 12MB, 12GB.
+  --max-prefetch-size MAX_PREFETCH_SIZE
+                        Argument to pass to prefetch program's --max-size
+                        option, if prefetch will be used in this run of
+                        geofetch; for reference: https://github.com/ncbi/sra-
+                        tools/wiki/08.-prefetch-and-fasterq-dump#check-the-
+                        maximum-size-limit-of-the-prefetch-tool
   --silent              Silence logging. Overrides verbosity.
   --verbosity V         Set logging level (1-5 or logging module level name)
   --logdev              Expand content of logging message format.

 processed:
   -p, --processed       Download processed data [Default: download raw data].
   --data-source {all,samples,series}
-                        Optional: Specifies the source of data on the GEO record to retrieve
-                        processed data, which may be attached to the collective series
-                        entity, or to individual samples. Allowable values are: samples,
-                        series or both (all). Ignored unless 'processed' flag is set.
-                        [Default: samples]
-  --filter FILTER       Optional: Filter regex for processed filenames [Default:
-                        None].Ignored unless 'processed' flag is set.
+                        Optional: Specifies the source of data on the GEO
+                        record to retrieve processed data, which may be
+                        attached to the collective series entity, or to
+                        individual samples. Allowable values are: samples,
+                        series or both (all). Ignored unless 'processed' flag
+                        is set. [Default: samples]
+  --filter FILTER       Optional: Filter regex for processed filenames
+                        [Default: None].Ignored unless 'processed' flag is
+                        set.
   --filter-size FILTER_SIZE
-                        Optional: Filter size for processed files that are stored as sample
-                        repository [Default: None]. Works only for sample data. Supported
-                        input formats : 12B, 12KB, 12MB, 12GB. Ignored unless 'processed'
-                        flag is set.
+                        Optional: Filter size for processed files that are
+                        stored as sample repository [Default: None]. Works
+                        only for sample data. Supported input formats : 12B,
+                        12KB, 12MB, 12GB. Ignored unless 'processed' flag is
+                        set.
   -g GEO_FOLDER, --geo-folder GEO_FOLDER
-                        Optional: Specify a location to store processed GEO files. Ignored
-                        unless 'processed' flag is set.[Default: $GEODATA:]
+                        Optional: Specify a location to store processed GEO
+                        files. Ignored unless 'processed' flag is
+                        set.[Default: $GEODATA:]

 raw:
   -x, --split-experiments
-                        Split SRR runs into individual samples. By default, SRX experiments
-                        with multiple SRR Runs will have a single entry in the annotation
-                        table, with each run as a separate row in the subannotation table.
-                        This setting instead treats each run as a separate sample
+                        Split SRR runs into individual samples. By default,
+                        SRX experiments with multiple SRR Runs will have a
+                        single entry in the annotation table, with each run as
+                        a separate row in the subannotation table. This
+                        setting instead treats each run as a separate sample
   -b BAM_FOLDER, --bam-folder BAM_FOLDER
-                        Optional: Specify folder of bam files. Geofetch will not download
-                        sra files when corresponding bam files already exist. [Default:
-                        $SRABAM:]
+                        Optional: Specify folder of bam files. Geofetch will
+                        not download sra files when corresponding bam files
+                        already exist. [Default: $SRABAM:]
   -f FQ_FOLDER, --fq-folder FQ_FOLDER
-                        Optional: Specify folder of fastq files. Geofetch will not download
-                        sra files when corresponding fastq files already exist. [Default:
-                        $SRAFQ:]
-  --use-key-subset      Use just the keys defined in this module when writing out metadata.
+                        Optional: Specify folder of fastq files. Geofetch will
+                        not download sra files when corresponding fastq files
+                        already exist. [Default: $SRAFQ:]
+  --use-key-subset      Use just the keys defined in this module when writing
+                        out metadata.
+  --add-convert-modifier
+                        Add looper SRA convert modifier to config file.
 ```
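
The help text above lists `12B, 12KB, 12MB, 12GB` as supported input formats for `--max-soft-size` and `--filter-size`. A minimal sketch of how strings in that shape could be turned into a byte count is given below. The `parse_size` helper is hypothetical and is not taken from geofetch, and the use of 1024-based multiples is an assumption; the real tool may count in powers of 1000.

```python
import re

# Assumption: binary multiples; the source does not say which convention applies.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}


def parse_size(text: str) -> int:
    """Convert a string such as '12KB' to a number of bytes (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)\s*(B|KB|MB|GB)", text.strip().upper())
    if match is None:
        raise ValueError(f"unsupported size format: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


for example in ("12B", "12KB", "12MB", "12GB"):
    print(example, parse_size(example))
```

Rejecting anything outside the four listed suffixes keeps the sketch aligned with the formats the help text names, instead of silently accepting units the documentation does not mention.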

docs_jupyter/build/processed-data-downloading.md

Lines changed: 0 additions & 5 deletions
@@ -24,11 +24,6 @@ Calling geofetch will do 4 tasks:

 Complete details about geofetch outputs is cataloged in the [metadata outputs reference](metadata_output.md).

-from IPython.core.display import SVG
-SVG(filename='logo.svg')
-
-![arguments_outputs.svg](attachment:arguments_outputs.svg)
-
 ## Download the data

 First, create the metadata for processed data (by adding --processed and --just-metadata):

geofetch/__init__.py

Lines changed: 8 additions & 3 deletions
@@ -1,7 +1,12 @@
 """ Package-level data """
-from .geofetch import *
-from .finder import *
-from ._version import __version__
 import logmuse

+from geofetch.geofetch import *
+from geofetch.finder import *
+from geofetch._version import __version__
+
+
+__author__ = ["Oleksandr Khoroshevskyi", "Vince Reuter", "Nathan Sheffield"]
+__all__ = ["Finder", "Geofetcher"]
+
 logmuse.init_logger("geofetch")
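
The `__all__` list added above limits what `from geofetch import *` exposes. The following sketch shows the general Python behaviour using a throwaway in-memory module; `demo_pkg` and its `helper` attribute are made up for the demonstration and have nothing to do with geofetch's real contents.

```python
import sys
import types

# Build a stand-in module (hypothetical name) with two public classes and one helper.
demo = types.ModuleType("demo_pkg")
demo.__all__ = ["Finder", "Geofetcher"]
demo.Finder = type("Finder", (), {})
demo.Geofetcher = type("Geofetcher", (), {})
demo.helper = lambda: None  # deliberately left out of __all__
sys.modules["demo_pkg"] = demo

# A star-import only copies the names listed in __all__.
namespace = {}
exec("from demo_pkg import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

Names omitted from `__all__` remain reachable through explicit imports such as `from demo_pkg import helper`; only the star-import form is affected.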

geofetch/__main__.py

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+import sys
+from geofetch.geofetch import main
+
+if __name__ == "__main__":
+    try:
+        sys.exit(main())
+
+    except KeyboardInterrupt:
+        print("Pipeline aborted.")
+        sys.exit(1)
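
Adding `__main__.py` lets the package be started with `python -m geofetch`, with Ctrl-C mapped to exit status 1. The pattern can be tried without geofetch installed, as in the sketch below. The `run` wrapper and the two stand-in entry points are hypothetical; `run` returns the exit code instead of calling `sys.exit` so the behaviour is easy to inspect.

```python
def run(entry_point) -> int:
    """Call an entry point and translate the outcome into an exit code."""
    try:
        # A None return value counts as success, as it does for sys.exit(None).
        return entry_point() or 0
    except KeyboardInterrupt:
        print("Pipeline aborted.")
        return 1


def finishes_normally():
    # Stand-in for a main() that completes and returns None.
    return None


def interrupted():
    # Stand-in for a main() that the user interrupts with Ctrl-C.
    raise KeyboardInterrupt


print(run(finishes_normally))
print(run(interrupted))
```

In a real `__main__.py` the result would be handed to `sys.exit`, as the committed file does directly.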

geofetch/_version.py

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__version__ = "0.12.1"
+__version__ = "0.12.2"
