
Commit bb8d3a9

author
maverick
committed
Merge branch 'release/0.1.7'
2 parents 713b0bd + 05d220d commit bb8d3a9

33 files changed: +1073 -799 lines changed

CHANGELOG.md

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,24 @@
1+
## 0.1.7
2+
### New Features
3+
* Created core/DEFAULT-TARGETV constant; use in clj_frontend
4+
* Added with-ns macro to support c4 runtime dependency resolution
5+
* Added support for quoting filenames, to permit filenames that otherwise look like drake rules
6+
* Added `--var x=y --var a=b` syntax, as better version of `--vars x=y,a=b`
7+
* Added `--graph` option to help visualize a workflow
8+
* bugfix: [#157](https://github.com/Factual/drake/issues/157)
9+
* Added [handy drake script](https://github.com/Factual/drake/blob/master/bin/drake) for convenience and for better control over the Hadoop client version.
10+
11+
### Bug Fixes
12+
* Allow unindented blank lines in step definitions, as per [#72](https://github.com/Factual/drake/issues/129)
13+
* Fix some bugs related to parsing, and to over-optimistically running shell expansions
14+
15+
### Maintenance / Generic Improvements
16+
* Made it easier to run drake from inside a clojure repl/nrepl session (removing System/exit calls)
17+
* Generally improved error-message output, esp. removing repetitive or useless spam and adding line/column number to parse errors
18+
* Upgraded to Clojure 1.6
19+
* Numerous performance improvements, which may actually be noticeable on large workflows
20+
* Changed default value for `BASE`
21+
122
## 0.1.6
223

324
* Add [a Clojure frontend](https://github.com/Factual/drake/wiki/A-Clojure-Frontend-to-Drake) (thanks morrifeldman)

README.md

Lines changed: 10 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -26,7 +26,7 @@ We've not tested it on other operating systems.
2626

2727
You can build Drake from source, which is the preferred way to run the most
2828
up-to-date version, or you can
29-
[download a prebuilt uberjar](https://docs.google.com/uc?export=download&confirm=nT8F&id=0B2xtKcFEL6wwWnRzVzRZcGFFaWc)
29+
[download a prebuilt uberjar](https://github.com/Factual/drake/releases/download/v0.1.7/drake.jar)
3030

3131
, which may not be the most recent version of Drake.
3232

@@ -59,16 +59,19 @@ You can pass in arguments and options to Drake by putting them at the end of the
5959
$ java -jar drake.jar --version
6060
```
6161

62-
### A nicer way to run Drake
62+
### Use Drake as a Clojure library
6363

64-
We recommend you "install" Drake in your environment so that you can run it by just typing "drake". Here's a convenience script you can put on your path:
64+
You can programmatically use Drake from your Clojure project by using [Drake's Clojure front end](https://github.com/Factual/drake/wiki/A-Clojure-Frontend-to-Drake). Your project.clj dependencies should include the latest Drake library, e.g.:
6565

66-
```bash
67-
#!/bin/bash
68-
java -cp $(dirname $0)/drake.jar drake.core "$@"
66+
```clojure
67+
[factual/drake "0.1.7"]
6968
```
7069

71-
Save that as `drake`, then do `chmod 755 drake`. Move the uberjar to be in the same directory. Now you can just type `drake` to run Drake from anywhere.
70+
### A nicer way to run Drake
71+
72+
For command-line usage, we provide a handy [bash script for drake](http://github.com/Factual/drake/blob/master/bin/drake). You can do either of the following:
73+
* Get the script and the drake jar, and set $DRAKE_HOME to the folder containing the jar.
74+
* Clone the repo and run the script from {project_root}/bin. If you make a symlink to the script, set $DRAKE_HOME to the project root folder; you can then type `drake` to run Drake from anywhere. (You may also need Leiningen installed to build the uberjar.)
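
The advice to set $DRAKE_HOME when symlinking follows from how the script derives its default home directory from `$0`. The sketch below is a minimal illustration and not part of the repository: `drake-demo` is a hypothetical stand-in that only prints the directory it would use, and the temporary layout is invented for the demonstration.

```bash
# Start from a clean environment so the default derivation is exercised.
unset DRAKE_HOME

# Throwaway layout: <tmp>/repo/bin/drake-demo plus a symlink in <tmp>/path.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/repo/bin" "$demo_root/path"

# Hypothetical stand-in for bin/drake: prints the home directory it would use,
# with the same "parent of the script's directory" default.
cat > "$demo_root/repo/bin/drake-demo" <<'EOF'
default_home=$(dirname "$(dirname "$0")")
echo "${DRAKE_HOME-$default_home}"
EOF
ln -s "$demo_root/repo/bin/drake-demo" "$demo_root/path/drake-demo"

# Invoked through sh so the demo does not depend on the executable bit.
direct=$(sh "$demo_root/repo/bin/drake-demo")      # resolves to <tmp>/repo
via_link=$(sh "$demo_root/path/drake-demo")        # resolves to <tmp>, not the repo
with_home=$(DRAKE_HOME="$demo_root/repo" sh "$demo_root/path/drake-demo")

echo "direct:    $direct"
echo "via link:  $via_link"
echo "with home: $with_home"
```

Run through the symlink, `$0` is the symlink's path, so the derived directory is the parent of wherever the link lives. Setting $DRAKE_HOME explicitly sidesteps that.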
7275

7376
### Faster startup time
7477

bin/drake

Lines changed: 15 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,18 +1,28 @@
11
#!/bin/bash
22
# Runs Drake without Nailgun (needs an uberjar)
3-
DRAKE_HOME=$(dirname $(dirname $0))
4-
DRAKE_JAR=$DRAKE_HOME/target/drake.jar
3+
DRAKE_HOME=${DRAKE_HOME-$(dirname $(dirname $0))}
4+
echo "Using DRAKE_HOME: ${DRAKE_HOME}"
5+
6+
DRAKE_JAR=$DRAKE_HOME/drake.jar
7+
8+
if [[ ! -f $DRAKE_JAR ]]; then
9+
DRAKE_JAR=$DRAKE_HOME/target/drake.jar
10+
fi
511

612
if [[ ! -f $DRAKE_JAR ]]; then
713
echo "========= UBERJAR COMPILING ========="
814
cd $DRAKE_HOME
915
lein uberjar
16+
cd - > /dev/null
1017
echo "========= UBERJAR COMPILED ========="
1118
fi
1219

20+
if [[ `which hadoop` ]]; then
21+
HADOOP_CLASSPATH=`hadoop classpath 2>/dev/null`
22+
fi
23+
1324
if [[ `which drip` ]]; then
14-
drip -jar $DRAKE_JAR "$@"
25+
drip -cp ${HADOOP_CLASSPATH}:${DRAKE_JAR} drake.core "$@"
1526
else
16-
java -jar $DRAKE_JAR "$@"
27+
java -cp ${HADOOP_CLASSPATH}:${DRAKE_JAR} drake.core "$@"
1728
fi
18-
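
Two details of the new script may be worth a closer look: the `${DRAKE_HOME-...}` expansion, and the way the classpath is joined. The sketch below is illustrative only. The helper names `resolve_home`, `resolve_home_nonempty` and `join_classpath` are hypothetical and do not appear in bin/drake.

```bash
# resolve_home DEFAULT
# Mirrors the ${DRAKE_HOME-...} form used in the script: DEFAULT is used only
# when DRAKE_HOME is unset. An empty-but-set DRAKE_HOME is kept as is.
resolve_home() {
  echo "${DRAKE_HOME-$1}"
}

# resolve_home_nonempty DEFAULT
# The ${VAR:-default} variant, shown for comparison: DEFAULT is also used when
# DRAKE_HOME is set to the empty string.
resolve_home_nonempty() {
  echo "${DRAKE_HOME:-$1}"
}

# join_classpath FIRST SECOND
# Joins two classpath parts with ':' only when both are non-empty, so a
# missing Hadoop classpath does not leave a leading ':' (an empty entry).
join_classpath() {
  if [ -n "$1" ] && [ -n "$2" ]; then
    echo "$1:$2"
  else
    echo "$1$2"
  fi
}

# Example usage with made-up paths:
unset DRAKE_HOME
resolve_home /opt/drake                 # prints /opt/drake
join_classpath "" /opt/drake/drake.jar  # prints /opt/drake/drake.jar
```

As committed, `${HADOOP_CLASSPATH}:${DRAKE_JAR}` starts with `:` whenever `hadoop` is not installed. Java launchers commonly treat an empty classpath entry as the current directory, so a guarded join like the one sketched here is one possible way to avoid that, assuming the behavior is unwanted.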

bin/run.sh

Lines changed: 0 additions & 3 deletions
This file was deleted.

demos/clj-frontend/project.clj

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,5 +4,5 @@
44
:license {:name "Eclipse Public License"
55
:url "http://www.eclipse.org/legal/epl-v10.html"}
66
:dependencies [[org.clojure/clojure "1.5.1"]
7-
[factual/drake "0.1.5"]]
7+
[factual/drake "0.1.6"]]
88
:repl-options {:init-ns clj-frontend.demo})

demos/clj-frontend/src/clj_frontend/demo.clj

Lines changed: 9 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -35,7 +35,7 @@
3535
[]
3636
["echo \"This is the first output.\" > $OUTPUT0"
3737
"echo \"This is the second output.\" > $OUTPUT1"] ;multiple commands
38-
:timecheck false) ;options are key value pairs
38+
{:timecheck false}) ; the last argument to step and method is an option map
3939
(method
4040
"test_method"
4141
["echo \"Here we are using a method.\" > $OUTPUT"])
@@ -53,7 +53,13 @@
5353
;; $[XXX] substitution is allowed in commands.
5454
["echo \"This is the third output.\" > $OUTPUT"
5555
"echo \"test_var is set to $test_var - $[test_var].\" >> $OUTPUT"
56-
"echo \"The file $INPUT contains:\" | cat - $INPUT >> $[OUTPUT]"])))
56+
"echo \"The file $INPUT contains:\" | cat - $INPUT >> $[OUTPUT]"])
57+
(cmd-step
58+
["output"]
59+
[]
60+
["echo $MSG > $OUTPUT"]
61+
{:timecheck false,
62+
:vars {"MSG" "My Message"}}))) ; you can also set vars in the option map
5763

5864
;; (run-workflow advanced-workflow :preview true)
5965
;; (run-workflow advanced-workflow)
@@ -96,7 +102,7 @@
96102
["raw_data"]
97103
[]
98104
["wget -O $OUTPUT " url] ;get the data
99-
:timecheck false)
105+
{:timecheck false})
100106
(cmd-step
101107
["sorted_data"]
102108
["raw_data"]

project.clj

Lines changed: 11 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,31 +1,34 @@
1-
(defproject factual/drake "0.1.5"
1+
(defproject factual/drake "0.1.7"
22
:description "Drake: the data processing workflow tool (a.k.a. 'make for data')"
33
:url "https://github.com/Factual/drake"
44
:license {:name "Eclipse Public License"
55
:url "http://www.eclipse.org/legal/epl-v10.html"}
6-
:dependencies [[org.clojure/clojure "1.4.0"]
6+
:scm {:name "git"
7+
:url "https://github.com/Factual/drake"}
8+
:signing {:gpg-key "1402451C"}
9+
:deploy-repositories [["clojars" {:creds :gpg}]]
10+
:dependencies [[org.clojure/clojure "1.6.0"]
711
[org.clojure/core.memoize "0.5.6"]
812
[factual/drake-interface "0.0.1"]
913
[org.clojure/tools.logging "0.2.3"]
1014
[clj-logging-config "1.9.6"]
11-
[clojopts/clojopts "0.3.2"]
15+
[clojopts/clojopts "0.3.4"]
16+
[org.flatland/useful "0.11.3"]
1217
[fs "1.3.2"]
1318
[factual/jlk-time "0.1"]
1419
[clj-time "0.6.0"]
1520
[digest "1.4.0"]
1621
[com.google.guava/guava "14.0.1"]
1722
[cheshire "5.2.0"]
23+
[rhizome "0.2.5"]
1824
[slingshot "0.10.3"]
1925
[factual/fnparse "2.3.0"]
2026
[commons-codec/commons-codec "1.6"]
2127
[factual/sosueme "0.0.15"]
2228
[factual/c4 "0.2.0"]
23-
;; for HDFS support
24-
[hdfs-clj "0.1.0"]
25-
;; you may need to change this to be compatible with your cluster
29+
[hdfs-clj "0.1.3"] ;; for HDFS support
2630
[org.apache.hadoop/hadoop-core "0.20.2"]
27-
;; for AWS S3 support
28-
[clj-aws-s3 "0.3.3"]
31+
[clj-aws-s3 "0.3.10"] ;; for AWS S3 support
2932
;; for plugins
3033
[com.cemerick/pomegranate "0.2.0"]]
3134
:test-selectors {:regression :regression

release-checklist.md

Lines changed: 5 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,11 @@
11
* Update Changelog
22
* Update tests
3-
* Update core VERSION tag
3+
* Update VERSION constant in core.clj
44
* Update project.clj version tag
5-
* Push new uberjar
6-
* Update README uberjar link
5+
* Release as library to Clojars
6+
* Update README project.clj dependency example
77
* Update README feature docs
88
* Update wiki feature docs
9-
* Create release label
9+
* Create release label, using github release feature
10+
* Update README uberjar link
1011
* Notify newsgroup

src/drake/clj_frontend.clj

Lines changed: 38 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -25,22 +25,27 @@
2525
by an opening %. cmds should be a vector of strings or nil/false for
2626
steps that don't need commands like method and template
2727
steps. Standard drake options can be appended inline as key value
28-
pairs, e.g. :method-mode true. See method-step, cmd-step, template
29-
and template-step."
30-
[w-flow outputs inputs cmds & {:keys [template]
31-
:as options}]
32-
(let [vars (:vars w-flow)
28+
pairs, e.g. :method-mode true; or (preferred) you may pass a single
29+
option-map argument. Variables for the step can be given in
30+
the :vars key and will be substituted. See method-step, cmd-step,
31+
template and template-step."
32+
[w-flow outputs inputs cmds & options]
33+
(let [{:keys [template] :as options} (utils/varargs->map options)
34+
w-flow-vars (:vars w-flow)
35+
step-vars (utils/var-sub-map w-flow-vars (:vars options))
36+
vars (merge w-flow-vars step-vars)
37+
options (dissoc options :vars)
3338
base (parse/add-path-sep-suffix
3439
(get vars "BASE" parse/default-base))
3540

3641
[intags infiles] (utils/split-tags-from-files inputs)
3742
intags (utils/remove-tag-symbol intags)
38-
sub-infiles (map (partial utils/var-sub->str vars) infiles)
43+
sub-infiles (map (partial utils/var-sub vars) infiles)
3944
infiles-with-base (map (partial parse/add-prefix base) sub-infiles)
4045

4146
[outtags outfiles] (utils/split-tags-from-files outputs)
4247
outtags (utils/remove-tag-symbol outtags)
43-
sub-outfiles (map (partial utils/var-sub->str vars) outfiles)
48+
sub-outfiles (map (partial utils/var-sub vars) outfiles)
4449
outfiles-with-base (map (partial parse/add-prefix base) sub-outfiles)
4550

4651
;; this is used for target matching, just remove all
@@ -79,7 +84,8 @@
7984
and specify :method and :method-mode options. See step for outputs
8085
inputs and options."
8186
[w-flow outputs inputs method-name & options]
82-
(apply step w-flow outputs inputs nil :method method-name options))
87+
(step w-flow outputs inputs nil (-> (utils/varargs->map options)
88+
(assoc :method method-name))))
8389

8490
(def cmd-step
8591
"Shortcut for adding a command step to the workflow, w-flow. See
@@ -90,27 +96,31 @@
9096
"Shortcut for adding a template to the workflow, w-flow. See step
9197
for outputs, inputs, cmds and options."
9298
[w-flow outputs inputs cmds & options]
93-
(apply step w-flow outputs inputs cmds :template true options))
99+
(step w-flow outputs inputs cmds (-> (utils/varargs->map options)
100+
(assoc :template true))))
94101

95102
(defn template-step
96103
"Shortcut for adding a step that uses a template to the workflow,
97104
w-flow. See step for outputs, inputs, cmds and options"
98105
[w-flow outputs inputs & options]
99-
(apply step w-flow outputs inputs nil options))
106+
(step w-flow outputs inputs nil (utils/varargs->map options)))
100107

101108
(defn method
102109
"Add a method to the workflow, w-flow. method-name should be a
103110
string and cmds should be a vector of command strings. Options are
104-
standard drake options as key value pairs, e.g. :my-option
105-
\"my-value\""
106-
[w-flow method-name cmds & {:as options}]
111+
standard drake options as vararg key value pairs, e.g. :my-option
112+
\"my-value\", or (preferred) a single map. Variables for the method can be
113+
given in the :vars key and will be substituted."
114+
[w-flow method-name cmds & options]
107115
(when ((:methods w-flow) method-name)
108116
(println (format "Warning: method redefinition ('%s')" method-name)))
109-
(assoc-in w-flow [:methods method-name] {:opts (if (nil? options)
110-
{}
111-
options)
112-
:vars (:vars w-flow)
113-
:cmds (mapv utils/var-place cmds)}))
117+
(let [options (utils/varargs->map options)
118+
w-flow-vars (:vars w-flow)
119+
step-vars (utils/var-sub-map w-flow-vars (:vars options))
120+
vars (merge w-flow-vars step-vars)]
121+
(assoc-in w-flow [:methods method-name] {:opts (dissoc options :vars)
122+
:vars vars
123+
:cmds (mapv utils/var-place cmds)})))
114124

115125
(defn add-methods
116126
"Adds all the methods in methods-hash to workflow, w-flow.
@@ -129,8 +139,8 @@
129139
should be strings"
130140
[w-flow var-name var-value]
131141
(let [vars (:vars w-flow)
132-
sub-var-name (utils/var-sub->str vars var-name)
133-
sub-var-value (utils/var-sub->str vars var-value)]
142+
sub-var-name (utils/var-sub vars var-name)
143+
sub-var-value (utils/var-sub vars var-value)]
134144
(assoc-in w-flow [:vars sub-var-name] sub-var-value)))
135145

136146
(defn base
@@ -140,15 +150,16 @@
140150

141151
(defn run-workflow
142152
"Run the workflow in w-flow. Optionally specify targetv as a
143-
key value pair, e.g. :targetv [\"=...\"]\", otherwise the default
144-
targetv is [\"=...\"]\". Other run options to run-workflow can also
145-
be specified as key value pairs. Set :repl-feedback to :quiet,
146-
:default or :verbose to adjust the repl feedback level."
153+
key value pair, e.g. :targetv [\"outA.txt\" \"outB.txt\" \"outC.txt\"],
154+
otherwise the default targetv is core/DEFAULT-TARGETV.
155+
Other run options to run-workflow can also be specified as key value pairs.
156+
Set :repl-feedback to :quiet, :default or :verbose to adjust the repl feedback
157+
level."
147158
[w-flow & {:keys [targetv repl-feedback]
148-
:or {targetv ["=..."]
159+
:or {targetv d-core/DEFAULT-TARGETV
149160
repl-feedback :default}
150161
:as run-options}]
151-
(let [opts (merge d-core/DEFAULT-OPTIONS
162+
(let [opts (merge d-opts/DEFAULT-OPTIONS
152163
{:auto true}
153164
run-options)
154165
opts-with-eb (if (not= repl-feedback :quiet)
@@ -162,7 +173,7 @@
162173
(info "Clojure version:" *clojure-version*)
163174
(info "Options:" opts-with-eb)
164175

165-
(plug/load-plugin-deps (*options* :plugins))
176+
(plug/load-plugin-deps (:plugins *options*))
166177
(fs/with-cwd fs/*cwd*
167178
(-> w-flow
168179
(utils/compile-parse-tree)
