Commit 52af6c8

update files to get annotated data
1 parent a131b6c commit 52af6c8

5 files changed: 9 additions, 9 deletions

README.md

Lines changed: 0 additions & 3 deletions
@@ -7,11 +7,8 @@ solipCysme
 | --- | --- |
 | __Language__ | french |
 | __Name__ | `fr_solipcysme` |
-| __Version__ | `0.2.5` |
-| __spaCy__ | `==3.8.4` |
 | __Default Pipeline__ | `jusqucy_tokenizer`,`commecy_normalizer`, `jusqucy_normalizer`, `pretagger_hunspell`,`morphologizer`, `viceverser_lemmatizer`, `parser` |
 | __Components__ | [jusqucy_tokenizer](https://github.com/thjbdvlt/jusquci), [jusqucy_normalizer](https://github.com/thjbdvlt/jusquci), [commecy_normalizer](https://github.com/thjbdvlt/commecy), `morphologizer`, [viceverser_lemmatizer](https://github.com/thjbdvlt/spacy-viceverser), `parser` |
-| __Vectors__ | 669785 keys, 6697856 unique vectors (100 dimensions) |
 | __Sources__ | Corpus [narraFEATS](https://github.com/thjbdvlt/corpus-narraFEATS) (morphologizer), [Universal Dependencies](https://universaldependencies.org/fr/) (parser), [french-word-vectors](https://github.com/thjbdvlt/french-word-vectors) (vectors)|
 | __License__ | [GPL](https://www.gnu.org/licenses/gpl-3.0.html) |
 | __Author__ | [thjbdvlt](https://github.com/thjbdvlt) |
Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-narrafeats
+narrafeats*

make_pipeline/morphologizer/get_data.sh

Lines changed: 2 additions & 2 deletions
@@ -28,8 +28,8 @@ unzip() {
 untar
 }
 download() {
-local version=v0.1.1
-wget https://github.com/thjbdvlt/corpus-narraFEATS/releases/download/$version/narrafeats.tar.gz
+# wget https://github.com/thjbdvlt/corpus-narraFEATS/releases/download/$version/narrafeats.tar.gz
+wget https://github.com/thjbdvlt/corpus-narraFEATS/releases/latest/download/narrafeats.tar.gz
 unzip
 }
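
The hunk above replaces a pinned release tag with GitHub's `releases/latest/download/` redirect. As a rough sketch of how both forms could coexist, the helper below builds either URL from an optional argument. The function name `release_url` and its argument are hypothetical and are not part of `get_data.sh`.

```shell
# Hypothetical helper, not part of get_data.sh: print the download URL
# for narrafeats.tar.gz, either for a pinned tag or for the latest release.
release_url() {
    # First argument is an optional release tag (assumed format: v0.1.1).
    local version=${1:-latest}
    local base=https://github.com/thjbdvlt/corpus-narraFEATS/releases
    if [ "$version" = latest ]; then
        # GitHub redirects this path to the asset of the most recent release.
        printf '%s/latest/download/narrafeats.tar.gz\n' "$base"
    else
        printf '%s/download/%s/narrafeats.tar.gz\n' "$base" "$version"
    fi
}

# Possible usage (left commented out so the sketch does no network access):
# wget "$(release_url)"          # latest release
# wget "$(release_url v0.1.1)"   # pinned release
```

Pinning a tag keeps training data reproducible, while the latest redirect always fetches the newest annotations; which default is preferable depends on how the corpus is versioned.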

make_pipeline/morphologizer/train.sh

Lines changed: 5 additions & 2 deletions
@@ -38,12 +38,14 @@ case "$size" in
 sm | md)
 width=128
 depth=3
-rows=[2000,500,1000,2000]
+features='["NORM","SENT_START"]'
+rows=[2000,500,500,1000,2000]
 static=false;;
 lg)
 width=256
 depth=4
-rows=[4000,1000,2000,4000]
+features='["NORM","SENT_START"]'
+rows=[2000,1000,1000,2000,4000]
 static=true;;
 *)
 echo "Unknown value for -s: $size" >&2
@@ -89,6 +91,7 @@ opts+=(
 --paths.init_tok2vec=${pretrain_model}
 --${encode}.width=${width}
 --${encode}.depth=${depth}
+--${embed}.features=${features}
 --${embed}.rows=${rows}
 --${embed}.include_static_vectors=${static}
 --paths.dev=${dev}
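
The new `features` override is a JSON-style list passed through a bash array. The sketch below shows one way such overrides expand into command-line arguments. It assumes bash (for arrays), and the value given to `embed` is a hypothetical config path, since the real one is set elsewhere in `train.sh` and is not visible in this diff. Each element is quoted here so that the shell cannot treat the bracketed lists as glob patterns; the original script leaves them unquoted.

```shell
# Hypothetical config path; the real value of $embed is defined elsewhere
# in train.sh and is not shown in this diff.
embed=components.morphologizer.model.tok2vec.embed

# Values copied from the sm | md branch of the hunk above.
features='["NORM","SENT_START"]'
rows='[2000,500,500,1000,2000]'

# Quoting each element keeps [...] from being interpreted as a glob pattern
# (relevant if nullglob or failglob is ever enabled in the calling shell).
opts=()
opts+=(
    "--${embed}.features=${features}"
    "--${embed}.rows=${rows}"
)

# Print one override per line to inspect what the training command would receive.
printf '%s\n' "${opts[@]}"
```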

make_pipeline/package.sh

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ optional:
 size=
 
 # Default values
-name=solipCysme
+name=solipcysme
 output=pipeline
 meta=meta.json
 raw=data/raw.txt
