Releases: explosion/spaCy
v1.1.0: Bug fixes and adjustments
✨ Major features and improvements
- Rename new `pipeline` keyword argument of `spacy.load()` to `create_pipeline`.
- Rename new `vectors` keyword argument of `spacy.load()` to `add_vectors`.
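
Code that adopted the earlier keyword names can be updated mechanically. The following is a minimal sketch and not part of spaCy: `RENAMED_KWARGS` and `rename_load_kwargs` are hypothetical names, and the helper only rewrites a plain dictionary of keyword arguments before it is handed to `spacy.load()`.

```python
# Hypothetical migration helper; not part of spaCy.
# Maps the old keyword names to the names used from v1.1.0 onwards.
RENAMED_KWARGS = {
    "pipeline": "create_pipeline",
    "vectors": "add_vectors",
}


def rename_load_kwargs(kwargs):
    """Return a copy of kwargs with the old keyword names replaced."""
    updated = {}
    for name, value in kwargs.items():
        new_name = RENAMED_KWARGS.get(name, name)
        if new_name in updated:
            # Both the old and the new spelling were supplied.
            raise TypeError("duplicate keyword argument: %s" % new_name)
        updated[new_name] = value
    return updated
```

The returned dictionary would then be unpacked into the call, for example `spacy.load(name, **rename_load_kwargs(old_kwargs))`, where `name` and `old_kwargs` stand for whatever the calling code already uses.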
🔴 Bug fixes
- Fix issue #544: Add `vocab.resize_vectors()` method, to support changing to vectors of different dimensionality.
- Fix issue #536: Default probability was incorrect for OOV words.
- Fix issue #539: Unspecified encoding when opening some JSON files.
- Fix issue #541: GloVe vectors were being loaded incorrectly.
- Fix issue #522: Similarities and vector norms were calculated incorrectly.
- Fix issue #461: `ent_iob` attribute was incorrect after setting entities via `doc.ents`.
- Fix issue #459: Deserialiser failed on empty doc.
- Fix issue #514: Serialization failed after adding a new entity label.
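
Issue #522 concerns similarities and vector norms. As an independent reference for checking values by hand, here is a minimal pure-Python sketch of the L2 norm and cosine similarity. It assumes cosine similarity is the measure in question and uses the textbook definition; it is not spaCy's implementation. The function names are illustrative, and returning `0.0` when either vector has zero length is a choice made for this sketch.

```python
import math


def l2_norm(vector):
    """Euclidean (L2) norm of a sequence of numbers."""
    return math.sqrt(sum(x * x for x in vector))


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    norm_a = l2_norm(a)
    norm_b = l2_norm(b)
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption for this sketch: a zero vector is similar to nothing.
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (norm_a * norm_b)
```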
v1.0.0: Support for deep learning workflows and entity-aware rule matcher
✨ Major features and improvements
- NEW: custom processing pipelines, to support deep learning workflows
- NEW: Rule matcher now supports entity IDs and attributes
- NEW: Official/documented training APIs and `GoldParse` class
- Download and use GloVe vectors by default
- Make it easier to load and unload word vectors
- Improved rule matching functionality
- Move basic data into the code, rather than the JSON files. This makes it simpler to use the tokenizer without the models installed, and makes adding new languages much easier.
- Replace file-system strings with `Path` objects. You can now load resources over your network, or do similar trickery, by passing any object that supports the `Path` protocol.
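
The idea of passing "any object that supports the `Path` protocol" can be illustrated with duck typing. These notes do not say which methods spaCy calls on the object, so the sketch below assumes only the `/` operator and `.open()`. `read_resource` and `InMemoryPath` are hypothetical names, and the in-memory class merely stands in for a network-backed store.

```python
import io
import pathlib


def read_resource(root, name):
    """Read a text resource using only `root / name` and `.open()`."""
    with (root / name).open("r", encoding="utf8") as file_:
        return file_.read()


class InMemoryPath(object):
    """Hypothetical stand-in for a remote store; supports `/` and `.open()`."""

    def __init__(self, files, key=""):
        self.files = files  # maps "dir/name" keys to text content
        self.key = key

    def __truediv__(self, name):
        # Mirror pathlib's `/` operator by extending the key.
        key = name if not self.key else self.key + "/" + name
        return InMemoryPath(self.files, key)

    def open(self, mode="r", encoding=None):
        # Content is already text, so mode and encoding are ignored here.
        return io.StringIO(self.files[self.key])
```

Because `read_resource` never inspects the type of `root`, both `read_resource(pathlib.Path("some_dir"), "words.txt")` and `read_resource(InMemoryPath(files) / "some_dir", "words.txt")` take the same code path.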
⚠️ Backwards incompatibilities
- The `data_dir` keyword argument of `Language.__init__` (and its subclasses `English.__init__` and `German.__init__`) has been renamed to `path`.
- Details of how the `Language` base class and its subclasses are loaded, and how defaults are accessed, have been heavily changed. If you have your own subclasses, you should review the changes.
- The deprecated `token.repvec` name has been removed.
- The `.train()` method of `Tagger` and `Parser` has been renamed to `.update()`.
- The previously undocumented `GoldParse` class has a new `__init__()` method. The old method has been preserved in `GoldParse.from_annot_tuples()`.
- Previously undocumented details of the `Parser` class have changed.
- The previously undocumented `get_package` and `get_package_by_name` helper functions have been moved into a new module, `spacy.deprecated`, in case you still need them while you update.
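
Code that has to run against both pre-1.0 and 1.0 installs during a migration could bridge the `.train()` to `.update()` rename with a small wrapper. This is a sketch under stated assumptions: `update_model` is a hypothetical helper, not a spaCy function, and since these notes do not describe the arguments that `.update()` takes, they are passed through untouched.

```python
import warnings


def update_model(model, *args, **kwargs):
    """Call model.update(), falling back to the pre-1.0 model.train()."""
    method = getattr(model, "update", None)
    if method is None:
        # Older object: only the pre-1.0 method name is available.
        warnings.warn(
            "model.train() was renamed to model.update() in v1.0.0",
            DeprecationWarning,
        )
        method = model.train
    return method(*args, **kwargs)
```

Once every environment is on v1.0.0 or later, the wrapper can be dropped and `.update()` called directly.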
🔴 Bug fixes
- Fix `get_lang_class` bug when GloVe vectors are used.
- Fix Issue #411: `doc.sents` raised IndexError on empty string.
- Fix Issue #455: Correct lemmatization logic
- Fix Issue #371: Make `Lexeme` objects hashable
- Fix Issue #469: Make `noun_chunks` detect root NPs
👥 Contributors
Thanks to @daylen, @RahulKulhari, @stared, @adamhadani, @izeye and @crawfordcomeaux for the pull requests!