Releases · explosion/spaCy

🔴 Bug fixes

Fix issue #544: Add vocab.resize_vectors() method, to support changing to vectors of different dimensionality.

Fix issue #536: Default probability was incorrect for OOV words.

Fix issue #539: Unspecified encoding when opening some JSON files.

Fix issue #541: GloVe vectors were being loaded incorrectly.

Fix issue #522: Similarities and vector norms were calculated incorrectly.

Fix issue #461: ent_iob attribute was incorrect after setting entities via doc.ents.

Fix issue #459: Deserializer failed on an empty doc.

Fix issue #514: Serialization failed after adding a new entity label.
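
For readers unsure what the similarity and vector norm fix (#522) refers to, the quantities involved can be sketched in plain Python. This is an illustration of the standard definitions only, not spaCy's implementation. The function names are hypothetical, and returning 0.0 for a zero vector is an assumption made here for words without a vector.

```python
import math


def vector_norm(vec):
    """Return the L2 (Euclidean) norm of a sequence of floats."""
    return math.sqrt(sum(x * x for x in vec))


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors.

    Assumption: a zero-norm vector yields 0.0 rather than raising,
    which avoids a division by zero.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    norm_a = vector_norm(a)
    norm_b = vector_norm(b)
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (norm_a * norm_b)
```

The dimensionality check also shows why vectors of a different width cannot be mixed with existing ones without resizing first.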

✨ Major features and improvements

NEW: Custom processing pipelines, to support deep learning workflows
NEW: Rule matcher now supports entity IDs and attributes
NEW: Official/documented training APIs and GoldParse class
Download and use GloVe vectors by default
Make it easier to load and unload word vectors
Improved rule matching functionality
Move basic data into the code, rather than the JSON files. This makes it simpler to use the tokenizer without the models installed, and makes adding new languages much easier.
Replace file-system strings with Path objects. You can now load resources over your network, or do similar trickery, by passing any object that supports the Path protocol.
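
The idea behind accepting Path objects can be sketched as follows. This is a minimal illustration, not spaCy's loading code. `InMemoryPath` and `read_text_resource` are hypothetical names, and the assumption that a path-like object only needs `/`, `exists()` and `open()` is made here for the example; the notes do not say which methods spaCy actually calls.

```python
import io
from pathlib import Path


class InMemoryPath:
    """Hypothetical path-like object backed by a dict instead of a disk."""

    def __init__(self, files, name=""):
        self._files = files  # maps "dir/file.json" -> file contents (str)
        self._name = name

    def __truediv__(self, child):
        # Mirror pathlib's `parent / "child"` joining syntax.
        joined = f"{self._name}/{child}" if self._name else str(child)
        return InMemoryPath(self._files, joined)

    def exists(self):
        return self._name in self._files

    def open(self, mode="r", encoding="utf8"):
        # mode and encoding are accepted only for signature compatibility.
        return io.StringIO(self._files[self._name])


def read_text_resource(root, name):
    """Read `name` under `root`; return None if it does not exist."""
    if isinstance(root, str):
        root = Path(root)  # plain strings are still accepted
    target = root / name
    if not target.exists():
        return None
    with target.open("r", encoding="utf8") as file_:
        return file_.read()
```

Because the reader only relies on those three operations, the same function works for a `pathlib.Path`, a plain string, or an object that fetches its contents from somewhere else.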

⚠️ Backwards incompatibilities

The data_dir keyword argument of Language.__init__ (and its subclasses English.__init__ and German.__init__) has been renamed to path.
Details of how the Language base class and its subclasses are loaded, and how defaults are accessed, have changed substantially. If you have your own subclasses, you should review the changes.
The deprecated token.repvec name has been removed.
The .train() method of Tagger and Parser has been renamed to .update().
The previously undocumented GoldParse class has a new __init__() method. The old method has been preserved in GoldParse.from_annot_tuples().
Previously undocumented details of the Parser class have changed.
The previously undocumented get_package and get_package_by_name helper functions have been moved into a new module, spacy.deprecated, in case you still need them while you update.
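
If your own code wraps the loading call and still has callers passing data_dir, one way to bridge the rename is a small shim in that wrapper. The sketch below is an assumption about how such a wrapper might look. `resolve_path_argument` is a hypothetical helper, not part of spaCy, and it does not call spaCy at all.

```python
import warnings


def resolve_path_argument(path=None, **kwargs):
    """Return the resource path, still accepting the old `data_dir` keyword."""
    if "data_dir" in kwargs:
        if path is not None:
            raise TypeError("pass either `path` or `data_dir`, not both")
        warnings.warn(
            "`data_dir` has been renamed to `path`",
            DeprecationWarning,
            stacklevel=2,
        )
        path = kwargs.pop("data_dir")
    if kwargs:
        # Reject anything else so typos are not silently ignored.
        unexpected = ", ".join(sorted(kwargs))
        raise TypeError(f"unexpected keyword arguments: {unexpected}")
    return path
```

The returned value can then be passed on under the new `path` keyword, so only the wrapper needs to know about the old name.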

🔴 Bug fixes

Fix get_lang_class bug when GloVe vectors are used.
Fix issue #411: doc.sents raised IndexError on empty string.
Fix issue #455: Correct lemmatization logic.
Fix issue #371: Make Lexeme objects hashable.
Fix issue #469: Make noun_chunks detect root NPs.
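
As background for the hashability fix (#371), this is what making an object hashable generally involves in Python. The class below is a hypothetical stand-in, not spaCy's Lexeme, and keying equality on a single integer ID is an assumption made for the example.

```python
class LexemeLike:
    """Hypothetical stand-in for a vocabulary entry, not spaCy's Lexeme."""

    def __init__(self, orth_id, text):
        self.orth_id = orth_id  # assumed unique integer ID of the word form
        self.text = text

    def __eq__(self, other):
        if not isinstance(other, LexemeLike):
            return NotImplemented
        return self.orth_id == other.orth_id

    def __hash__(self):
        # Equal objects must hash equally, so hash the key that __eq__ uses.
        return hash(self.orth_id)
```

With both methods defined, instances can be placed in sets and used as dictionary keys, and two objects for the same word form collapse to a single entry.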

👥 Contributors

Thanks to @daylen, @RahulKulhari, @stared, @adamhadani, @izeye and @crawfordcomeaux for the pull requests!

v1.1.0: Bug fixes and adjustments

v1.0.0: Support for deep learning workflows and entity-aware rule matcher
