emschorsch/nlp-midterm-project
README file for the CS65 midterm project
Steve Dini and Emanuel Schorsch
==============================================
Contents of this directory:
i. trie.py
-Class implementation of the trie data structure. Supported external
methods include insert, build, and successor and predecessor counts.
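To illustrate how a trie can answer successor and predecessor count queries, here is a minimal sketch. It is not the code in trie.py: the names TrieNode, successor_count and predecessor_count are hypothetical, and it assumes that a successor count is the number of distinct letters following a prefix (no end-of-word marker is counted) and that predecessor counts come from a second trie built over reversed words.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # letter -> TrieNode
        self.is_word = False  # True if a word ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def build(self, words):
        for word in words:
            self.insert(word)

    def successor_count(self, prefix):
        """Number of distinct letters that follow `prefix` in the trie."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return 0  # prefix never occurs
        return len(node.children)


def predecessor_count(reversed_trie, suffix):
    """Distinct letters preceding `suffix`, using a trie of reversed words."""
    return reversed_trie.successor_count(suffix[::-1])


words = ["read", "reads", "reading", "reader", "real", "red"]
forward = Trie()
forward.build(words)
backward = Trie()
backward.build(w[::-1] for w in words)

print(forward.successor_count("read"))   # letters after "read": s, i, e
print(predecessor_count(backward, "d"))  # letters before a final "d": a, e
```

Storing the reversed words in a second trie keeps both queries a single walk down from the root, at the cost of roughly doubling memory.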
ii. counts.py, written by Steve Dini
-Basic implementation of word segmentation based only on successor and
predecessor counts, as explained in the Harris paper.
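As a rough illustration of segmentation driven by successor counts, the sketch below cuts a word wherever the successor count is a strict local peak. This is one common reading of Harris's idea, not necessarily what counts.py does: the function names, the toy corpus, and the choice to consider only interior positions are all assumptions, and predecessor counts are left out for brevity.

```python
def successor_counts(word, corpus):
    """For each prefix word[:i] (i = 1 .. len(word)-1), count the distinct
    letters that follow that prefix anywhere in the corpus."""
    counts = []
    for i in range(1, len(word)):
        prefix = word[:i]
        following = {w[i] for w in corpus if len(w) > i and w.startswith(prefix)}
        counts.append(len(following))
    return counts


def peak_cuts(word, corpus):
    """Cut positions where the successor count is strictly greater than at
    both neighbouring positions (interior positions only)."""
    sc = successor_counts(word, corpus)
    cuts = []
    for j in range(1, len(sc) - 1):
        if sc[j] > sc[j - 1] and sc[j] > sc[j + 1]:
            cuts.append(j + 1)  # sc[j] belongs to the prefix of length j + 1
    return cuts


def segment(word, corpus):
    """Split `word` at the positions returned by peak_cuts."""
    pieces, start = [], 0
    for cut in peak_cuts(word, corpus):
        pieces.append(word[start:cut])
        start = cut
    pieces.append(word[start:])
    return pieces


corpus = ["read", "reads", "reading", "reader", "real", "red",
          "walking", "walked", "walks"]
print(segment("reading", corpus))
print(segment("walked", corpus))
```

On this toy corpus the successor count jumps after "read" and after "walk", because several different letters can follow those stems, so that is where the cuts land.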
iii. varieties.py, written by Emanuel Schorsch
-Contains the other implementations, based on the Hafer paper. Implemented
methods include:
a) Reverse cutoff (k=14)
b) Reverse cutoff (k=22)
c) Duo cutoff (k1=2, k2=4)
d) Sum cutoff (k=22)
e) Duo Peaks
f) Sum Peaks
g) Negative Frequency
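The cutoff variants listed above can be sketched as threshold tests on the successor and predecessor counts at each candidate cut. The definitions below are an assumed reading, not taken from varieties.py: "sum cutoff" is taken to mean successor + predecessor >= k, and "duo cutoff" to mean successor >= k1 and predecessor >= k2. All function names and the toy corpus are hypothetical, and the thresholds are scaled down because values such as k=22 only make sense on a large word list.

```python
def successor_count(prefix, corpus):
    """Distinct letters that follow `prefix` in the corpus."""
    n = len(prefix)
    return len({w[n] for w in corpus if len(w) > n and w.startswith(prefix)})


def predecessor_count(suffix, corpus):
    """Distinct letters that precede `suffix` in the corpus."""
    n = len(suffix)
    return len({w[-n - 1] for w in corpus if len(w) > n and w.endswith(suffix)})


def boundary_counts(word, corpus):
    """Successor and predecessor counts for each cut position 1 .. len(word)-1."""
    positions = range(1, len(word))
    succ = [successor_count(word[:i], corpus) for i in positions]
    pred = [predecessor_count(word[i:], corpus) for i in positions]
    return succ, pred


def sum_cutoff(word, corpus, k):
    """Assumed definition: cut where successor + predecessor count >= k."""
    succ, pred = boundary_counts(word, corpus)
    return [i + 1 for i, (s, p) in enumerate(zip(succ, pred)) if s + p >= k]


def duo_cutoff(word, corpus, k1, k2):
    """Assumed definition: cut where successor >= k1 and predecessor >= k2."""
    succ, pred = boundary_counts(word, corpus)
    return [i + 1 for i, (s, p) in enumerate(zip(succ, pred))
            if s >= k1 and p >= k2]


corpus = ["read", "reads", "reading", "reader", "real", "red",
          "walking", "walked", "walks", "talking"]
print(sum_cutoff("reading", corpus, k=5))
print(duo_cutoff("reading", corpus, k1=3, k2=2))
```

Both tests pick position 4 ("read" + "ing") on this toy corpus; with the larger thresholds listed above, no position would qualify on so small a word list.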
iv. dejean.py, written by Steve Dini
-Contains an implementation of Dejean's algorithm, omitting the contextual
segmentation described as its last step.
v. stats.py
-Helper module for computing the number of cuts made, the number of expected
correct cuts, and the number of correct cuts actually made.
-Also supports computing precision and recall.
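For reference, precision and recall over cut positions can be computed as in the sketch below. The function name and the representation of cuts as integer positions within a word are assumptions for illustration; stats.py may organise this differently.

```python
def precision_recall(predicted_cuts, gold_cuts):
    """Precision = correct cuts / cuts made;
    recall = correct cuts / expected correct cuts."""
    predicted, gold = set(predicted_cuts), set(gold_cuts)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Two cuts were made, one of which matches the single expected cut.
print(precision_recall([4, 6], [4]))
```

Returning 0.0 when no cuts were made (or none were expected) avoids a division by zero; whether that convention suits a given evaluation is a judgment call.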