FeiRan-

Quantitative Chinese text recognition and visualization

This project measures the literary style of a piece of Chinese text and visualizes some of the resulting data. Building on earlier research on literary style, we constructed a literary style feature system for Chinese text. Working across four linguistic levels (words, phrases, sentences, and discourse), the system groups the expression of literary style into six dimensions: color beauty, sound beauty, decoration beauty, emotional beauty, image beauty, and philosophical beauty. It defines 185 concrete measurement indicators, which are used to compute the literary style level of the text (1, 2, or 3). A subset of the indicators is also visualized.
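To make the idea of a measurement indicator concrete, here is a minimal sketch of what a single word-level indicator for the color dimension might look like. The lexicon `COLOR_WORDS`, the function `color_word_density`, and the per-100-characters scaling are all hypothetical; the project's actual lexicons and indicator definitions are not shown in this README.

```python
# Hypothetical mini-lexicon of color words, for illustration only.
COLOR_WORDS = ["红", "绿", "蓝", "金黄", "雪白"]


def color_word_density(text: str) -> float:
    """Return the number of color-word occurrences per 100 characters.

    This is an assumed indicator, not one of the project's 185.
    Whitespace is ignored when counting characters; punctuation is kept.
    """
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    # str.count counts non-overlapping occurrences of each lexicon entry.
    hits = sum(text.count(word) for word in COLOR_WORDS)
    return 100.0 * hits / len(chars)


sample = "金黄的麦浪在蓝天下翻滚，雪白的云缓缓飘过。"
density = color_word_density(sample)
print(f"color-word density: {density:.2f} per 100 characters")
```

A full feature system would compute many such values per text (one per indicator) and pass the resulting vector to a classifier that predicts the style level.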

Because texts in different language styles express themselves in different ways, a unified standard requires restricting the analysis to one style. We therefore analyze only the literary and artistic language style, and mainly use well-known prose as the representative corpus. Any text that stands out in sound, decoration, color, emotion, image, or philosophy is treated as a literary fragment.

Based on this feature system and its measurement indicators, we use machine learning to test how effective the various language features are. The model is built with a support vector machine (SVM) classifier and evaluated by accuracy, recall, and precision. We first measured the classification performance of the language features in each dimension separately, then used random forest screening to select features and form the final, best-performing model.
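The workflow described above can be sketched with scikit-learn. This is an illustration under stated assumptions, not the project's actual code: the data is synthetic (generated by `make_classification`), and the RBF kernel, the macro averaging, the `median` importance threshold, and the helper `evaluate` are all choices made for this example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the real feature matrix: 300 texts x 185 indicators.
# Labels 0, 1, 2 stand in for style levels 1, 2, 3.
X, y = make_classification(
    n_samples=300, n_features=185, n_informative=20,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def evaluate(model):
    """Fit on the training split and return the three metrics on the test split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, average="macro", zero_division=0),
        "recall": recall_score(y_test, pred, average="macro", zero_division=0),
    }


# Baseline: SVM on all 185 features.
baseline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Screened: keep features whose random forest importance is at least the median,
# then train the SVM on the reduced feature set.
screened = make_pipeline(
    StandardScaler(),
    SelectFromModel(
        RandomForestClassifier(n_estimators=200, random_state=0),
        threshold="median",
    ),
    SVC(kernel="rbf"),
)

baseline_scores = evaluate(baseline)
screened_scores = evaluate(screened)
n_kept = int(screened.named_steps["selectfrommodel"].get_support().sum())

print("all features:", baseline_scores)
print(f"screened ({n_kept} features):", screened_scores)
```

To evaluate one dimension at a time, as the text describes, the same `evaluate` helper could be run on the column subset belonging to that dimension. Placing the selector inside the pipeline keeps the screening fitted on training data only, which avoids leaking test information into the feature choice.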

Repository contents:

__pycache__
corpus
output
resource
submit
ProcessFeat.py
README.md
featanalyzeall.py
feature.py
hs_err_pid24832.log
parser_ly.py
readme.txt
testmodel.py
utils.py
wencai_model.m

QIURUIMIN/FeiRan-

Folders and files

Latest commit

History

Repository files navigation

FeiRan-

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages