Inflection 85 - Add Malayalam Inflection and Pronoun Tests #122

Closed
@@ -1,8 +1,10 @@
dictionary_da.lst filter=lfs diff=lfs merge=lfs -text
dictionary_en.lst filter=lfs diff=lfs merge=lfs -text
dictionary_es.lst filter=lfs diff=lfs merge=lfs -text
dictionary_ml.lst filter=lfs diff=lfs merge=lfs -text
inflectional_da.xml filter=lfs diff=lfs merge=lfs -text
inflectional_en.xml filter=lfs diff=lfs merge=lfs -text
inflectional_es.xml filter=lfs diff=lfs merge=lfs -text
inflectional_ml.xml filter=lfs diff=lfs merge=lfs -text
inflectional_sv.xml filter=lfs diff=lfs merge=lfs -text
dictionary_sv.lst filter=lfs diff=lfs merge=lfs -text
748,739 changes: 748,739 additions & 0 deletions inflection/resources/org/unicode/inflection/dictionary/dictionary_ml.lst

Large diffs are not rendered by default.

7,714 changes: 7,714 additions & 0 deletions inflection/resources/org/unicode/inflection/dictionary/inflectional_ml.xml

Large diffs are not rendered by default.

83 changes: 83 additions & 0 deletions inflection/resources/org/unicode/inflection/features/grammar.xml
@@ -1624,6 +1624,89 @@
</category>
</grammar>
</language>
<language id="ml">
<grammar>
<category name="case">
<grammeme name="nominative"/> <!-- no explicit marker; subject form -->
<grammeme name="accusative"/> <!-- -എ (e.g. -യെ, -നെ), marks direct object -->
<grammeme name="genitive"/> <!-- -ന്റെ, -യുടെ (possessive) -->
<grammeme name="dative"/> <!-- -ക്ക്, -ന് (to/for) -->
<grammeme name="instrumental"/> <!-- -ആൽ (by means of) -->
<grammeme name="locative"/> <!-- -യിൽ (in/at) -->
<grammeme name="ablative"/> <!-- -യിൽ നിന്ന് (from) -->
<grammeme name="vocative"/> <!-- used in direct address -->
</category>
<category name="number">
<grammeme name="singular"/>
<grammeme name="plural"/>
</category>
<category name="person">
<restrictions>
<restriction name="pos" value="pronoun"/>
<restriction name="pos" value="verb"/>
</restrictions>
<grammeme name="first"/>
<grammeme name="second"/>
<grammeme name="third"/>
</category>
<category name="gender">
<restrictions>
<restriction name="pos" value="pronoun"/>
<restriction name="pos" value="verb"/>
</restrictions>
<grammeme name="masculine"/>
<grammeme name="feminine"/>
<grammeme name="neuter"/> <!-- e.g. for objects or animals -->
</category>
<category name="tense">
<restrictions>
<restriction name="pos" value="verb"/>
</restrictions>
<grammeme name="past"/>
<grammeme name="present"/>
<grammeme name="future"/>
</category>
<category name="mood">
<restrictions>
<restriction name="pos" value="verb"/>
</restrictions>
<grammeme name="indicative"/>
<grammeme name="imperative"/>
<grammeme name="subjunctive"/>
</category>
<category name="voice">
<restrictions>
<restriction name="pos" value="verb"/>
</restrictions>
<grammeme name="active"/>
<grammeme name="passive"/>
</category>
<category name="formality">
<restrictions>
<restriction name="pos" value="pronoun"/>
<restriction name="pos" value="verb"/>
</restrictions>
<grammeme name="intimate"/>
<grammeme name="casual"/>
<grammeme name="formal"/>
<grammeme name="honorific"/>
</category>
<category name="aspect">
<restrictions>
<restriction name="pos" value="verb"/>
</restrictions>
<grammeme name="perfective"/>
<grammeme name="imperfective"/>
</category>
<category name="negation">
<restrictions>
<restriction name="pos" value="verb"/>
</restrictions>
<grammeme name="affirmative"/>
<grammeme name="negative"/>
</category>
</grammar>
</language>
<language id="ms">
<grammar>
<category name="clusivity">
@@ -15,6 +15,7 @@ locale.group.it=it_IT,it_CH
locale.group.ja=ja_JP
locale.group.ko=ko_KR
locale.group.ms=ms_MY
locale.group.ml=ml_IN
locale.group.nb=nb_NO
locale.group.nl=nl_NL,nl_BE
locale.group.pt=pt_BR,pt_PT
@@ -0,0 +1,5 @@
#
# Copyright 2025 Unicode Incorporated and others. All rights reserved.
#
tokenizer.implementation.class=DefaultTokenizer

12 changes: 12 additions & 0 deletions inflection/src/inflection/util/LocaleUtils.cpp
@@ -407,6 +407,18 @@ const ULocale& LocaleUtils::MALAYSIA()
return *npc(MALAYSIA_);
}

const ULocale& LocaleUtils::MALAYALAM()
{
static auto MALAYALAM_ = new ULocale("ml");
return *npc(MALAYALAM_);
}

const ULocale& LocaleUtils::INDIA_MALAYALAM()
{
static auto INDIA_MALAYALAM_ = new ULocale("ml", "IN");
return *npc(INDIA_MALAYALAM_);
}

const ULocale& LocaleUtils::NORWEGIAN()
{
static auto NORWEGIAN_ = new ULocale("nb");
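
The two new accessors follow the same function-local static pattern as the neighbouring locales. The following is a minimal, self-contained sketch of that pattern, not the project's implementation: `Locale` is a hypothetical stand-in for `inflection::util::ULocale`, and `checkedNonNull` is a hypothetical stand-in for the project's `npc()` check, whose real behaviour is not shown in this diff.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for inflection::util::ULocale; only what the sketch needs.
class Locale {
public:
    explicit Locale(std::string language, std::string region = "")
        : language_(std::move(language)), region_(std::move(region)) {}

    // Joins language and region with an underscore, e.g. "ml_IN".
    std::string getName() const {
        return region_.empty() ? language_ : language_ + "_" + region_;
    }

private:
    std::string language_;
    std::string region_;
};

// Hypothetical stand-in for npc(): asserts that the pointer is not null.
template <typename T>
T* checkedNonNull(T* ptr) {
    assert(ptr != nullptr);
    return ptr;
}

const Locale& malayalam() {
    // Constructed once, on the first call; C++11 guarantees this is thread-safe.
    // The object is never deleted, so the reference stays valid until exit.
    static auto instance = new Locale("ml");
    return *checkedNonNull(instance);
}

const Locale& indiaMalayalam() {
    static auto instance = new Locale("ml", "IN");
    return *checkedNonNull(instance);
}
```

Every call returns a reference to the same object, so callers can compare locales by address or hold the reference for the lifetime of the process.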
8 changes: 8 additions & 0 deletions inflection/src/inflection/util/LocaleUtils.hpp
@@ -376,6 +376,14 @@ class INFLECTION_CLASS_API inflection::util::LocaleUtils final
* ms_MY: Malay (Malaysia)
*/
static const ::inflection::util::ULocale& MALAYSIA();
/**
* ml: Malayalam
*/
static const ::inflection::util::ULocale& MALAYALAM();
/**
* ml_IN: Malayalam (India)
*/
static const ::inflection::util::ULocale& INDIA_MALAYALAM();
/**
* nb: Norwegian Bokmål
*/
104 changes: 104 additions & 0 deletions inflection/test/resources/inflection/dialog/inflection/ml.xml
@@ -0,0 +1,104 @@
<?xml version='1.0' encoding='utf-8'?>
<!--
Copyright 2025 Unicode Incorporated and others. All rights reserved.
-->
<inflectionTest locale="ml">
<!-- Malayalam Pronoun Inflection Tests -->
<test><source person="first" number="singular">ഞാൻ</source><result>ഞാൻ</result></test>
<test><source person="first" number="plural">ഞങ്ങൾ</source><result>ഞങ്ങൾ</result></test>
<test><source person="second" number="singular" gender="masculine">നീ</source><result>നീ</result></test>
<test><source person="second" number="plural">നിങ്ങൾ</source><result>നിങ്ങൾ</result></test>
<test><source person="third" gender="feminine">അവൾ</source><result gender="feminine">അവൾ</result></test>
<test><source person="third" gender="masculine">അവൻ</source><result gender="masculine">അവൻ</result></test>
<test><source person="third" gender="neuter">അത്</source><result gender="neuter">അത്</result></test>

<!-- Malayalam Singular Plural Inflection Tests -->
<test><source number="plural">കുട്ടി</source><result>കുട്ടികൾ</result></test>
<test><source number="singular">പുസ്തകങ്ങൾ</source><result>പുസ്തകം</result></test>
<test><source number="plural">മരം</source><result>മരങ്ങൾ</result></test>
<test><source number="singular">കഥകൾ</source><result>കഥ</result></test>

<!-- Malayalam Gender Inflection Tests (for adjectives) -->
<test><source gender="feminine">നല്ല</source><result exists="true">നല്ല</result></test>
<test><source gender="masculine">നല്ല</source><result exists="true">നല്ല</result></test>

<!-- Malayalam Verb Inflection Tests -->
<test><source tense="present" person="first" number="singular">ചോദിക്കുക</source><result>ചോദിക്കുന്നു</result></test>
<test><source tense="past" person="first" number="singular">ചോദിക്കുക</source><result>ചോദിച്ചു</result></test>
<test><source tense="future" person="first" number="singular">ചോദിക്കുക</source><result>ചോദിക്കും</result></test>
<test><source tense="present" person="third" number="plural">വരിക</source><result>വരുന്നു</result></test>

<!-- Malayalam Case Inflection Tests -->
<test><source case="nominative">കുട്ടി</source><result>കുട്ടി</result></test>
<test><source case="accusative">കുട്ടി</source><result>കുട്ടിയെ</result></test>
<test><source case="dative">കുട്ടി</source><result>കുട്ടിക്ക്</result></test>
<test><source case="genitive">കുട്ടി</source><result>കുട്ടിയുടെ</result></test>
<test><source case="locative">കുട്ടി</source><result>കുട്ടിയിൽ</result></test>
<test><source case="instrumental">കുട്ടി</source><result>കുട്ടിയാൽ</result></test>

<!-- Malayalam Reinflection Tests (no change if not modified) -->
<test><source>അവൻ</source><result>അവൻ</result></test>
<test><source>അവൾ</source><result>അവൾ</result></test>

<!-- Malayalam Adjective Inflection Tests -->
<test><source gender="feminine">പുതിയ</source><result>പുതിയ</result></test>
<test><source gender="masculine">പുതിയ</source><result>പുതിയ</result></test>

<!-- Singular Plural Lookup Tests (no result because it’s a lookup) -->
<test><source>മരം</source><result number="singular"/></test>
<test><source>മരങ്ങൾ</source><result number="plural"/></test>

<!-- Gender Lookup Tests (no result because it’s a lookup) -->
<test><source>അവൾ</source><result gender="feminine"/></test>
<test><source>അവൻ</source><result gender="masculine"/></test>
<test><source>അത്</source><result gender="neuter"/></test>

<!-- Additional Made-up/Latin Words Reinflection Test -->
<test><source number="singular">സൂപ്പർ</source><result>സൂപ്പർ</result></test>
<test><source number="plural">ഫേസ്ബുക്ക്</source><result>ഫേസ്ബുക്കുകൾ</result></test>

<!-- Malayalam Multi-word Noun Phrase Number Detection -->
<test><source>വളപ്പുറത്തെ ലൈറ്റ്</source><result number="singular"/></test>
<test><source>വളപ്പുറത്തെ ലൈറ്റുകൾ</source><result number="plural"/></test>
<test><source>തോട്ടത്തിലെ ലൈറ്റുകൾ</source><result number="plural"/></test>

<!-- Malayalam Verb Number Inflection Contrast -->
<test><source tense="present" person="third" number="singular">വരിക</source><result>വരുന്നു</result></test>
<test><source tense="past" person="third" number="plural">വരിക</source><result>വന്നു</result></test>
<test><source tense="future" person="third" number="plural">ചോദിക്കുക</source><result>ചോദിക്കും</result></test>

<!-- Malayalam Verb Inflection with pos="verb" -->
<test><source number="singular" pos="verb">ചോദിക്കുന്നു</source><result>ചോദിക്കുന്നു</result></test>
<test><source number="plural" pos="verb">ചോദിക്കുന്നു</source><result>ചോദിക്കുന്നു</result></test>
<test><source number="singular" pos="verb">വരുന്നു</source><result>വരുന്നു</result></test>
<test><source number="plural" pos="verb">വരുന്നു</source><result>വരുന്നു</result></test>
<test><source number="singular" pos="verb">ആകുന്നു</source><result>ആകുന്നു</result></test>
<test><source number="plural" pos="verb">ആകുന്നു</source><result>ആകുന്നു</result></test>

<!-- Malayalam Noun Inflection for Units (as in "one kilometer to two kilometers") -->
<test><source number="singular">കിലോമീറ്റർ</source><result>കിലോമീറ്റർ</result></test>
<test><source number="plural">കിലോമീറ്റർ</source><result>കിലോമീറ്ററുകൾ</result></test>

<!-- Additional common plural forms -->
<test><source number="plural">കപ്പ്</source><result>കപ്പുകൾ</result></test>
<test><source number="plural">പൂച്ച</source><result>പൂച്ചകൾ</result></test>

<!-- Additional Malayalam Subject-Verb Agreement Examples -->
<test><source>അവൻ ഓടുന്നു</source><result number="singular" gender="masculine"/></test>
<test><source>അവൾ ഓടുന്നു</source><result number="singular" gender="feminine"/></test>

<!-- Additional Malayalam Pronoun Case Inflections -->
<test><source case="accusative">അവൻ</source><result>അവനെ</result></test>
<test><source case="accusative">അവൾ</source><result>അവളെ</result></test>
<test><source case="dative">അവൻ</source><result>അവന്</result></test>

<!-- More Multi-word Noun Phrase Number Lookup -->
<test><source>ക്യാമ്പസ് ലൈറ്റ്</source><result number="singular"/></test>
<test><source>ക്യാമ്പസ് ലൈറ്റുകൾ</source><result number="plural"/></test>

<!-- Additional Noun/Verb Inflection Tests -->
<test><source>ക്യാമ്പസ് ലൈറ്റ്</source><result number="singular"/></test>
<test><source>ക്യാമ്പസ് ലൈറ്റുകൾ</source><result number="plural"/></test>
<test><source>തോട്ടത്തിലെ ലൈറ്റുകൾ</source><result number="plural"/></test>

</inflectionTest>
59 changes: 59 additions & 0 deletions inflection/test/resources/inflection/dialog/pronoun/ml.xml
@@ -0,0 +1,59 @@
<?xml version='1.0' encoding='utf-8'?>
<!--
Copyright 2025 Unicode Incorporated and others. All rights reserved.
-->
<inflectionTest locale="ml">
<!-- Simple inflection -->
<test><source/><result>അവൻ</result></test> <!-- Default 3rd person singular masculine -->

<!-- First person singular -->
<test><source person="first" number="singular" case="nominative"/><result>ഞാൻ</result></test>
<test><source person="first" number="singular" case="accusative"/><result>എന്നെ</result></test>
<test><source person="first" number="singular" case="genitive"/><result>എന്റെ</result></test>

<!-- First person plural -->
<test><source person="first" number="plural" case="nominative" clusivity="inclusive"/><result>നാം</result></test>
<test><source person="first" number="plural" case="nominative" clusivity="exclusive"/><result>ഞങ്ങൾ</result></test>
<test><source person="first" number="plural" case="accusative" clusivity="inclusive"/><result>നമ്മെ</result></test>
<test><source person="first" number="plural" case="accusative" clusivity="exclusive"/><result>ഞങ്ങളെ</result></test>
<test><source person="first" number="plural" case="genitive" clusivity="inclusive"/><result>നമ്മുടെ</result></test>
<test><source person="first" number="plural" case="genitive" clusivity="exclusive"/><result>ഞങ്ങളുടെ</result></test>

<!-- Second person singular -->
<test><source person="second" number="singular" case="nominative" register="informal"/><result>നീ</result></test>
<test><source person="second" number="singular" case="nominative" register="formal"/><result>താങ്കൾ</result></test>
<test><source person="second" number="singular" case="accusative" register="informal"/><result>നിന്നെ</result></test>
<test><source person="second" number="singular" case="accusative" register="formal"/><result>താങ്കളെ</result></test>
<test><source person="second" number="singular" case="genitive" register="informal"/><result>നിന്റെ</result></test>
<test><source person="second" number="singular" case="genitive" register="formal"/><result>താങ്കളുടെ</result></test>

<!-- Second person plural -->
<test><source person="second" number="plural" case="nominative"/><result>നിങ്ങൾ</result></test>
<test><source person="second" number="plural" case="accusative"/><result>നിങ്ങളെ</result></test>
<test><source person="second" number="plural" case="genitive"/><result>നിങ്ങളുടെ</result></test>

<!-- Third person singular masculine -->
<test><source person="third" number="singular" case="nominative" gender="masculine"/><result>അവൻ</result></test>
<test><source person="third" number="singular" case="accusative" gender="masculine"/><result>അവനെ</result></test>
<test><source person="third" number="singular" case="genitive" gender="masculine"/><result>അവന്റെ</result></test>

<!-- Third person singular feminine -->
<test><source person="third" number="singular" case="nominative" gender="feminine"/><result>അവൾ</result></test>
<test><source person="third" number="singular" case="accusative" gender="feminine"/><result>അവളെ</result></test>
<test><source person="third" number="singular" case="genitive" gender="feminine"/><result>അവളുടെ</result></test>

<!-- Third person singular neuter (it/that) -->
<test><source person="third" number="singular" case="nominative" gender="neuter"/><result>അത്</result></test>
<test><source person="third" number="singular" case="accusative" gender="neuter"/><result>അത്</result></test>
<test><source person="third" number="singular" case="genitive" gender="neuter"/><result>അതിന്റെ</result></test>

<!-- Third person plural -->
<test><source person="third" number="plural" case="nominative"/><result>അവർ</result></test>
<test><source person="third" number="plural" case="accusative"/><result>അവരെ</result></test>
<test><source person="third" number="plural" case="genitive"/><result>അവരുടെ</result></test>

<!-- Reinflection examples -->
<test><source person="third">ഞാൻ</source><result gender="masculine">അവൻ</result></test>
<test><source person="third" gender="feminine">ഞാൻ</source><result>അവൾ</result></test>
<test><source person="second" register="formal">ഞാൻ</source><result>താങ്കൾ</result></test>
</inflectionTest>
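
One way to read the source-less pronoun tests above is as a lookup from a set of grammeme constraints to a surface form. The sketch below illustrates that reading only; it is not how the Inflection library resolves pronouns. `PronounKey` and `lookupPronoun` are hypothetical names, and the table holds just the third-person singular forms listed in the test file.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>

// Hypothetical key: one grammeme per category, mirroring the attributes on
// <source> in the pronoun tests. Not part of the Inflection API.
struct PronounKey {
    std::string person;
    std::string number;
    std::string grammaticalCase;
    std::string gender;

    // Lexicographic ordering so the key can be used in std::map.
    bool operator<(const PronounKey& other) const {
        return std::tie(person, number, grammaticalCase, gender)
             < std::tie(other.person, other.number, other.grammaticalCase, other.gender);
    }
};

// Returns the surface form for the key, or nullptr when the table has no entry.
const std::string* lookupPronoun(const PronounKey& key) {
    static const std::map<PronounKey, std::string> table = {
        {{"third", "singular", "nominative", "masculine"}, "അവൻ"},
        {{"third", "singular", "accusative", "masculine"}, "അവനെ"},
        {{"third", "singular", "genitive", "masculine"}, "അവന്റെ"},
        {{"third", "singular", "nominative", "feminine"}, "അവൾ"},
        {{"third", "singular", "accusative", "feminine"}, "അവളെ"},
        {{"third", "singular", "genitive", "feminine"}, "അവളുടെ"},
    };
    const auto it = table.find(key);
    return it == table.end() ? nullptr : &it->second;
}

// Small usage check against one form taken from the test file.
void demoLookup() {
    const std::string* form = lookupPronoun({"third", "singular", "genitive", "feminine"});
    assert(form != nullptr && *form == "അവളുടെ");
}
```

A key that is missing from the table yields `nullptr`, which keeps "no such form" distinct from an empty string.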
2 changes: 2 additions & 0 deletions inflection/test/src/inflection/util/LocaleUtilsTest.cpp
@@ -93,6 +93,7 @@ TEST_CASE("LocaleUtilsTest#testCoverage")
inflection::util::LocaleUtils::KOREAN(),
inflection::util::LocaleUtils::LITHUANIAN(),
inflection::util::LocaleUtils::MALAY(),
inflection::util::LocaleUtils::MALAYALAM(),
inflection::util::LocaleUtils::NORWEGIAN(),
inflection::util::LocaleUtils::DUTCH(),
inflection::util::LocaleUtils::POLISH(),
@@ -139,6 +140,7 @@ TEST_CASE("LocaleUtilsTest#testCoverage")
inflection::util::LocaleUtils::FRANCE(),
inflection::util::LocaleUtils::SWITZERLAND_FRENCH(),
inflection::util::LocaleUtils::INDIA_HINDI(),
inflection::util::LocaleUtils::INDIA_MALAYALAM(),
inflection::util::LocaleUtils::CROATIA(),
inflection::util::LocaleUtils::ISRAEL(),
inflection::util::LocaleUtils::HUNGARY(),