@tmfink
Contributor

@tmfink tmfink commented Sep 16, 2020

Work-in-progress to handle #3.

Please let me know your thoughts on this style. So far I have only worked on katakana_to_hiragana(). Once we figure out the best way to do this, I will work on the other modules.

Add iterator input APIs to

  • is_hiragana
    • is_hiragana()
  • is_japanese
    • is_japanese()
  • is_kana
    • is_kana()
  • is_kanji
    • contains_kanji()
    • is_kanji()
  • is_katakana
    • is_katakana()
  • is_mixed
    • is_mixed()
    • is_mixed_pass_kanji()
  • is_romaji
    • is_romaji()
  • to_hiragana
    • to_hiragana()
    • to_hiragana_with_opt()
  • to_kana
    • to_kana()
    • to_kana_with_opt()
  • to_katakana
    • to_katakana()
    • to_katakana_with_opt()
  • to_romaji
    • to_romaji()
    • to_romaji_with_opt()
  • tokenize
    • tokenize()
    • tokenize_detailed()
    • tokenize_with_opt()
  • trim_okurigana
    • is_invalid_matcher()
    • is_leading_without_initial_kana()
    • is_trailing_without_final_kana()
    • trim_okurigana()
    • trim_okurigana_with_opt()
  • utils
    • hiragana_to_katakana()
    • katakana_to_hiragana()
    • romaji_to_hiragana()

Pull out body of katakana_to_hiragana_with_opt() into separate function
that takes a char iterator as input.
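
For readers following along, the refactor described above could look roughly like the sketch below. It is a minimal illustration, not the code in this PR: the name katakana_iter_to_hiragana() and the simplified conversion (shifting U+30A1..=U+30F6 down by 0x60, with no options and no long-vowel handling) are assumptions made for the example.

```rust
/// Hypothetical iterator-based core (name and behaviour are assumptions):
/// lazily maps basic katakana to hiragana and passes everything else through.
pub fn katakana_iter_to_hiragana<I>(input: I) -> impl Iterator<Item = char>
where
    I: Iterator<Item = char>,
{
    input.map(|c| match c {
        // The basic katakana block sits exactly 0x60 above the hiragana block.
        'ァ'..='ヶ' => char::from_u32(c as u32 - 0x60).unwrap_or(c),
        _ => c,
    })
}

/// Thin wrapper that keeps a string-based signature for existing callers.
pub fn katakana_to_hiragana(input: &str) -> String {
    katakana_iter_to_hiragana(input.chars()).collect()
}
```

With this split, string callers keep calling the wrapper, while callers that already hold a char iterator can use the core function without building an intermediate String.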
@PSeitz
Owner

PSeitz commented Sep 16, 2020

There is a place where the next char is peeked:

} else if is_romaji(input) || input.chars().next().map(|c| is_char_english_punctuation(c)).unwrap_or(false) {
    // TODO: is it correct to check only the first char (see src\utils\isCharEnglishPunctuation.js)
    romaji_to_hiragana(input, config)

So it would require a peekable iterator I think
https://doc.rust-lang.org/std/iter/struct.Peekable.html

There may be other similar cases.
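
A minimal sketch of how the first-char check could work on an iterator using std::iter::Peekable. The helper starts_with_english_punctuation() is hypothetical, and the body of is_char_english_punctuation() here is only a stand-in (it uses char::is_ascii_punctuation); the crate's real predicate may differ.

```rust
use std::iter::Peekable;

/// Stand-in for the crate's punctuation check (assumption for this sketch).
fn is_char_english_punctuation(c: char) -> bool {
    c.is_ascii_punctuation()
}

/// Hypothetical helper: inspects the first char without consuming it,
/// so the caller can still pass the full input on afterwards.
fn starts_with_english_punctuation<I>(input: &mut Peekable<I>) -> bool
where
    I: Iterator<Item = char>,
{
    input.peek().map_or(false, |&c| is_char_english_punctuation(c))
}
```

Because peek() only borrows the next item, the iterator still yields the first char when it is later handed to the conversion function.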

@PSeitz
Owner

PSeitz commented Sep 30, 2020

On second thought, we could always just use the minimal api instead of a uniform one, which would not require Peekable here. What do you think?

@tmfink
Contributor Author

tmfink commented Oct 1, 2020

> On second thought, we could always just use the minimal api instead of a uniform one, which would not require Peekable here. What do you think?

@PSeitz What do you mean by minimal vs. uniform API?

In some cases, we can avoid the "direct indexing" approach. For katakana_iter_to_hiragana_with_opt(), I added a previous_char variable to track the previous character (instead of indexing back into the input at the previous position).
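
The previous_char pattern can be sketched as follows. This is an illustration only: the thread does not show what the real function does with the previous character, so the rule used here (a long-vowel mark 'ー' repeats the previous char when that char is a plain hiragana vowel) and the function name expand_long_vowel() are assumptions.

```rust
/// Hypothetical sketch: carries `previous_char` through the loop instead of
/// indexing back into the input, so any char iterator can be used.
fn expand_long_vowel<I>(input: I) -> String
where
    I: Iterator<Item = char>,
{
    let mut out = String::new();
    let mut previous_char: Option<char> = None;
    for c in input {
        let converted = match (c, previous_char) {
            // Simplified rule (assumption): only plain vowels are repeated.
            ('ー', Some(p)) if "あいうえお".contains(p) => p,
            _ => c,
        };
        out.push(converted);
        // Remember the emitted char for the next iteration.
        previous_char = Some(converted);
    }
    out
}
```

The state lives in one Option<char>, so the function needs nothing beyond Iterator<Item = char> from its input.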

@PSeitz
Owner

PSeitz commented Oct 1, 2020

> @PSeitz What do you mean by minimal vs. uniform API?

I mean defining a single iterator API that all methods use, vs. giving each method the minimal API it needs.
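
One way to read the two options in code, as a rough sketch. Both function names are hypothetical, and using Peekable as the shared type in the uniform variant is an assumption based on the earlier comment.

```rust
use std::iter::Peekable;

/// Uniform style (assumed shape): every function takes the same input type,
/// here `Peekable<I>`, even if it never calls `peek()`.
fn is_all_ascii_uniform<I>(mut input: Peekable<I>) -> bool
where
    I: Iterator<Item = char>,
{
    input.all(|c| c.is_ascii())
}

/// Minimal style: each function asks only for the bound it actually needs.
fn is_all_ascii_minimal<I>(mut input: I) -> bool
where
    I: Iterator<Item = char>,
{
    input.all(|c| c.is_ascii())
}
```

Since Peekable<I> is itself an Iterator, a function written in the minimal style also accepts a peekable input, while the uniform style forces every caller to wrap its iterator first.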


2 participants