Game metadata should be in the original language. #586

@trap15

Description

@trap15
Contributor

Using romanizations for game metadata is fairly lossy and at the very least it's poor documentation. I propose there should be a secondary title string for the original language, keeping the romanized version.

Ideally, the developer could also use UTF-8 to write it in, instead of using escaped unicode. For this, srcclean needs to stop horribly destroying unicode for no reason.

Activity

felipesanches

felipesanches commented on Jan 26, 2016

@felipesanches
Contributor

Some proposals:

This works, but it is terribly unreadable.

GAME( 1992, wwmarine, 0, segac2, wwmarine, segac2_state, bloxeedc, ROT0, "Sega", "Waku Waku Marine \u308f\u304f\u308f\u304f\u30de\u30ea\u30f3", 0 )
GAME( 1991, soniccar, 0, segac2, soniccar, segac2_state, bloxeedc, ROT0, "Sega", "Waku Waku Sonic Patrol Car \u308f\u304f\u308f\u304f\u30bd\u30cb\u30c3\u30af\u30d1\u30c8\u30ab\u30fc", 0 )

This seems much better!

GAME( 1992, wwmarine, 0, segac2, wwmarine, segac2_state, bloxeedc, ROT0, "Sega", "Waku Waku Marine わくわくマリン", 0 )
GAME( 1991, soniccar, 0, segac2, soniccar, segac2_state, bloxeedc, ROT0, "Sega", "Waku Waku Sonic Patrol Car わくわくソニックパトカー", 0 )
trap15

trap15 commented on Jan 26, 2016

@trap15
Contributor, Author

I'd like to put it in two different fields, so something like:

GAME( 1992, wwmarine, 0, segac2, wwmarine, segac2_state, bloxeedc, ROT0, "Sega", "Waku Waku Marine", "わくわくマリン", 0 )

Or for fields where they're the same, maybe something like

GAME( 1992, wwmarine, 0, segac2, wwmarine, segac2_state, bloxeedc, ROT0, "Sega", "Waku Waku Marine", "", 0 )
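
A minimal sketch of how a front-end might consume the two-field layout proposed above, treating an empty native title as "same as the romanized one". The names `GameInfo`, `name_romanized`, `name_native` and `display_title` are hypothetical and are not part of MAME's actual driver structures.

```cpp
#include <cassert>
#include <string>

// Hypothetical metadata record; not MAME's actual driver layout.
struct GameInfo {
    const char *name_romanized; // e.g. "Waku Waku Marine"; always required
    const char *name_native;    // UTF-8 original-language title, or "" if identical
};

// Pick the title to show: the native title when the caller prefers it and
// one was provided, otherwise the romanized title.
std::string display_title(const GameInfo &info, bool prefer_native)
{
    assert(info.name_romanized != nullptr);
    const bool has_native =
        info.name_native != nullptr && info.name_native[0] != '\0';
    return (prefer_native && has_native) ? info.name_native
                                         : info.name_romanized;
}
```

With this shape, an entry such as `{"Waku Waku Marine", ""}` would simply fall back to the romanized string, so existing entries would only need an empty second field.
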
felipesanches

felipesanches commented on Jan 26, 2016

@felipesanches
Contributor

What about the case of accented Latin text?

Could it be like this?

COMP( 1972, patinho,  0,        0,      patinho_feio,  patinho_feio, patinho_feio_state, patinho_feio, "Escola Politécnica - Universidade de São Paulo", "", "Patinho Feio", "", MACHINE_NO_SOUND_HW | MACHINE_NOT_WORKING)

Bear in mind that both the "company" and the "full name" fields may have Unicode characters...
We may have to use some macros to make it nicer and cleaner. I can understand the idea of using null strings, but I'd prefer not to have them visible in the source at all.

angelosa

angelosa commented on Jan 26, 2016

@angelosa
Member

As things stand, I'm inclined to think that the current MAME categorization system isn't in any way adequate for 2016 standards.
Random examples:

  • There's no way to tell that sf2j is a Japanese version and sf2u is a US one other than the description tag. Region data might enable a decent option for front-ends: "show European released games only", for example.
  • The parenthesized field could be automated, instead of hand-writing "Set 4", "Hong Kong bootleg" or "931101".
  • How many macros do we have for building systems anyway? Add yet another optional field just for the localized name and you'll get an even fancier macro-hell scheme, which makes managing things even harder.

Bottom line: I don't personally like the software list / XML system either (namely because it is unreadable by the human eye in raw format), but it certainly handles these "optional" things, like non-romaji alphabets, just fine.

felipesanches

felipesanches commented on Jan 26, 2016

@felipesanches
Contributor

@angelosa Could you please open a new issue on GitHub specifically about the broader topic of improving the way MAME stores metadata in general? I think you've got valid points and I would add some more comments on that, but I'd prefer to keep this issue focused on the Unicode strings and have all other discussion in a separate issue, for the sake of clarity and better organization of the issues at hand.

trap15

trap15 commented on Jan 26, 2016

@trap15
Contributor, Author

Could use compound literals?

GAME_ADD((GameInfo){
  .name="わくわくマリン",
  .name_romanized="Waku Waku Marine",
  .year=1992,
  [etc...]
})

In which case defaults end up being 0/NULL. More readable than XML, and keeps it in the driver.
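
Compound literals are a C99 feature that GCC and Clang accept in C++ only as an extension, and designated initializers became standard C++ only in C++20. As a rough sketch of the same "unset fields default to 0/NULL" idea in plain C++14, default member initializers plus chained setters could be used. Everything here (`GameInfo`, the `with_*` setters, `check`) is hypothetical and is not an existing MAME interface.

```cpp
#include <cassert>

// Hypothetical record: every field defaults to 0/nullptr unless set.
struct GameInfo {
    const char *name = nullptr;           // original-language title (UTF-8)
    const char *name_romanized = nullptr; // romanized title
    int year = 0;

    // Each setter returns a modified copy, so calls can be chained and
    // any field that is never mentioned keeps its default.
    GameInfo with_name(const char *v) const { GameInfo g = *this; g.name = v; return g; }
    GameInfo with_name_romanized(const char *v) const { GameInfo g = *this; g.name_romanized = v; return g; }
    GameInfo with_year(int v) const { GameInfo g = *this; g.year = v; return g; }
};

// Sanity check under the assumption that a romanized title is mandatory.
inline void check(const GameInfo &g) { assert(g.name_romanized != nullptr); }
```

An entry would then read `GameInfo().with_name_romanized("Waku Waku Marine").with_year(1992)`, which leaves `name` as `nullptr` when the original title matches the romanized one.
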

felipesanches

felipesanches commented on Jan 26, 2016

@felipesanches
Contributor

The file src/mame/drivers/cps2.cpp has got almost 300 GAME entries... It would be good to keep all metadata in a single line if possible. But sometimes, indeed the lines get really long!

felipesanches

felipesanches commented on Jan 26, 2016

@felipesanches
Contributor

A tool called pyftsubset in the fonttools project (https://github.com/behdad/fonttools/) may be able to generate the Noto font subset that I suggested on IRC earlier today for Unicode metadata strings in MAME.

Noto is a libre font family being developed by Google to have a very wide glyph coverage. So it is essentially a font designed to fulfill needs of ambitious multi-language projects like this.

But the problem is that such a font family has very large file sizes. So the idea is that we should generate a minimal font subset that contains only the glyphs needed. Before packaging a new MAME release, we would run an automatic subsetting script that lists all Unicode code points used in the metadata strings declared in MAME's codebase. The generated minimal font file would then be added as a program resource and loaded by default in the MAME UI.

This would guarantee that all metadata would be properly rendered in our user interface.
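
The "list all code points" step described above could look roughly like the sketch below: decode each UTF-8 metadata string, collect the distinct code points, and print them in `U+XXXX` form for a subsetting tool. The function names are hypothetical, and the decoder is deliberately simple: it skips malformed bytes and does not reject overlong encodings or surrogates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <set>
#include <string>

// Decode a UTF-8 string and add every code point found to `out`.
// Malformed or truncated sequences are skipped rather than reported.
void collect_codepoints(const std::string &utf8, std::set<char32_t> &out)
{
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t cp;
        std::size_t len;
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else { ++i; continue; }            // stray continuation or invalid lead byte
        if (i + len > utf8.size()) break;  // truncated sequence at end of string
        bool ok = true;
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (c & 0x3F);
        }
        if (ok && cp <= 0x10FFFF) { out.insert(cp); i += len; }
        else { ++i; }                      // resynchronize on the next byte
    }
}

// Format the collected set as a comma-separated "U+XXXX" list, in ascending order.
std::string format_codepoints(const std::set<char32_t> &cps)
{
    std::string result;
    for (char32_t cp : cps) {
        assert(cp <= 0x10FFFF);
        char buf[16];
        std::snprintf(buf, sizeof buf, "U+%04X", static_cast<unsigned>(cp));
        if (!result.empty()) result += ',';
        result += buf;
    }
    return result;
}
```

Running `collect_codepoints` over every company and title string and passing the formatted list to the subsetter would keep the bundled font limited to glyphs that actually appear in the metadata.
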

felipesanches

felipesanches commented on Jan 26, 2016

@felipesanches
Contributor

Oh! And by the way... here's the Noto libre font project website:
https://www.google.com/get/noto/

          Game metadata should be in the original language. · Issue #586 · mamedev/mame