Optionally switch Identifiers form std::String to a different type in order to support string interning#30
Conversation
|
Hi. Thank you for making this pull request. Just out of curiosity, did you make any benchmarks to compare performance with and without an interner? I wonder what kind of impact this would have. Do you have an example how to plug an interner into this using the new trait? I think not having a receiver in I would be also fine just vendoring a simple interner, given it is justified in terms of performance. |
|
Hi, yes I was using a lasy_static RwCell global for the reciver, which isn't a very flexible approach. I now reworked my commit in such a way that the the parse function takes a interner by mutable reference that is then used to to intern the symbols. |
|
I did some quick tests with Feel free to do your own benchmarks and add results here. I suspect benefits may be larger for the code that actually makes use of the AST. But as it is now, to be honest, this doesn't seem to me like a big enough improvement to justify all the effort and the added complexity of making AST fully generic, and the extra work necessary to merge this (e.g. printer support). |
Hello,
as you probably are aware of, many compilers use string interning on identifier and type name tokens in order to avoid heap allocation and to allow for faster equality comparison between identifier names. Generally Strings are interned during tolkinization. Unfortunately lang-c currently only allows for identifiers and type names to be stored in
std::String. As I don't want to change the default behaviour, introduce dependencies on external crates or bless a particular string interning library, I came up with a solution where the ast and parser are converted into generic functions. The only issue with this solution is, that I somehow have to work aroundrust-peg's limitations. This is done by some string substitution on theparser.rs.raw. While I tried my best to modify the code in a fashion, that avoids breaking the public API, I cannot guarantee, whether I succeeded.I am fully aware, that this use of
rust-pegcould be considered a hack and you might be reluctant to merge this. Please fell free to close this pull request in this case. Given that you considered merging #18 and I wasn't certain, how this would effect my string substitution, I also made a version, where this change is applied, https://github.com/nacaclanga/lang-c/tree/generic_ident_conditional, which you could consider merging instead.Merry Christmas