Reduced some allocations in QRCodeGenerator (NETCORE_APP only) #595
base: master
@@ -1017,8 +1018,15 @@ private static Polynom MultiplyAlphaPolynoms(Polynom polynomBase, Polynom polyno
}

// Identify and merge terms with the same exponent.
#if NETCOREAPP
var toGlue = GetNotUniqueExponents(resultPolynom, resultPolynom.Count <= 128 ? stackalloc int[128].Slice(0, resultPolynom.Count) : new int[resultPolynom.Count]);
Unfortunately I know too little about the QR spec, but the resultPolynom.Count <= 128 check could be avoided if, by spec, the count can never be that high. Alternatively, we can raise the threshold.
If the count can't be that high, then the fallback to the array allocation could also be removed.
When running the tests, it doesn't go over 64. Seems fine to me.
My tests showed the same (what a wonder 😉).
But in that regard I'm a bit paranoid: if the fallback is removed, what happens when a bigger buffer is needed (maybe now for some special inputs, or at some point in the future when new features are added)? This would result in an exception. With the fallback there's a safety net.
Unless it can be proven via the QR spec that the count will never exceed a certain threshold (but will that hold in the future too?).
If the non-stackalloc path were taken frequently, then renting from the array pool would be an option, but here it's assumed to be a rare or never-taken path.
Would you still remove the fallback, or shall we just leave it as is?
Oh I’d definitely leave the fallback. There’s just no reason to remove it. The JIT will realize it’s not a common pathway and optimize it accordingly. But as you say, neither of us knows enough about how these polynomials work to know if it will always be < 128.
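The pattern discussed in this thread (a fixed-size stack buffer with a heap fallback for oversized inputs) can be sketched as follows. This is a minimal illustration, not the PR's code: ScratchBufferDemo, CountDistinct and StackLimit are hypothetical names, and Span<T>.Sort assumes .NET 5 or later.

```csharp
using System;

public static class ScratchBufferDemo
{
    // Hypothetical threshold mirroring the 128 used in the diff (128 ints = 512 bytes).
    private const int StackLimit = 128;

    // Counts distinct values using a scratch buffer instead of a Dictionary/HashSet.
    public static int CountDistinct(ReadOnlySpan<int> values)
    {
        // Small inputs: scratch space lives on the stack, no heap allocation.
        // Large inputs: fall back to a normal array so nothing throws.
        Span<int> scratch = values.Length <= StackLimit
            ? stackalloc int[StackLimit]
            : new int[values.Length];

        // Trim the buffer to the exact size needed.
        scratch = scratch.Slice(0, values.Length);

        values.CopyTo(scratch);
        scratch.Sort();

        // After sorting, a new distinct value starts wherever an item differs from its predecessor.
        int distinct = 0;
        for (int i = 0; i < scratch.Length; i++)
        {
            if (i == 0 || scratch[i] != scratch[i - 1])
            {
                distinct++;
            }
        }

        return distinct;
    }
}
```

If the fallback path turned out to be hot, renting from ArrayPool<int>.Shared (and returning the array in a finally block) would be the alternative mentioned above; for a rare path, the plain allocation keeps the code simpler.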
if (toGlue.Contains(resultPolynom[i].Exponent))
#else
if (Array.IndexOf(toGlue, resultPolynom[i].Exponent) >= 0)
Previously LINQ was used with its generic Contains method. Array.IndexOf replaces it here because:
- for AOT, LINQ requires way more code
- it's an interface dispatch that isn't needed
Note: for .NET (Core) the span-based Contains is used.
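For reference, the two lookups being compared look roughly like this in isolation. ContainsDemo and its method names are hypothetical; the calls themselves are MemoryExtensions.Contains (available for spans on .NET Core 3.0 and later) and Array.IndexOf.

```csharp
using System;

public static class ContainsDemo
{
    // Span-based lookup. With no "using System.Linq" in scope, this binds to
    // MemoryExtensions.Contains<T>(ReadOnlySpan<T>, T).
    public static bool ContainsSpan(ReadOnlySpan<int> values, int item)
    {
        return values.Contains(item);
    }

    // Array-based lookup for targets without span support: no LINQ, no interface dispatch.
    public static bool ContainsArray(int[] values, int item)
    {
        return Array.IndexOf(values, item) >= 0;
    }
}
```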
@@ -1046,20 +1058,55 @@ private static Polynom MultiplyAlphaPolynoms(Polynom polynomBase, Polynom polyno
return resultPolynom;

// Auxiliary function to identify exponents that appear more than once in the polynomial.
int[] GetNotUniqueExponents(Polynom list)
#if NETCOREAPP
static ReadOnlySpan<int> GetNotUniqueExponents(Polynom list, Span<int> buffer)
This implementation doesn't need a Dictionary<,> to determine the non-unique exponents.
It works as follows:
- a scratch buffer of the same size as the list is passed in
- exponents are written / copied to that scratch buffer
- the scratch buffer is sorted, thus the exponents are in order
- each item in the scratch buffer (= ordered exponents) is compared with the previous one
  - if equal, then increment a counter
  - else check if the counter is > 0 and if so write the exponent to the result
For writing the result the same scratch buffer is used, as by definition the index to write the result is <= the iteration index, so no overlap, etc. can occur.
That way we avoid the need for a second scratch buffer.
Should something like this be added as a comment, or is it clear enough how it works?
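The steps described above can be sketched on plain integers as follows. This is an illustration under assumptions, not the code from the PR: DuplicateFinder and FindDuplicates are hypothetical names, the input is a span of ints rather than a Polynom, and Span<T>.Sort assumes .NET 5 or later.

```csharp
using System;
using System.Diagnostics;

public static class DuplicateFinder
{
    // Returns the values that occur more than once, written into the front of 'buffer'.
    public static ReadOnlySpan<int> FindDuplicates(ReadOnlySpan<int> exponents, Span<int> buffer)
    {
        // The caller must pass a scratch buffer of exactly the input size.
        Debug.Assert(exponents.Length == buffer.Length);

        // Copy and sort, so equal values become neighbours.
        exponents.CopyTo(buffer);
        buffer.Sort();

        int written = 0; // next write position for results (always behind the read index)
        int repeats = 0; // how often the current value has repeated so far

        for (int i = 1; i < buffer.Length; i++)
        {
            if (buffer[i] == buffer[i - 1])
            {
                repeats++;
            }
            else if (repeats > 0)
            {
                // A run of equal values just ended: record it once.
                buffer[written++] = buffer[i - 1];
                repeats = 0;
            }
        }

        // Handle a run that extends to the end of the buffer.
        if (repeats > 0)
        {
            buffer[written++] = buffer[buffer.Length - 1];
        }

        return buffer.Slice(0, written);
    }
}
```

Each recorded duplicate consumes at least two input items, so the write index can never catch up with the read index, which is why a single buffer suffices.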
The function isn't long; so long as the method comment is descriptive enough, I think it's fine.
{
var dic = new Dictionary<int, bool>(list.Count);
Debug.Assert(list.Count == buffer.Length);
Do you plan to leave the Debug.Assert code in here? Just wondering; it will get removed from release builds anyway.
I like to have some Debug.Asserts:
- they check some invariants in debug builds and while running tests
- they self-document the intention (so why write a comment when an assert can do the same?)
Especially here, the buffer is passed in as an argument and must have the correct size.
If the size doesn't match, then the tests (under debug) will fail and one knows why. Otherwise it may be hard to track down the bug.
So I'd leave them in the code.
As you said: for !DEBUG these asserts won't have any effect.
I wouldn't. There have been so many performance enhancements in .NET Core since .NET Framework that if anyone wants better performance, they should use .NET Core. And I'm sure the fallback performance is plenty good enough anyway.
The reduction in allocations is very impressive!
I fixed the CI scripts in #592 btw
Looks good to me! (subject to tests passing, of course)
Honestly yes…
Profiling showed that there are some allocations in QRCodeGenerator that can be quite easily avoided.
simple console app for profiling
Allocations are removed / avoided for:
- Dictionary<,>.Entry[]
- QRCodeGenerator.PolynomItem[]
- Dictionary<,>
The change is done only for .NET (Core) targets, as Span<T> is used. By adding a reference to the System.Memory package this change could also be done for .NET Desktop.
Profile: before and after (screenshots not included)
Benchmarks: before and after (results not included)