-
Notifications
You must be signed in to change notification settings - Fork 11
Flatten bytes class hierarchy #40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
All contributors have signed the CLA ✍️ ✅ |
|
I have read the CLA Document and I hereby sign the CLA |
|
recheck |
81cb3e2 to
46f5de9
Compare
0625d27 to
2dbda56
Compare
1b4e159 to
8a24b0e
Compare
|
After this comment I will make this PR ready to be pulled in without stopping any development on Tuweni v1 as I'm leaving the current I will squash all changes so far and will create an entire new package |
8a24b0e to
1c641ff
Compare
|
recheck |
cc29c50 to
dff599a
Compare
macfarla
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to add a consensys copyright statement? Are both v1 and v2 tests running?
bytes/src/main/java/org/apache/tuweni/v2/bytes/ArrayWrappingBytes.java
Outdated
Show resolved
Hide resolved
bytes/src/main/java/org/apache/tuweni/v2/bytes/ArrayWrappingBytes.java
Outdated
Show resolved
Hide resolved
|
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
bytes/src/jmh/java/org/benchmark/BytesMegamorphicBenchmarkV1.java
Outdated
Show resolved
Hide resolved
bytes/src/jmh/java/org/benchmark/BytesMegamorphicBenchmarkV2.java
Outdated
Show resolved
Hide resolved
| if (offset == 0 && length == bytes.length) { | ||
| return bytes; | ||
| } | ||
| return Arrays.copyOfRange(bytes, offset, offset + length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand to toArray() is removed, correct ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'm not convinced it's needed. At least not needed for tuweni or discovery so far
|
|
||
| @Override | ||
| protected void and(byte[] bytesArray, int offset, int length) { | ||
| Utils.and(this.bytes, this.offset, bytesArray, offset, length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to have jmh benchmarks on and other binary operations below. I think it will show a pretty good improvement. This is not a blocking comment, but a nice to have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's do that in another PR
| * <p>This class may be used to create more types that represent bytes, but need a different name | ||
| * for business logic. | ||
| */ | ||
| public class DelegatingBytes extends Bytes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shiftLeft and shiftRight doesn't exist anymore in this implementation, is that intended ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I have moved all to the MutableBytes. Now can do multiple operations on Bytes and avoid copying and we are explicit that we are creating copies
| protected void xor(byte[] bytesArray, int offset, int length) { | ||
| Utils.xor(this.bytes, this.offset, bytesArray, offset, length); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sens to add shiftLeft and shiftRight here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as #40 (comment)
| import java.util.Random; | ||
|
|
||
| /** A {@link Bytes} value that is guaranteed to contain exactly 32 bytes. */ | ||
| public final class Bytes32 extends DelegatingBytes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to rename this class DelegatingBytes32.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why though? That's an implementation detail
| import java.util.Random; | ||
|
|
||
| /** A {@link Bytes} value that is guaranteed to contain exactly 48 bytes. */ | ||
| public final class Bytes48 extends DelegatingBytes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest renaming this class DelegatingBytes48.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as #40 (comment)
| return bytes48; | ||
| } | ||
| checkArgument(slice.size() == SIZE, "Expected %s bytes but got %s", SIZE, slice.size()); | ||
| return new Bytes48(slice.getImpl()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why mot just new Bytes48(slice) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, because of DelegatingBytes. I wonder if is not simpler to have another overloaded method with DelegatingBytes as a parameter and delete getImpl() in Bytes.java.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, because of DelegatingBytes so we need to get straight access to the implementation to avoid stacked method call that just loops.
Not sure I understand the suggestion - do you mean casting Bytes to DelegatingBytes ?
bytes/src/main/java/org/apache/tuweni/v2/bytes/ConcatenatedBytes.java
Outdated
Show resolved
Hide resolved
| for (Bytes value : values) { | ||
| int size = value.size(); | ||
| try { | ||
| totalSize = Math.addExact(totalSize, size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of throwing and then catching the exception, I suggest replacing with the code of Math.addExact :
int result = totalSize + size;
// HD 2-12 Overflow iff both arguments have the opposite sign of the result
if (((totalSize ^ result) & (size ^ result)) < 0) {
throw new IllegalArgumentException(
"Combined length of values is too long (> Integer.MAX_VALUE)");
}
totalSize = result;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but are you sure the method is not going to be inlined anyway? I would suggest creating a benchmark to prove that
|
I can confirm that the V2 implementations has better performance for megamorphic use case and similar performances for the non megamorphic one |
| public static Bytes keccak256(Bytes input) { | ||
| try { | ||
| return (Bytes32) digestUsingAlgorithm(input, KECCAK_256); | ||
| return digestUsingAlgorithm(input, KECCAK_256); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, we're loosing the information that the returned Bytes from keccak256 is 32 bytes. I wonder if there're edge cases to this, as Bytes can store more than 32 bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree you loose the developer friendliness. Though for performance might be justified. Anyway, we are leaving both versions of the code in anyways so we have time to decide after this PR, IMO
| public static Bytes sha3_256(Bytes input) { | ||
| try { | ||
| return (Bytes32) digestUsingAlgorithm(input, SHA3_256); | ||
| return digestUsingAlgorithm(input, SHA3_256); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same as for keccak256
|
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
|
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
|
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
|
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
| result[i] = this.ints[i] & other; | ||
| } | ||
| return new UInt256(result); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Bitwise Operation Errors in UInt Classes
The and, or, and xor methods in both UInt256 and UInt384 classes contain a byte indexing error when operating on Bytes or Bytes32 parameters. The byte offset calculation incorrectly uses the ints array index (i) instead of the bytes parameter index (j), leading to incorrect bitwise results and potential IndexOutOfBoundsException.
Additional Locations (4)
units/src/main/java/org/apache/tuweni/v2/units/bigints/UInt384.java#L457-L464units/src/main/java/org/apache/tuweni/v2/units/bigints/UInt384.java#L487-L494units/src/main/java/org/apache/tuweni/v2/units/bigints/UInt384.java#L519-L526units/src/main/java/org/apache/tuweni/v2/units/bigints/UInt256.java#L637-L644
| fun create(keyPair: SECP256K1.KeyPair, now: Long): Packet { | ||
| val expiration = expirationFor(now) | ||
| val sigHash = createSignature( | ||
| PacketType.ENRRESPONSE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| try { | ||
| SHA256Hash.Hash result = SHA256Hash.hash(shaInput); | ||
| try { | ||
| return SHA256Hash.hash(shaInput).bytes(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Double Hashing Causes Memory Leak
The sha2_256 methods (both byte[] and Bytes overloads) compute the hash twice when Sodium is available. The first hash object is stored and destroyed, but a new hash object is created and returned from a second computation, which is then leaked.
Additional Locations (1)
|
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
| checkElementIndex(offset, bytes.length); | ||
| } | ||
| checkLength(bytes, offset); | ||
| return new ArrayWrappingBytes(bytes, offset, bytes.length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Array Wrapper Overruns Bounds: Incorrect Length
The fromArray method creates an ArrayWrappingBytes with bytes.length as the length parameter, which attempts to wrap more bytes than available when offset > 0. This causes the wrapper to extend beyond the array bounds. The length should be SIZE (32 bytes) to correctly wrap exactly 32 bytes starting from the offset, matching the method's documented behavior and the validation performed by checkLength.
| @Override | ||
| public void appendTo(ByteBuffer byteBuffer) { | ||
| byteBuffer.put(this.byteBuffer); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Append Ignores Slice Boundaries
The appendTo method puts the entire byteBuffer into the target buffer, ignoring the offset and size() of this wrapper. When a ByteBufferWrappingBytes represents a slice (constructed with offset and length), this appends the wrong bytes. The method transfers from the buffer's current position/limit rather than the intended slice range defined by offset and size().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it's not a bug
ad67e19 to
a4def2a
Compare
| private static final int SIZE = 32; | ||
|
|
||
| /** A {@code Bytes32} containing all zero bytes */ | ||
| public static final Bytes ZERO = fromByte((byte) 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Type Mismatch: ZERO Constant Threatens Integrity
The ZERO constant is declared as type Bytes instead of Bytes32, causing a type inconsistency. The javadoc states it's a Bytes32 containing all zero bytes, but the actual type doesn't match. This breaks type safety and could cause issues when code expects a Bytes32 instance but receives a generic Bytes reference.
| public static final int SIZE = 48; | ||
|
|
||
| /** A {@code Bytes48} containing all zero bytes */ | ||
| public static final Bytes ZERO = fromByte((byte) 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: ZERO Constant: Type Inconsistency Threatens Safety
The ZERO constant is declared as type Bytes instead of Bytes48, causing a type inconsistency. The javadoc states it's a Bytes48 containing all zero bytes, but the actual type doesn't match. This breaks type safety and could cause issues when code expects a Bytes48 instance but receives a generic Bytes reference.
| byte[] newBytesArray = new byte[otherSize]; | ||
| System.arraycopy(bytesArray, 0, newBytesArray, otherSize - size, size); | ||
| bytesArray = newBytesArray; | ||
| size = otherSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Immutability Contract Violated: Stale Hashcodes
The and, or, and xor methods mutate the size field after object construction when resizing the internal array. Since size is inherited from the immutable parent Bytes class and used for hashcode caching, this mutation breaks the immutability contract and can cause cached hashcodes to become stale, leading to incorrect behavior in hash-based collections.
| * | ||
| * @return The number of bytes this value represents. | ||
| */ | ||
| public int size() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this final
| return delegate.trimLeadingZeros(); | ||
| } | ||
|
|
||
| @Override |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: DelegatingBytes missing trimTrailingZeros override
The DelegatingBytes class overrides trimLeadingZeros() to delegate to its delegate (line 206-208), but is missing an override for trimTrailingZeros(). This means calls to trimTrailingZeros() will use the base Bytes implementation instead of delegating to the wrapped delegate, breaking the delegation pattern and potentially causing inconsistent behavior compared to other delegated methods.
This is a collapsed view of the previous old commits: * 0929d61 Merge remote-tracking branch 'upstream/main' into flatten-Bytes-class-hierarchy * | 6d79491 Miscelaneous fixes * | a4def2a Fix memleak in Hash and bug in devp2p Packet creation * | 10e6e81 fix bug in and/or/xor operations with UInt256/UInt384 * | 578e6bb Move bound checks for Bytes implementation out of constructors * | b9c6b05 Merge branch 'main' into flatten-Bytes-class-hierarchy * 4100d8b review feedback * afba2b9 change megamorphic benchmarks to slice and optimize size() * b2066fc JMH integration fixes * e40d796 remove InterfaceCall and VirtualCall benchmarks * dff599a fix ConnectTwoServersTest - missed package changes and port bindings * 5aba562 add missing package-info.java files * 1c641ff fix failure in DefaultDiscoveryV5ServiceTest about reusing ports * 279acd6 bring in old tuweni v1 after move * ace7910 move all other touched files to org.apache.tuweni.v2 package * 65f217b mv org.apache.tuweni.bytes.v2 to org.apache.tuweni.v2.bytes * ebaa8bd Implement Tuweni v2 Signed-off-by: Luis Pinto <[email protected]>
Signed-off-by: Luis Pinto <[email protected]>
This is for easy reviewing by making the PR much shorter. Signed-off-by: Luis Pinto <[email protected]>
cddbcca to
a646271
Compare
|
I reverted changes made to other projects than bytes in order to make it more reviewable. Will issue separate PR's for other projects |
| @Override | ||
| public void appendTo(ByteBuffer byteBuffer) { | ||
| byteBuffer.put(this.byteBuffer); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: appendTo ignores offset and size in ByteBuffer wrapper
The appendTo(ByteBuffer) method in ByteBufferWrappingBytes ignores the offset and size fields, putting the entire source ByteBuffer instead of just the wrapped slice. All other methods in this class (like get, getInt, getLong, and, or, xor, equals, computeHashcode, toArray) correctly use offset + i when accessing data. However, appendTo just calls byteBuffer.put(this.byteBuffer) which will write from the source buffer's position to its limit, disregarding the stored offset entirely. This contrasts with BufferWrappingBytes and ByteBufWrappingBytes which slice their buffers in the constructor, and with ArrayWrappingBytes which correctly passes offset and size() to the put operation.
| @Override | ||
| public byte get(int i) { | ||
| return this.value; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: ConstantBytesValue.get missing required bounds check
The get(int i) method in ConstantBytesValue returns the constant value without checking bounds, violating the documented contract of the abstract Bytes.get(int i) method which specifies it should throw IndexOutOfBoundsException if i < 0 or i >= size(). All other implementations (ArrayWrappingBytes, ConcatenatedBytes, etc.) either explicitly call checkElementIndex or rely on the underlying buffer to throw. This means Bytes.repeat((byte) 0, 5).get(100) would incorrectly return 0 instead of throwing an exception.
fab-10
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
first partial pass
| } | ||
| } | ||
|
|
||
| public static void checkElementIndex(int index, int size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could be rewritten using checkArgument like for checkLength with a formatted string highlighting the arugments
|
|
||
| static final int MAX_UNSIGNED_SHORT = (1 << 16) - 1; | ||
| static final long MAX_UNSIGNED_INT = (1L << 32) - 1; | ||
| static final long MAX_UNSIGNED_LONG = Long.MAX_VALUE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the name is misleading, because the value is max signed long.
Moreover, it is not used internally, so could be removed since not public
| @Override | ||
| public Bytes slice(int offset, int length) { | ||
| checkArgument(length >= 0, "Invalid negative length"); | ||
| if (size() > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we allow 0 sized constant bytes in the first place?
| if (this.size() != other.size()) { | ||
| return false; | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IDK if it is worth to add another check that if the other is of the same type then just compare the single value
| @Override | ||
| protected int computeHashcode() { | ||
| int result = 1; | ||
| for (int i = 0; i < size(); i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there really the need to cycle to have a good hash code?
| return EMPTY; | ||
| } | ||
| if (count == values.size()) { | ||
| return new ConcatenatedBytes(values.toArray(new Bytes[0]), totalSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using new Bytes[count] you can save one allocation
| return new ConcatenatedBytes(concatenated, totalSize); | ||
| } | ||
|
|
||
| static Bytes create(List<Bytes> values) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems there are no unit tests for this method
| * A Bytes value with just one constant value throughout. Ideal to avoid allocating large byte | ||
| * arrays filled with the same byte. | ||
| */ | ||
| class ConstantBytesValue extends Bytes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no unit tests for this class
| } | ||
|
|
||
| @Override | ||
| public Bytes getImpl() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be package
| return new Bytes32(fromArray(bytes, offset)); | ||
| } | ||
|
|
||
| public static Bytes fromArray(byte[] bytes, int offset) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should not this return a Bytes32?
Same for similar methods in the class
PR description
This PR simplifies the Tuweni class hierarchy of the
bytesproject and attempts to get some performance gains by reducing memory churn, specially for copying byte arrays and Bytes instances. Bytes have now a much flatter class hierarchy that will hopefully improve inlining and reduce time onitablelookups which was identified as one of the problems in Besu for low performance. The implementation ofBytesused at runtime is now highly biased towardsArrayWrappingByteswhich makes it easier for JIT to inline and optimize.Also mutable operations wil be moved into
MutableBytesexclusively to show intention in copies. This avoids copies on sequences of logic operations, e.g.MutableBytes.create(10).or(Bytes.fromHex("0x11")).shiftRight(2)no further copies occur after initial instance creation.Other
Bytessubtypes should be immutable from now on to boost performance.Note
Introduce a new flattened bytes v2 implementation with dedicated mutable ops, comprehensive tests, and JMH benchmarks, plus Gradle integration for running benchmarks.
org.apache.tuweni.v2.bytes):Bytescore with concrete wrappers:ArrayWrappingBytes,ConcatenatedBytes,ByteBufferWrappingBytes,BufferWrappingBytes,ByteBufWrappingBytes, and constantConstantBytesValue.MutableBytesfor in-place ops (bitwise AND/OR/XOR/NOT, shifts, pad, increment/decrement) while keeping other implementations immutable.Bytes32andBytes48, parsing/encoding helpers, and utilities (BytesValues,Utils).BytesMegamorphicBenchmarkV1andBytesMegamorphicBenchmarkV2to compare old vs new bytes behavior.assertj-coretest dependency inbytesmodule.Written by Cursor Bugbot for commit a646271. This will update automatically on new commits. Configure here.