-
Notifications
You must be signed in to change notification settings - Fork 1.1k
[DatastreamToSpanner] Integration test for CDC processing algorithm for Live migration template #3018
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[DatastreamToSpanner] Integration test for CDC processing algorithm for Live migration template #3018
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3018 +/- ##
============================================
+ Coverage 50.25% 53.88% +3.62%
+ Complexity 5020 2137 -2883
============================================
Files 967 484 -483
Lines 59261 27905 -31356
Branches 6458 2933 -3525
============================================
- Hits 29783 15037 -14746
+ Misses 27374 11939 -15435
+ Partials 2104 929 -1175
🚀 New features to boost your workflow:
|
4771b32 to
b22f585
Compare
...com/google/cloud/teleport/v2/templates/failureinjectiontesting/DataStreamToSpannerCDCFT.java
Outdated
Show resolved
Hide resolved
...com/google/cloud/teleport/v2/templates/failureinjectiontesting/DataStreamToSpannerCDCFT.java
Show resolved
Hide resolved
...com/google/cloud/teleport/v2/templates/failureinjectiontesting/DataStreamToSpannerCDCFT.java
Show resolved
Hide resolved
...com/google/cloud/teleport/v2/templates/failureinjectiontesting/DataStreamToSpannerCDCFT.java
Show resolved
Hide resolved
...oogle/cloud/teleport/v2/spanner/testutils/failureinjectiontesting/FuzzyCDCLoadGenerator.java
Show resolved
Hide resolved
...ogle/cloud/teleport/v2/spanner/testutils/failureinjectiontesting/CDCCorrectnessTestUtil.java
Outdated
Show resolved
Hide resolved
...ogle/cloud/teleport/v2/spanner/testutils/failureinjectiontesting/CDCCorrectnessTestUtil.java
Outdated
Show resolved
Hide resolved
...ogle/cloud/teleport/v2/spanner/testutils/failureinjectiontesting/CDCCorrectnessTestUtil.java
Outdated
Show resolved
Hide resolved
...com/google/cloud/teleport/v2/templates/failureinjectiontesting/DataStreamToSpannerCDCFT.java
Show resolved
Hide resolved
...com/google/cloud/teleport/v2/templates/failureinjectiontesting/DataStreamToSpannerCDCFT.java
Show resolved
Hide resolved
...ogle/cloud/teleport/v2/spanner/testutils/failureinjectiontesting/CDCCorrectnessTestUtil.java
Outdated
Show resolved
Hide resolved
| ExecutorService executor = Executors.newFixedThreadPool(totalRows); | ||
| try { | ||
| List<Future<?>> futures = new ArrayList<>(); | ||
|
|
||
| for (int i = 0; i < totalRows; i++) { | ||
| futures.add( | ||
| executor.submit( | ||
| () -> { | ||
| // Each thread gets its own connection to ensure thread safety | ||
| try (Connection conn = DriverManager.getConnection(jdbcUrl, username, password)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it make sense to use a connection pooling library instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A connection pool library is useful when the connection utilization is very dynamic. In our case, we get 100 (num of rows) connections at the beginning and they remain in use till the end of load generation. I used the HikariCP and checked, it is actually taking much more time and degrading the performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so our template code will perform worse than the test? is that worth poking?
| int id; | ||
| User user; | ||
| do { | ||
| id = getRandom().nextInt(1000000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is 1 million here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generating a random Id between [0, 1000000) .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i guess i followed that part. what I am trying to understand is
1/ why not 1 billion? how did we pick a million?
2/ could this be range be different for different tests?
| // If row is present, then update the row with 75% probability and delete the row with 25% | ||
| // probability | ||
| if (isRowPresent) { | ||
| if (getRandom().nextDouble() < 0.75) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1/ didnt we want to do insert/update/deletes with equal probability?
2/ given its a util, shouldnt this be a configuration that is passed to it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- If we just do Insert or update or delete randomly, Many of the inserts would be no op if the row already exists and many of the deletes would be no op if the row is already deleted, but it would still consume CPU cycles, network bandwidth etc. It doesn't make sense to have such load. insert only when the row doesn't exist, delete or update only when the row exists.
- This whole class is written very specifically for the CDC algorithm correctness IT. I don't see it to be used for other test cases and hence chose to not make it generic and config driven. For example, this class doesn't support multiple tables, ordering of events between the tables etc. Making it generic is over engineering and doesn't make sense to me. If there is a use case in future where we want to re-use this class, then we can enhance it at that time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- it makes sense to handle insert in insert and delete on delete, which IIUC we are already handling. but still unclear to me why that would mean equal probably can be used. is there a doc that walks through how the current distribution is picked?
- I was assuming this class to provide load generation capabilities for CDC tests. I am ok with just having support for one table, but even for that it makes sense to support different workload distributions (Open/Closed Principle). are there any other aspects of this class that are CDC correctness specific that I may be missing?
...ogle/cloud/teleport/v2/spanner/testutils/failureinjectiontesting/CDCCorrectnessTestUtil.java
Outdated
Show resolved
Hide resolved
...ogle/cloud/teleport/v2/spanner/testutils/failureinjectiontesting/CDCCorrectnessTestUtil.java
Outdated
Show resolved
Hide resolved
|
|
||
| // generate Load | ||
| CDCCorrectnessTestUtil testUtil = new CDCCorrectnessTestUtil(); | ||
| testUtil = new FuzzyCDCLoadGenerator(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: may want to use a better name
|
|
||
| /** Unit tests for {@link FuzzyCDCLoadGenerator}. */ | ||
| @RunWith(JUnit4.class) | ||
| public class FuzzyCDCCorrectnessTestUtilTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this reflect the class under Test?
| import org.slf4j.Logger; | ||
| import org.slf4j.LoggerFactory; | ||
|
|
||
| public class FuzzyCDCLoadGenerator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be under the main package or test package?
| user.mutateRandomly(); | ||
| user.insert(conn); | ||
| isRowPresent = true; | ||
| counter.incrementAndGet(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is this used?
No description provided.