Skip to content

NMS-19648: CircleCI: Store failed result of flaky tests, improve a test case#8404

Open
mershad-manesh wants to merge 32 commits intodevelopfrom
mem/improvements
Open

NMS-19648: CircleCI: Store failed result of flaky tests, improve a test case#8404
mershad-manesh wants to merge 32 commits intodevelopfrom
mem/improvements

Conversation

@mershad-manesh
Copy link
Copy Markdown
Contributor

All Contributors

External References

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR focuses on improving CI reliability and debuggability around flaky tests by preserving failure evidence across retries, refining CircleCI smoke/itest distribution, and stabilizing a few UI component tests.

Changes:

  • Update CircleCI smoke/itest scripts to preserve failing JUnit XMLs as “flaky evidence” before retrying, and improve test splitting/logging.
  • Introduce a quarantined smoke-test-flaky CircleCI job/suite and adjust existing smoke job parallelism/config behavior.
  • Stabilize several Vue/Vitest UI tests (dialog stubbing, cleanup improvements, and debounce handling) and harden Selenium clicking via retry/obscured-element checks.

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
ui/tests/components/EventConfigurationDetail/EventConfigEventTable.test.ts Adds dialog mocking/cleanup and switches debounce verification to flush() approach.
ui/tests/components/EventConfiguration/EventConfigSourceTable.test.ts Adds stubs for dialogs, improves teardown, and updates debounce tests to use flush().
ui/tests/components/EventConfiguration/Dialog/UploadedFileRenameDialog.test.ts Mocks FeatherDialog and improves teardown by closing dialog before unmount.
ui/tests/components/EventConfiguration/Dialog/CreateEventConfigurationDialog.test.ts Mocks FeatherDialog and adjusts teardown/visibility handling to reduce flakiness.
ui/tests/components/EventConfigEventCreate/BasicInformation.test.ts Ensures fake timers are always restored via try/finally and adds afterEach timer reset.
smoke-test/src/test/java/org/opennms/smoketest/GrafanaEndpointPageIT.java Uses clickElement(By...) helper to reduce click flakiness.
smoke-test/src/main/java/org/opennms/smoketest/ui/framework/Toggle.java Switches to retrying click helper for toggles.
smoke-test/src/main/java/org/opennms/smoketest/ui/framework/Element.java Adds clickWithRetry with scroll + obscured-element detection.
smoke-test/src/main/java/org/opennms/smoketest/ui/framework/Button.java Switches to retrying click helper for buttons.
smoke-test/src/main/java/org/opennms/smoketest/selenium/AbstractOpenNMSSeleniumHelper.java Enhances clickElement with scroll + obscured-element detection.
.circleci/scripts/smoke.sh Filters ITs by suite category, simplifies docker pre-pull, and preserves failing XMLs before retries.
.circleci/scripts/itest.sh Adds node summary logging, preserves failing XMLs before retries, and increases db create threads.
.circleci/scripts/find-tests/git.py Fails fast with a clearer error when .nightly lacks parent_branch.
.circleci/scripts/find-tests/find-tests.py Fixes shebang and removes noisy debug printing.
.circleci/pyscripts/generate_main.py Reduces workflow max_auto_reruns.
.circleci/main/workflows/workflows_v2.json Adds workflow entry for quarantined flaky smoke tests.
.circleci/main/jobs/tests/smoke/smoke-test.index Registers the new smoke-test-flaky job.
.circleci/main/jobs/tests/smoke/smoke-test-sentinel.yml Increases sentinel smoke parallelism.
.circleci/main/jobs/tests/smoke/smoke-test-minion.yml Increases minion smoke parallelism.
.circleci/main/jobs/tests/smoke/smoke-test-flaky.yml Adds a quarantined flaky smoke job (allow failures, retries).
.circleci/main/jobs/tarball-assembly-only.yml Adjusts resource defaults and Maven heap settings.
.circleci/main/commands/oci/trivy-analyze.yml Removes a redundant/incorrect step from the Trivy analyze command.
.circleci/main/commands/generic/generic.yml Avoids repeated apt installs by checking package presence first.
.circleci/main/commands/executions/run-smoke-tests.yml Adds allow-failures mode and collects/stores flaky-evidence artifacts.
.circleci/main/commands/executions/run-integration-tests.yml Collects/stores flaky-evidence artifacts for integration tests.
.circleci/main/commands/executions/run-build.yml Ensures integration tests are skipped for the build-only compile step.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +758 to +773
private boolean isCenterPointObscured(final WebElement element) {
final JavascriptExecutor executor = (JavascriptExecutor)getDriver();
final Object result = executor.executeScript(
"var el = arguments[0];"
+ "if (!el) { return true; }"
+ "var rect = el.getBoundingClientRect();"
+ "if (rect.width === 0 || rect.height === 0) { return true; }"
+ "var x = rect.left + (rect.width / 2);"
+ "var y = rect.top + (rect.height / 2);"
+ "var top = document.elementFromPoint(x, y);"
+ "if (!top) { return true; }"
+ "return top !== el && !el.contains(top);",
element);

return result instanceof Boolean && (Boolean)result;
}
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isCenterPointObscured logic is now duplicated here and in ui/framework/Element.clickWithRetry(). To avoid the two implementations drifting (especially since both are meant to solve the same click-flakiness problem), consider centralizing this helper in one place and reusing it from both call sites.

Copilot uses AI. Check for mistakes.
cgorantla
cgorantla previously approved these changes Mar 30, 2026
Copy link
Copy Markdown
Contributor

@cgorantla cgorantla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 26 out of 26 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +20 to +40
| xargs grep -L 'MinionTests\|SentinelTests\|FlakyTests' \
| sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
> failsafe_classnames
;;
minion)
circleci tests glob '**/src/test/java/**/*IT*.java' \
| xargs grep -l 'MinionTests' \
| sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
> failsafe_classnames
;;
sentinel)
circleci tests glob '**/src/test/java/**/*IT*.java' \
| xargs grep -l 'SentinelTests' \
| sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
> failsafe_classnames
;;
flaky)
circleci tests glob '**/src/test/java/**/*IT*.java' \
| xargs grep -l 'FlakyTests' \
| sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
> failsafe_classnames
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

find_tests() pipes circleci tests glob into xargs grep .... When grep -l/-L finds no matches (or when the glob yields no files), grep exits non-zero and, with set -e -o pipefail, this will abort the script even though an empty test list should be valid. Consider guarding this stage (e.g., use xargs -r and/or wrap the grep stage with || true in a subshell) so failsafe_classnames can be empty without failing the job.

Suggested change
| xargs grep -L 'MinionTests\|SentinelTests\|FlakyTests' \
| sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
> failsafe_classnames
;;
minion)
circleci tests glob '**/src/test/java/**/*IT*.java' \
| xargs grep -l 'MinionTests' \
| sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
> failsafe_classnames
;;
sentinel)
circleci tests glob '**/src/test/java/**/*IT*.java' \
| xargs grep -l 'SentinelTests' \
| sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
> failsafe_classnames
;;
flaky)
circleci tests glob '**/src/test/java/**/*IT*.java' \
| xargs grep -l 'FlakyTests' \
| sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
> failsafe_classnames
| xargs -r grep -L 'MinionTests\|SentinelTests\|FlakyTests' \
| sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
> failsafe_classnames || true
;;
minion)
circleci tests glob '**/src/test/java/**/*IT*.java' \
| xargs -r grep -l 'MinionTests' \
| sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
> failsafe_classnames || true
;;
sentinel)
circleci tests glob '**/src/test/java/**/*IT*.java' \
| xargs -r grep -l 'SentinelTests' \
| sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
> failsafe_classnames || true
;;
flaky)
circleci tests glob '**/src/test/java/**/*IT*.java' \
| xargs -r grep -l 'FlakyTests' \
| sed -e 's#^.*src/test/java/\(.*\)\.java#\1#' | tr "/" "." \
> failsafe_classnames || true

Copilot uses AI. Check for mistakes.
Comment on lines +78 to +81
- store_artifacts:
when: always
path: /tmp/flaky-evidence
destination: flaky-evidence
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

store_artifacts is set to always upload /tmp/flaky-evidence, but that directory is only created on retry in the scripts. If tests pass without retries, the path may not exist and store_artifacts can fail the step/job. Ensure /tmp/flaky-evidence is always created (e.g., mkdir -p /tmp/flaky-evidence early in the job/scripts) or gate artifact storing on existence.

Copilot uses AI. Check for mistakes.
Comment on lines +76 to +79
- store_artifacts:
when: always
path: /tmp/flaky-evidence
destination: flaky-evidence
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

store_artifacts always uploads /tmp/flaky-evidence, but that directory is only created when retries happen in itest.sh. When the first attempt passes, the directory may not exist and store_artifacts can fail the step/job. Create the directory unconditionally (e.g., early in the job/script) or only run store_artifacts when it exists.

Copilot uses AI. Check for mistakes.
// Before debounce time
it('does not call store immediately on input (debounce)', () => {
wrapper.vm.onChangeSearchTerm('test')
expect(store.onChangeEventsSearchTerm).not.toHaveBeenCalled()
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This debounce test calls onChangeSearchTerm('test') but never flushes/cancels the debounced function. With real timers enabled, the pending debounce may fire after the test completes and bleed into subsequent tests (mock call counts/state). Cancel (or flush) the debounced function at the end of this test or in afterEach (e.g., onChangeSearchTerm.cancel()).

Suggested change
expect(store.onChangeEventsSearchTerm).not.toHaveBeenCalled()
expect(store.onChangeEventsSearchTerm).not.toHaveBeenCalled()
wrapper.vm.onChangeSearchTerm.cancel()

Copilot uses AI. Check for mistakes.
cgorantla
cgorantla previously approved these changes Mar 30, 2026
Copy link
Copy Markdown
Contributor

@cgorantla cgorantla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Copy Markdown
Contributor

@synqotik synqotik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, just one change to make.

element);

return result instanceof Boolean && (Boolean)result;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with this (Copilot comment), isCenterPointObscured should only appear once. I'm thinking make it a public static function in Element, which would actually look more like the implementation here. The Javascript checks if the element is null so you don't have to do that explicitly in the Java code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants