Skip to content

Fix MSITE-1033: AbstractDeployMojo.getTopLevelProject() returns wrong project for SCM URLs#7

Open
Copilot wants to merge 6 commits intomasterfrom
copilot/fix-gettoplevelproject-bug
Open

Fix MSITE-1033: AbstractDeployMojo.getTopLevelProject() returns wrong project for SCM URLs#7
Copilot wants to merge 6 commits intomasterfrom
copilot/fix-gettoplevelproject-bug

Conversation

Copy link

Copilot AI commented Oct 19, 2025

Problem

The getTopLevelProject() method in AbstractDeployMojo incorrectly treats all SCM URLs as pointing to the same site, causing it to return the wrong top-level project when a project hierarchy uses different SCM URLs.

For example, consider two projects:

  • Child project: scm:git:git@github.com:codehaus-plexus/plexus-sec-dispatcher.git/
  • Parent project: scm:git:https://github.com/codehaus-plexus/plexus-pom.git/

These clearly point to different repositories, but the method incorrectly identifies them as the same site because SCM URLs are opaque URIs. When parsed as URIs, they only expose the scheme ("scm"), while host and port are both null, making all SCM URLs appear identical.

Solution

This PR fixes the issue by:

  1. Adding maven-scm-api dependency to properly parse SCM URLs using ScmUrlUtils.getProviderSpecificPart()

  2. Creating an extractComparableUrl() helper method that:

    • Detects SCM URLs (starting with scm:)
    • Extracts the provider-specific part (e.g., scm:git:https://github.com/user/repo.githttps://github.com/user/repo.git)
    • Handles SCP-like Git syntax (e.g., git@github.com:user/repo.git) by converting it to a comparable format (ssh://github.com/user/repo.git)
    • Normalizes SVN URLs by converting https:// to http:// scheme for proper comparison (since SVN is a hierarchical VCS where the same repository can be accessed via both protocols)
    • Returns the original URL unchanged for non-SCM URLs
    • Made package-private (not private) to enable direct testing
  3. Updating getTopLevelProject() to use the extracted comparable URLs for site comparison

Examples

Before this fix:

// These would incorrectly be considered the same site:
String url1 = "scm:git:git@github.com:codehaus-plexus/plexus-sec-dispatcher.git/";
String url2 = "scm:git:https://github.com/codehaus-plexus/plexus-pom.git/";
// Both parsed as scheme="scm", host=null, port=-1

After this fix:

// Git URLs - correctly recognized as different sites:
extractComparableUrl(url1) → "ssh://github.com/codehaus-plexus/plexus-sec-dispatcher.git/"
extractComparableUrl(url2) → "https://github.com/codehaus-plexus/plexus-pom.git/"
// Now properly compared using their actual repository URLs

// SVN URLs - correctly normalized to same scheme:
extractComparableUrl("scm:svn:http://svn.apache.org/repos/asf/maven/website")
  → "http://svn.apache.org/repos/asf/maven/website"
extractComparableUrl("scm:svn:https://svn.apache.org/repos/asf/maven/website/components")
  → "http://svn.apache.org/repos/asf/maven/website/components"
// Both normalized to http:// allowing URIPathDescriptor.sameSite() to recognize them as the same site

Testing

Added comprehensive unit tests in AbstractDeployMojoTest covering:

  • Different SCM repositories with SCP syntax (correctly identified as different sites)
  • Same SCM repositories (correctly identified as the same site)
  • Different HTTPS SCM repositories with different domains (correctly identified as different sites)
  • Non-SCM URLs (existing behavior preserved)
  • SVN URLs with different schemes (http vs https) and subpaths (correctly normalized and identified as the same site)

All tests pass (10 total, including 5 new tests).

Security

  • No security vulnerabilities in the new maven-scm-api dependency (version 2.1.0)
  • CodeQL analysis shows no security issues introduced

Backward Compatibility

Fully maintained - non-SCM URLs continue to work exactly as before.

Fixes apache#1033

Original prompt

This section details on the original issue you should resolve

<issue_title>[MSITE-1033] AbstractDeployMojo.getTopLevelProject(Project) returns wrong project for SCM URLs</issue_title>
<issue_description>### Affected version

.

Bug description

The method getTopLevelProject(project) (

maven-site-plugin/src/main/java/org/apache/maven/plugins/site/deploy/AbstractDeployMojo.java

Line 508 in 5a0e56d

protected MavenProject getTopLevelProject(MavenProject project) throws MojoExecutionException {
) returns the wrong project when all projects in the hierarchy have a "DistributionManagement->Site->URL" leveraging the SCM URL format: https://maven.apache.org/scm/scm-url-format.html (e.g. scm:git:git@github.com:codehaus-plexus/plexus-sec-dispatcher.git/). The method internally relies on https://github.com/apache/maven-doxia-sitetools/blob/9af3a745919525481e96bb4f0c18d62cf41446b4/doxia-site-model/src/main/java/org/apache/maven/doxia/site/inheritance/URIPathDescriptor.java#L191 which returns
true if ... shares the same scheme, host and port with the given URI.

For example https://github.com/codehaus-plexus/plexus-sec-dispatcher/blob/master/pom.xml inherits from https://github.com/codehaus-plexus/plexus-pom/blob/master/pom.xml but for sure both projects don't share a common site. However the latter is returned as top project as

scm:git:https://github.com/codehaus-plexus/plexus-pom.git/ and scm:git:git@github.com:codehaus-plexus/plexus-sec-dispatcher.git/ are incorrectly assumed to be the same because this URI is being parsed as

host = null
scheme = "scm"
port = -1
(this is an opaque URI according to https://docs.oracle.com/javase/8/docs/api/java/net/URI.html#isOpaque--)

Issue Links:

MSITE-600 site plugin 3.0 does not permit a child to fully override parent site deployment URL

MSITE-1032 AbstractDeployMojo.getDeployModuleDirectory() does not work for Git SCM URLs

Remote Links:

apache#227

Any idea here? For Git I would say different SCM URL always means nothing is shared. For SVN this used to be different (as SVN is hierarchical while Git is not).

Probably for SCM URLs extracting the provider specific part via https://github.com/apache/maven-scm/blob/master/maven-scm-api/src/main/java/org/apache/maven/scm/provider/ScmUrlUtils.java and then treat that part as URI would make sense. In addition the path need be considered as well. If two paths are siblings, then they don't share a site.</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes #6


💬 Share your feedback on Copilot coding agent for the chance to win a $200 gift card! Click here to start the survey.

Copilot AI changed the title [WIP] Fix getTopLevelProject method for SCM URLs Fix MSITE-1033: AbstractDeployMojo.getTopLevelProject() returns wrong project for SCM URLs Oct 19, 2025
Copilot AI requested a review from slachiewicz October 19, 2025 13:38
@slachiewicz slachiewicz added the bug Something isn't working label Oct 19, 2025
@slachiewicz slachiewicz marked this pull request as ready for review October 19, 2025 13:42
@slachiewicz
Copy link
Owner

@copilot different scheme (encapsulated in SCM URL) and subpath should be equal
assertEquals(
AbstractDeployMojo.extractComparableUrl("scm:svn:http://svn.apache.org/repos/asf/maven/website"),
AbstractDeployMojo.extractComparableUrl(
"scm:svn:https://svn.apache.org/repos/asf/maven/website/components"));

Copy link
Author

Copilot AI commented Oct 19, 2025

@copilot different scheme (encapsulated in SCM URL) and subpath should be equal
assertEquals(
AbstractDeployMojo.extractComparableUrl("scm:svn:http://svn.apache.org/repos/asf...

Done in commit b73fc21. Made extractComparableUrl package-private and added SVN URL scheme normalization (https → http) to enable proper comparison. Added comprehensive test that verifies SVN URLs with different schemes and subpaths are correctly recognized as pointing to the same site.

slachiewicz and others added 5 commits October 19, 2025 17:05
…roject

Co-authored-by: slachiewicz <6705942+slachiewicz@users.noreply.github.com>
Co-authored-by: slachiewicz <6705942+slachiewicz@users.noreply.github.com>
- Changed extractComparableUrl from private to package-private for testing
- Added scheme normalization for SVN URLs (https -> http) to enable proper comparison
- Added comprehensive test for SVN URLs with different schemes and subpaths
- Test verifies that SVN URLs with http vs https are recognized as same site

Co-authored-by: slachiewicz <6705942+slachiewicz@users.noreply.github.com>
@slachiewicz slachiewicz force-pushed the copilot/fix-gettoplevelproject-bug branch from 1bff0e2 to dbbb8f9 Compare October 19, 2025 15:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[MSITE-1033] AbstractDeployMojo.getTopLevelProject(Project) returns wrong project for SCM URLs [MSITE-927] Upgrade Parent to 39

2 participants