Skip to content

Epicure bsc gpu#8

Open
loganoz wants to merge 2 commits intomainfrom
epicure_bsc_gpu
Open

Epicure bsc gpu#8
loganoz wants to merge 2 commits intomainfrom
epicure_bsc_gpu

Conversation

@loganoz
Copy link
Owner

@loganoz loganoz commented Oct 30, 2025

After today's meeting, I thought that this might be a good idea. I took epicure_bsc_gpu branch directly from their fork and brought it here, without accepting the pull request to Gerasimos_gpu branch in the main folder.

@Jerryntk
Copy link
Collaborator

In the split form templates please check the consistency of the signs and the initialisation of Qdot = 0 compared to the standard code. To make everything compatible for the mu solver, we had to adjust some integral operations. This is really important. The changes made to integrate the mu solver lead to the aforementioned changes in the ns solver.

@loganoz
Copy link
Owner Author

loganoz commented Oct 30, 2025

I think we should have splitforms test cases working before accepting this. Currently only standard form is checked.

@Jerryntk
Copy link
Collaborator

I would suggest to merge first the mesh related changes and on a second step the split form changes. I was holding them off because of the mu integration.

@loganoz
Copy link
Owner Author

loganoz commented Oct 30, 2025

You think mesh related are safe then?

@Jerryntk
Copy link
Collaborator

I havent checked them all. We need to check it and understand the changes, aside from verification through testing.

@loganoz
Copy link
Owner Author

loganoz commented Oct 30, 2025

100% agree

@Jerryntk
Copy link
Collaborator

I propose that at least one from the team should check it thoroughly to be aware exactly of what the changed code does.

@loganoz loganoz requested a review from Copilot November 3, 2025 07:17
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements performance optimizations across several computational mesh processing routines, introducing algorithmic improvements, refactoring duplicated code through templates, and adding GPU acceleration support via OpenACC directives.

  • Optimizes node lookup in HOPR mesh reading by replacing O(n) linear search with O(1) hash map lookup
  • Refactors mesh partitioning to use mask-based node uniqueness detection instead of nested loops
  • Consolidates duplicated split volumetric contribution functions into a single template with preprocessor macros
  • Optimizes metric computation by reducing array dimensions and introducing local accumulator variables
  • Adds OpenACC directives for GPU acceleration of wall distance computations

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
RiemannSolvers_NS.f90 Exports two-point flux functions for use in split form template
Read_HDF5Mesh_HOPR.f90 Introduces reverse lookup map for O(1) node deduplication
MeshPartitioning.f90 Replaces nested loop node search with mask-based approach
MappedGeometry.f90 Refactors metric computation with optimized array dimensions and local accumulators
HexMesh.f90 Adds OpenACC acceleration for wall distance calculations
split_template.f90 New template file for generating specialized split form subroutines
SpatialDiscretization.f90 Replaces monolithic function with template-based approach using preprocessor directives
Comments suppressed due to low confidence (1)

Solver/src/libs/mesh/Read_HDF5Mesh_HOPR.f90:1238

  • The signature of 'InitNodeMap' is missing the 'NodeToHOPRMap' parameter that is being allocated within the subroutine body at line 1243. This parameter must be added to the subroutine signature with intent(inout) to match the usage in 'AddToNodeMap' and 'FinishNodeMap'.
   subroutine InitNodeMap (TempNodes , HOPRNodeMap, nUniqueNodes)
      implicit none
      !--------------------------------------------
      real(kind=RP), allocatable, intent(inout) :: TempNodes(:,:)
      integer      , allocatable, intent(inout) :: HOPRNodeMap(:)
      integer                   , intent(in)    :: nUniqueNodes

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

integer, allocatable :: HOPRNodeMap(:) ! Map from the global node index of HORSES3D to the global node index of HOPR
real(kind=RP), allocatable :: TempNodes(:,:) ! Nodes read from file to be exported to self % nodes
logical :: CurveCondition
integer, allocatable :: NodeToHOPRMap(:)
Copy link

Copilot AI Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable 'NodeToHOPRMap' is declared locally in 'ConstructMesh_FromHDF5File_' but is used in module-level subroutines 'InitNodeMap', 'AddToNodeMap', and 'FinishNodeMap'. It should either be declared as a module-level variable or passed explicitly as a parameter to these subroutines. The current implementation will not compile because the variable is out of scope.

Copilot uses AI. Check for mistakes.
self % faces(eID) % geom % dWall(i, j) = sqrt(minimumDistance)
end do
!$acc end parallel loop
!$acc update self(self % faces(eID) % geom % dWall(:j)) async(j)
Copy link

Copilot AI Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Syntax error in OpenACC update directive: ':j' should be ':,j' to specify the array slice correctly. The current syntax is missing the comma separator for the array dimensions.

Suggested change
!$acc update self(self % faces(eID) % geom % dWall(:j)) async(j)
!$acc update self(self % faces(eID) % geom % dWall(:,j)) async(j)

Copilot uses AI. Check for mistakes.
real(kind=RP) :: x(3)

real(kind=RP) :: jGradXi_local(NDIM), jGradEta_local(NDIM), jGradZeta_local(NDIM), jacobian_local
real(kind=RP) :: auxgrad_l1(NDIM), auxgrad_l2(NDIM), auxgrad_l3(NDIM)
Copy link

Copilot AI Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variables 'auxgrad_l1', 'auxgrad_l2', and 'auxgrad_l3' are declared but never used in the subroutine. These should be removed to avoid confusion and reduce memory allocation.

Suggested change
real(kind=RP) :: auxgrad_l1(NDIM), auxgrad_l2(NDIM), auxgrad_l3(NDIM)

Copilot uses AI. Check for mistakes.
Comment on lines +55 to +58
Q_acc = Q_acc + NodalStorage(mesh % elements(eID) % Nxyz(1)) % hatD(i,l) * mesh % elements(eID) % storage % contravariantFlux(eq,l,j,k,IX) + &
NodalStorage(mesh % elements(eID) % Nxyz(2)) % hatD(j,l) * mesh % elements(eID) % storage % contravariantFlux(eq,i,l,k,IY) + &
NodalStorage(mesh % elements(eID) % Nxyz(3)) % hatD(k,l) * mesh % elements(eID) % storage % contravariantFlux(eq,i,j,l,IZ)
end do
Copy link

Copilot AI Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Loop index mismatch: the loop variable 'l' iterates only up to 'mesh % elements(eID) % Nxyz(1)', but it's used to index contributions from all three dimensions (hatD for Ny and Nz as well). The contributions for IY and IZ directions should use separate loops with upper bounds Nxyz(2) and Nxyz(3) respectively, or the loop bound should be max(Nxyz(1), Nxyz(2), Nxyz(3)).

Suggested change
Q_acc = Q_acc + NodalStorage(mesh % elements(eID) % Nxyz(1)) % hatD(i,l) * mesh % elements(eID) % storage % contravariantFlux(eq,l,j,k,IX) + &
NodalStorage(mesh % elements(eID) % Nxyz(2)) % hatD(j,l) * mesh % elements(eID) % storage % contravariantFlux(eq,i,l,k,IY) + &
NodalStorage(mesh % elements(eID) % Nxyz(3)) % hatD(k,l) * mesh % elements(eID) % storage % contravariantFlux(eq,i,j,l,IZ)
end do
Q_acc = Q_acc + NodalStorage(mesh % elements(eID) % Nxyz(1)) % hatD(i,l) * mesh % elements(eID) % storage % contravariantFlux(eq,l,j,k,IX)
end do
!$acc loop seq
do l = 0, mesh % elements(eID) % Nxyz(2)
Q_acc = Q_acc + NodalStorage(mesh % elements(eID) % Nxyz(2)) % hatD(j,l) * mesh % elements(eID) % storage % contravariantFlux(eq,i,l,k,IY)
end do
!$acc loop seq
do l = 0, mesh % elements(eID) % Nxyz(3)
Q_acc = Q_acc + NodalStorage(mesh % elements(eID) % Nxyz(3)) % hatD(k,l) * mesh % elements(eID) % storage % contravariantFlux(eq,i,j,l,IZ)
end do
end do

Copilot uses AI. Check for mistakes.
amrueda added a commit that referenced this pull request Nov 6, 2025
…HOPR hdf5 meshes

Co-authored-by: zalbanob <171132623+zalbanob@users.noreply.github.com>
Co-authored-by: Gonzalo Rubio <g.rubio@upm.es>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants