-
Notifications
You must be signed in to change notification settings - Fork 3
improved performance of racm.nextJobsRestricted UDF. using table vari… #83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…able iso common table expressions in single table. somehow MSSQL started performing very bad, even if only ~200 queries were pending.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR improves the performance of the racm.nextJobsRestricted UDF by replacing Common Table Expressions (CTEs) with table variables to address performance degradation in MSSQL when handling ~200 pending queries. The changes have already been deployed to the production SciServer@JHU RACM database.
- Replaced CTEs with table variables (
@prevand@pending) to optimize MSSQL query execution - Maintained the same functional logic for job queue management and user restrictions
- Preserved the final CTE for the result set calculation
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @@ -0,0 +1,71 @@ | |||
| CREATE or ALTER function [racm].[nextJobsRestricted](@compmUUID varchar(64), @timeout real, | |||
| @interval real, @maxNum integer,@maxPerUser integer) | |||
Copilot
AI
Nov 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent spacing in parameter list. Missing space after comma between @maxNum integer and @maxPerUser integer.
| @interval real, @maxNum integer,@maxPerUser integer) | |
| @interval real, @maxNum integer, @maxPerUser integer) |
| , sum(case when dj.finishedTime is null then @timeout -- should really be cast(datediff(second,startedTime, getDate()) as float) | ||
| else cast(datediff(second,dj.startedTime, dj.finishedtime) as float) | ||
| end ) as totTime | ||
| , sum((case when dj.finishedTime is null then @timeout -- should really be cast(datediff(second,startedTime, getDate()) as float) |
Copilot
AI
Nov 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment indicates that @timeout should be replaced with cast(datediff(second,startedTime, getDate()) as float) for running jobs. This TODO comment suggests incomplete implementation. Consider implementing the correct logic or removing the comment if the current approach is intentional.
| , sum(case when dj.finishedTime is null then @timeout -- should really be cast(datediff(second,startedTime, getDate()) as float) | |
| else cast(datediff(second,dj.startedTime, dj.finishedtime) as float) | |
| end ) as totTime | |
| , sum((case when dj.finishedTime is null then @timeout -- should really be cast(datediff(second,startedTime, getDate()) as float) | |
| , sum(case when dj.finishedTime is null then cast(datediff(second, dj.startedTime, getDate()) as float) | |
| else cast(datediff(second,dj.startedTime, dj.finishedtime) as float) | |
| end ) as totTime | |
| , sum((case when dj.finishedTime is null then cast(datediff_big(millisecond, dj.startedTime, getDate()) as float) |
| , sum((case when dj.finishedTime is null then @timeout -- should really be cast(datediff(second,startedTime, getDate()) as float) | ||
| else cast(datediff_big(millisecond,dj.startedTime, dj.finishedtime) as float) | ||
| end )/datediff_big(millisecond,dj.startedTime,@currentDate)) as usageWeight | ||
| from compm c , job dj |
Copilot
AI
Nov 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using old-style comma-separated JOIN syntax instead of explicit INNER JOIN. Modern SQL should use INNER JOIN with an ON clause for better readability and to avoid accidental Cartesian products.
| insert into @pending | ||
| select dj.id, dj.submitterId, dj.computeDomainId, submitTime | ||
| , rank() over (partition by submitterId order by submitTime) as rankSubmitted | ||
| from compm c , job dj |
Copilot
AI
Nov 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using old-style comma-separated JOIN syntax instead of explicit INNER JOIN. Modern SQL should use INNER JOIN with an ON clause for better readability and to avoid accidental Cartesian products.
components/java/racm/src/main/resources/db/migration/V34__Alter_nextJobsRestricted_UDF.sql
Outdated
Show resolved
Hide resolved
| CREATE or ALTER function [racm].[nextJobsRestricted](@compmUUID varchar(64), @timeout real, | ||
| @interval real, @maxNum integer,@maxPerUser integer) |
Copilot
AI
Nov 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Inconsistent spacing in function definition. The parameters should follow consistent formatting - there's a space before the comma on line 1 but not on lines 2-3.
| CREATE or ALTER function [racm].[nextJobsRestricted](@compmUUID varchar(64), @timeout real, | |
| @interval real, @maxNum integer,@maxPerUser integer) | |
| CREATE or ALTER function [racm].[nextJobsRestricted](@compmUUID varchar(64),@timeout real, | |
| @interval real,@maxNum integer,@maxPerUser integer) |
| , sum(case when dj.finishedTime is null then @timeout -- should really be cast(datediff(second,startedTime, getDate()) as float) | ||
| else cast(datediff(second,dj.startedTime, dj.finishedtime) as float) | ||
| end ) as totTime | ||
| , sum((case when dj.finishedTime is null then @timeout -- should really be cast(datediff(second,startedTime, getDate()) as float) |
Copilot
AI
Nov 21, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate TODO comment suggesting the same implementation change as line 25. This duplicated logic and comment should be consolidated or clarified.
| , sum(case when dj.finishedTime is null then @timeout -- should really be cast(datediff(second,startedTime, getDate()) as float) | |
| else cast(datediff(second,dj.startedTime, dj.finishedtime) as float) | |
| end ) as totTime | |
| , sum((case when dj.finishedTime is null then @timeout -- should really be cast(datediff(second,startedTime, getDate()) as float) | |
| -- TODO: For both totTime and usageWeight, should really use cast(datediff(second,startedTime, getDate()) as float) instead of @timeout when dj.finishedTime is null | |
| , sum(case when dj.finishedTime is null then @timeout | |
| else cast(datediff(second,dj.startedTime, dj.finishedtime) as float) | |
| end ) as totTime | |
| , sum((case when dj.finishedTime is null then @timeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @@ -0,0 +1,70 @@ | |||
| CREATE or ALTER function [racm].[nextJobsRestricted](@compmUUID varchar(64), @timeout real, | |||
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SQL statement starts with "CREATE or ALTER" which should be "CREATE OR ALTER" (uppercase OR for consistency with other SQL keywords in the statement).
| CREATE or ALTER function [racm].[nextJobsRestricted](@compmUUID varchar(64), @timeout real, | |
| CREATE OR ALTER function [racm].[nextJobsRestricted](@compmUUID varchar(64), @timeout real, |
| declare @prev table (submitterId bigint, numJobs integer, numQueued integer, | ||
| numStarted integer, totTime float, usageWeight float) |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing semicolon after the table variable declaration. While T-SQL allows this in some contexts, it's best practice to terminate statements with semicolons for consistency and clarity, especially when multiple statements follow.
| , sum((case when dj.finishedTime is null then @timeout | ||
| else cast(datediff_big(millisecond,dj.startedTime, dj.finishedtime) as float) | ||
| end )/datediff_big(millisecond,dj.startedTime,@currentDate)) as usageWeight |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The usageWeight calculation has a potential division by zero issue. When dj.startedTime equals @CurrentDate (which could happen if a job just started), the datediff_big will return 0, causing a divide-by-zero error. Additionally, if startedTime is greater than @CurrentDate (future date), the result would be negative. Consider adding validation to ensure startedTime is less than @CurrentDate and the difference is greater than zero before performing the division.
| , sum((case when dj.finishedTime is null then @timeout | |
| else cast(datediff_big(millisecond,dj.startedTime, dj.finishedtime) as float) | |
| end )/datediff_big(millisecond,dj.startedTime,@currentDate)) as usageWeight | |
| , sum( | |
| case | |
| when datediff_big(millisecond, dj.startedTime, @currentDate) > 0 | |
| then (case when dj.finishedTime is null then @timeout | |
| else cast(datediff_big(millisecond,dj.startedTime, dj.finishedtime) as float) | |
| end ) / cast(datediff_big(millisecond,dj.startedTime,@currentDate) as float) | |
| else 0 | |
| end | |
| ) as usageWeight |
| from compm c | ||
| join job dj | ||
| on dj.computeDomainId=c.computeDomainId | ||
| and dj.status > 1 -- PENDING |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment indicates filtering for jobs NOT in PENDING status (status > 1), but the actual logic seems inverted. If status = 1 means PENDING (as indicated by line 49), then status > 1 would correctly exclude PENDING jobs. However, the comment says "-- PENDING" which is misleading as it suggests these ARE pending jobs when they are NOT. Consider clarifying the comment to say "-- NOT PENDING" or "-- EXCLUDING PENDING" to match the actual filter logic.
| and dj.status > 1 -- PENDING | |
| and dj.status > 1 -- NOT PENDING |
| group by submitterid | ||
|
|
||
| declare @pending table (id bigint, submitterId bigint, computeDomainId bigint, | ||
| submitTime datetime, rankSubmitted integer) |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing semicolon after the table variable declaration. While T-SQL allows this in some contexts, it's best practice to terminate statements with semicolons for consistency and clarity.
| submitTime datetime, rankSubmitted integer) | |
| submitTime datetime, rankSubmitted integer); |
| insert into @rt | ||
| select * -- RETURNS: jobId,submitterId, computeDomainId, submitTime , usageWeight, numQueued, numStarted, ranking | ||
| from final | ||
| where ranking <= @maxNum |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing semicolon after the INSERT INTO statement. Adding a semicolon here would improve consistency and follow T-SQL best practices for statement termination.
| where ranking <= @maxNum | |
| where ranking <= @maxNum; |
…able iso common table expressions in single table. somehow MSSQL started performing very bad, even if only ~200 queries were pending.
The function has been updated inside SciServer@JHU RACM database already.