Abstract submit and poll operations #19688

Open
wants to merge 32 commits into
base: main

@lauraneto (Contributor) commented Jul 7, 2025

Description

This pull request introduces a new LongRunningOperationService that lets you run or queue long-running operations and track their status. It also adds a repository that persists the status, plus unit and integration tests.
ContentPublishingService.PublishBranchAsync() and DatabaseCacheRebuilder.Rebuild() now use the new service instead of using IBackgroundTaskQueue directly.
A new background job, LongRunningOperationsCleanupJob, deletes operation records after some time (currently hardcoded to 1 day).

Considerations:

  • For some operations you want to make sure that only one of that type is running at a given time (example: database cache rebuild). The new service implements this in a distributed and thread-safe way, via the type and allowConcurrentExecution parameters of Run().
  • Some of the existing operations had the option to run in the background or not. Even when they do not run in the background, we might still need to know whether they are already running, so that case is also covered by the new service.
  • Some of the existing operations return a result when they finish. The result therefore has to be saved in a shared place (the database) so that different servers can access it. This was also implemented.
  • Each long-running operation has an expiration, after which it can be marked as stale (in case a server goes down without updating the status).
  • Warning: To track the status of operations that don't run in the background reliably, they cannot be triggered inside a scope, because in SQLite this would cause a deadlock (updating the status of the operation would be blocked by the initial scope/transaction). This seems a reasonable restriction for a long operation, so anyone who attempts it will get an exception.
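The single-run-per-type rule and the stale expiration described above can be illustrated with a small in-memory sketch. This is not the code from this PR: the PR persists state in the database so that it works across servers, whereas the sketch uses a dictionary and a lock. The names SketchStatus, SketchOperation and InMemoryOperationTracker are hypothetical, and the Success, Failed and Stale status values, as well as the idea that a stale operation no longer blocks new ones, are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for illustration only; not the types added by this PR.
public enum SketchStatus { Enqueued, Running, Success, Failed, Stale }

public sealed record SketchOperation(Guid Id, string Type, SketchStatus Status, DateTimeOffset ExpiresAt);

public sealed class InMemoryOperationTracker
{
    private readonly object _lock = new();
    private readonly Dictionary<Guid, SketchOperation> _operations = new();

    // Returns the new operation id, or null when another operation of the same
    // type is still active and concurrent execution is not allowed.
    public Guid? TryRegister(string type, bool allowConcurrentExecution, TimeSpan expiryTimeout, DateTimeOffset now)
    {
        lock (_lock)
        {
            MarkExpiredAsStale(now);

            bool sameTypeActive = _operations.Values.Any(o => o.Type == type && IsActive(o));
            if (sameTypeActive && !allowConcurrentExecution)
            {
                return null;
            }

            var operation = new SketchOperation(Guid.NewGuid(), type, SketchStatus.Enqueued, now + expiryTimeout);
            _operations[operation.Id] = operation;
            return operation.Id;
        }
    }

    // Returns null for an unknown id, mirroring a nullable status lookup.
    public SketchStatus? GetStatus(Guid id, DateTimeOffset now)
    {
        lock (_lock)
        {
            MarkExpiredAsStale(now);
            return _operations.TryGetValue(id, out var op) ? op.Status : (SketchStatus?)null;
        }
    }

    private static bool IsActive(SketchOperation op) =>
        op.Status is SketchStatus.Enqueued or SketchStatus.Running;

    // An active operation whose expiration has passed is assumed to belong to a
    // server that went down, so it is marked stale and stops blocking new runs.
    private void MarkExpiredAsStale(DateTimeOffset now)
    {
        foreach (var op in _operations.Values.ToList())
        {
            if (IsActive(op) && op.ExpiresAt <= now)
            {
                _operations[op.Id] = op with { Status = SketchStatus.Stale };
            }
        }
    }
}
```

In this sketch a second TryRegister call for the same type returns null while the first operation is active, and succeeds again once the first one has expired and been marked stale.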

Methods:

  • Run() - runs or schedules a task/operation.
    • The method accepts:
      • type - type of operation, mostly relevant when allowConcurrentExecution is false, but also used to make sure that the type matches when an operation is queried.
      • operation - the task to run (Task when there is no result, Task<T> when a result is returned, which we need to store).
      • runInBackground - whether to queue the operation in the background, or run it and wait for the result before returning.
      • allowConcurrentExecution - whether an operation may be run or queued while another of the same type is already enqueued or running. For example, a database cache rebuild should not allow this.
Task<Attempt<Guid, LongRunningOperationEnqueueStatus>> Run(
        string type,
        Func<CancellationToken, Task> operation,
        bool allowConcurrentExecution = false,
        bool runInBackground = true,
        TimeSpan? expiryTimeout = null);

Task<Attempt<Guid, LongRunningOperationEnqueueStatus>> Run<T>(
        string type,
        Func<CancellationToken, Task<T>> operation,
        bool allowConcurrentExecution = false,
        bool runInBackground = true,
        TimeSpan? expiryTimeout = null);
  • GetStatus() - gets the status of an operation by id.
Task<LongRunningOperationStatus?> GetStatus(Guid operationId);
  • GetByType() - gets a list of operations by type. Accepts a list of statuses to filter by; defaults to Enqueued or Running.
IEnumerable<LongRunningOperation> GetByType(string type, LongRunningOperationStatus[] statuses);
  • GetResult() - gets the result of an operation.
Task<Attempt<TResult?, LongRunningOperationResultStatus>> GetResult<TResult>(string type, Guid operationId);
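As a rough illustration of the submit-and-poll pattern these methods enable, a caller could poll a status lookup until the operation is no longer Enqueued or Running. The helper below is a minimal sketch and not part of this PR: it takes the lookup as a delegate so that it does not depend on the real service types, and PolledStatus and OperationPolling are hypothetical names. Only Enqueued and Running are named in the description; the other status values are assumptions.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical status values; only Enqueued and Running appear in the PR description.
public enum PolledStatus { Enqueued, Running, Success, Failed, Stale }

public static class OperationPolling
{
    // Polls getStatus until the operation leaves Enqueued/Running.
    // A null status (unknown operation id) also ends the loop and is returned as-is.
    public static async Task<PolledStatus?> WaitForCompletionAsync(
        Func<Guid, Task<PolledStatus?>> getStatus,
        Guid operationId,
        TimeSpan pollInterval,
        CancellationToken cancellationToken = default)
    {
        while (true)
        {
            PolledStatus? status = await getStatus(operationId);
            if (status is not (PolledStatus.Enqueued or PolledStatus.Running))
            {
                return status;
            }

            // Wait before asking again; cancellation surfaces as an exception here.
            await Task.Delay(pollInterval, cancellationToken);
        }
    }
}
```

With the real service, the delegate would wrap the GetStatus(Guid operationId) call shown above, and a successful outcome would be followed by GetResult<TResult>() for operations that return a value.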

TODOs

  • Check how to handle IndexingRebuilderService.TryRebuild - will be done in a separate PR

lauraneto added 19 commits July 3, 2025 11:52
…sks to use this service

This service will manage operations that require status to be synced between servers (load balanced setup).
This is both async and returns an attempt, which will fail if a rebuild operation is already running.
…on-background operations.

Storing an expiration date allows setting different expiration times depending on the type of operation, and whether it is running in the background or not.
… expiration and deletion in `LongRunningOperationRepository.CleanOperations`.
…ons-in-a-LB-friendly-way

# Conflicts:
#	src/Umbraco.Core/DependencyInjection/UmbracoBuilder.cs
@lauraneto lauraneto marked this pull request as ready for review July 9, 2025 17:23
@Copilot Copilot AI review requested due to automatic review settings July 9, 2025 17:23

@Copilot Copilot AI left a comment

Pull Request Overview

This PR abstracts long-running operations into a unified service, replacing direct background-queue usage in key services, and adds infrastructure to persist, track, and clean up those operations.

  • Introduce LongRunningOperationService and its repository for enqueueing, tracking status, and storing results
  • Refactor ContentPublishingService and DatabaseCacheRebuilder to use the new service
  • Add cleanup job, migrations, DI registrations, and comprehensive unit/integration tests

Reviewed Changes

Copilot reviewed 36 out of 36 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/Umbraco.Core/Services/LongRunningOperationService.cs | Core service to enqueue, run, and track long-running operations |
| src/Umbraco.Infrastructure/Persistence/Repositories/Implement/LongRunningOperationRepository.cs | Repository persisting operation metadata, status, and results |
| src/Umbraco.Core/Services/ContentPublishingService.cs | Refactored branch publish to use LongRunningOperationService |
| src/Umbraco.PublishedCache.HybridCache/DatabaseCacheRebuilder.cs | Refactored rebuild logic to use LongRunningOperationService |
| src/Umbraco.Infrastructure/BackgroundJobs/Jobs/LongRunningOperationsCleanupJob.cs | Recurring job deleting stale operations after one day |
| src/Umbraco.Core/DependencyInjection/UmbracoBuilderExtensions.cs | Register the cleanup job in recurring background jobs |
| src/Umbraco.Core/DependencyInjection/UmbracoBuilder.Repositories.cs | Register new ILongRunningOperationRepository |
| src/Umbraco.Core/DependencyInjection/UmbracoBuilder.cs | Register ILongRunningOperationService |
| tests/Umbraco.Tests.UnitTests/Umbraco.Core/Services/LongRunningOperationServiceTests.cs | Unit tests for LongRunningOperationService |
| tests/Umbraco.Tests.Integration/Umbraco.Infrastructure/Persistence/Repositories/LongRunningOperationRepositoryTests.cs | Integration tests for repository behavior |

Comments suppressed due to low confidence (2)

src/Umbraco.Core/Services/ILongRunningOperationService.cs:20

  • The default value for allowConcurrentExecution is false here but is true in the implementation; aligning these defaults will prevent surprising behavior when callers omit this argument.
    Task<Attempt<Guid, LongRunningOperationEnqueueStatus>> Run(

src/Umbraco.Web.Common/DependencyInjection/UmbracoBuilderExtensions.cs:193

  • [nitpick] There's an extra space before this line relative to others in the block; aligning indentation improves readability and consistency.
        builder.Services.AddRecurringBackgroundJob<LongRunningOperationsCleanupJob>();

@AndyButland AndyButland left a comment

Looks great so far @lauraneto, really impressive. I've tested out the happy paths and they all work as expected. I know you are still working on a few things but will share the points I've found in the code review now.

/// <summary>
/// Represents a repository for managing long-running operations.
/// </summary>
public interface ILongRunningOperationRepository

Contributor

We could, and probably should, use async methods here. I know we don't in many repositories, but in the newer ones - e.g. IWebhookRepository - we do. I'd suggest reviewing that interface and aligning the method names, return types and use of async for this one.

We should also align how collections are retrieved, so that we get paged results.

Contributor Author

Adjusted!
Also added Async as a suffix to the methods, but only in the repository. Should that also be done for the rest?

Contributor

Yes, I think so. Again I'm only going by looking at what's been done for more recent services and repositories, and I see IWebhookService has Async suffixes.

If and when we get to a point where most things you expect to be async are, then you could argue these suffixes are superfluous, but we are some way from that, so for now I'd suggest aligning as closely as possible with the newer instances.

Contributor Author

Adjusted!

}

/// <inheritdoc />
public TimeSpan Period => TimeSpan.FromMinutes(2);

Contributor

Perhaps we should have configuration for some of these options on when the job runs, and in particular how many days back to clean up. See for example CacheInstructionsPruningJob and HealthCheckNotifierJob.

Contributor Author

Adjusted in 7c572e5 but not sure if any other changes are needed.

private readonly TimeProvider _timeProvider;
private readonly ILogger<LongRunningOperationService> _logger;

private readonly TimeSpan _timeToWaitBetweenBackgroundTaskStatusChecks = TimeSpan.FromSeconds(10);

Contributor

Perhaps we should have configuration for this, with these being the defaults if no configuration is provided?

Contributor Author

Also adjusted in 7c572e5.
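
For readers following the two configuration suggestions above, one possible shape is an options class whose property initializers carry the previously hardcoded values as defaults. This is only a sketch: the class and property names are hypothetical and are not taken from commit 7c572e5. The default values come from this conversation (a 2 minute job period, a 10 second wait between status checks) and from the description (cleanup after 1 day).

```csharp
using System;

// Hypothetical options class; names are illustrative, not those introduced in 7c572e5.
public sealed class LongRunningOperationsSketchOptions
{
    // How often the cleanup job runs (the snippet above hardcodes 2 minutes).
    public TimeSpan CleanupPeriod { get; set; } = TimeSpan.FromMinutes(2);

    // How old an operation record may get before the cleanup job deletes it
    // (the description mentions a hardcoded 1 day).
    public TimeSpan MaxEntryAge { get; set; } = TimeSpan.FromDays(1);

    // How long to wait between status checks of a background task
    // (the snippet above hardcodes 10 seconds).
    public TimeSpan TimeBetweenStatusChecks { get; set; } = TimeSpan.FromSeconds(10);
}
```

Keeping the defaults on the properties means behavior stays the same when no configuration is provided.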

@lauraneto lauraneto force-pushed the v16/feature/abstract-submit-and-poll-operations-in-a-LB-friendly-way branch from e2a2bb6 to f3c41e4 Compare July 11, 2025 17:06
2 participants