Conversation
* Deleting a document moves it to the .trash folder at the root of the site storage. If there is already a document with the same name a unique suffix is added to the document name. * Deleting a folder moves all of its content to the .trash under a folder with the same name. If a folder with this name already exists in the trash a unique suffix is added. Each document moved to the trash has a new metadata property 'doc-path' with the path it was originally located at. Contributes to: #169
|
This PR will trigger a minor release when merged. |
When a file is moved (or copied) with the 'unique' collision option configured, it will be given a unique name by adding a suffix. This commit moves that suffix inside the extension rather than appending it, so for example if the move is done on page1.html to the .trash, and there is already a file called page1.html in the trash, the newly trashed document will be called page1-abfsg7h1.html
| delimiter: '/', | ||
| prefix: 'org1/site2/.trash/b/', | ||
| }) | ||
| .reply(200, Buffer.from(BUCKET_LIST_EMPTY_TRASH)); |
There was a problem hiding this comment.
why don't use you nock.listObjects() ? or nock.listFolder() ?
There was a problem hiding this comment.
You mean add listObjects() or listFolder() to the Nock implementation? Because I don't see them there at this time...
There was a problem hiding this comment.
I think you need to merge main to see them, they were added some time after you started working on your branch.
There was a problem hiding this comment.
Ah thanks! I have updated the test to use nock.listObjects() 👍
There are some existing tests that use the old approach. Maybe we should convert them in a seperate PR to avoid this one growing too big...?
tripodsan
left a comment
There was a problem hiding this comment.
i have the feeling the code get more and morel over the place. you start exporting common methods in put that are used else where.
maybe it would make sense to separate the source-bus operations from the method handlers. i.e. create a source-bus client.
I would also create a source bus "path" class, that encapsulates all the from/to key,path,name, etc methods.
I also don't like the every growing argument list for the copy operations. maybe this can be consolidated in a combined "CopyOptions" argument (including the "move" flag)
Co-authored-by: Tobias Bocanegra <tripod@bocanegra.ch>
…lass Replace individual parameters (src, info, move, opts/fnOpts, collOpts) with a single CopyOptions object for both copyDocument and copyFolder, keeping only context as a separate argument. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move getS3Key, getS3KeyFromInfo, and getDocPathFromS3Key out of utils.js into a dedicated s3-path-utils module. Add tests for all three functions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Hi @tripodsan , thanks for the comments!
I have created
Created
I have added a CopyOptions class to |
| // Trash a document. | ||
| const docName = info.rawPath.split('/').pop(); | ||
| const srcKey = getS3KeyFromInfo(info); | ||
| const newInfo = RequestInfo.clone(info, { path: `/.trash/${docName}` }); |
There was a problem hiding this comment.
is this desired? why don't you include the path somehow? otherwise, deleting common files, like index.html will end up overwriting.
There was a problem hiding this comment.
This is similar to how a Trash works in an OS environment, e.g. on Mac OS individually deleted files end up in the root of the trash, if there is already a file with that name, they get a unique suffix:
We do a similar thing. So files are not overwritten, but if one with the name already exists, it will get a unique suffix (alpha sortable to get the order in which they are deleted):
| // We do this by appending a ms timestamp to the destination key. | ||
|
|
||
| const ext = `.${destKey.split('.').pop()}`; | ||
| const destWithoutExt = destKey.slice(0, -ext.length); |
There was a problem hiding this comment.
isn't this the same as path.basename ?
There was a problem hiding this comment.
We just want the path to the file, without the extension (so we can insert the unique string). So if we had o1/s1/to.html we want o1/s1/to. With path:
path.basenamewould give me justto.htmlpath.dirnamegives meo1/s1path.extnamegives me.html
I guess I could use these to get the same, but it wouldn't be much simpler I think and I would have to importpath...
tripodsan
left a comment
There was a problem hiding this comment.
break up copyWithRetry
| * - 'unique' - append a (base-36) encoded timestamp to the destination key to make it unique. | ||
| * These timestamps are alphabetically sortable. | ||
| */ | ||
| async function copyWithRetry( |
There was a problem hiding this comment.
this is very unwieldy. I would break this:
- one function that does the retry and one function that does the copy
- the copy should be further split in all the collision handling functions
| function getUser(context) { | ||
| const email = context.attributes.authInfo?.profile?.email; | ||
|
|
||
| return email || 'anonymous'; | ||
| } |
There was a problem hiding this comment.
there is no way the authInfo is not set. aren't you looking for:
const { authInfo } = context;
return authInfo.resolveEmail() || 'anonymous';
?
There was a problem hiding this comment.
Good point - I changed this.
| const opts = { e, log }; | ||
| opts.status = e.status; | ||
| const opts = { e, log: context.log }; | ||
| opts.status = e.$metadata?.httpStatusCode; |
There was a problem hiding this comment.
@dominique-pfister since delete now doesn't do bucket.remove() any more but rather a copy, we need to go back to checking e.$metadata?.httpStatusCode am I correct?
There was a problem hiding this comment.
I don't think so: AFAICS, Bucket operations catch all errors thrown and rethrow an exception where the information in $metadata is available in the exception, e.g.:
https://github.com/adobe/helix-shared/blob/726a38139542da81fc0a1b26a414339b8ba6562b/packages/helix-shared-storage/src/storage.js#L430-L438
| } | ||
|
|
||
| if (collOpts.collision === 'overwrite') { | ||
| const { mdOpts: mdo, copyOpts } = await versionBeforeOverwrite(context, destinationKey, mdOpts); |
There was a problem hiding this comment.
It seems counterintuitive that a method called getCopyRetryOpts silently creates a version before returning.
There was a problem hiding this comment.
Good point - i've renamed it to handleCopyRetry(), hope that works?
| * @param {import('../support/RequestInfo').RequestInfo} info request info | ||
| * @return {Promise<Response>} response, status 204 if successful. | ||
| */ | ||
| export async function deleteSource(context, info) { |
There was a problem hiding this comment.
I wouldn't call it deleteSource, as it's actually a moving a source to trash, may be call it trashSource?
There was a problem hiding this comment.
I actually had this for a while and then changed it back. Let me re-apply that name change.
| const copyOptions = new CopyOptions({ | ||
| src: srcKey, info: newInfo, move: true, opts: copyOpts, collOpts: { collision: 'unique' }, | ||
| }); |
There was a problem hiding this comment.
I know this is a late remark, but I wonder whether the deletion of the source in a move shouldn't be done by this caller instead of passing it down to an implementation that looks at move and then deletes the source. Or is the protocol for copying - depending on whether it should be a move - completely different? Then may be use 2 methods instead of one?
There was a problem hiding this comment.
Thanks for the comment!
I've changed the implementation so that move is no longer a parameter (which is nice as it reduces the amount of options) and now calls copy and then delete.
One thing was a little interesting, on a copy the document gets a new document ID, but on a move this doesn't happen, so the outer copy now does this and the move calls a copy one level after.
* rename deleteSource() to trashSource() * rename getCopyRetryOpts() to handleCopyRetry()
As suggested in the feedback comment.
Each document moved to the trash has a new metadata property 'doc-path' with the path it was originally located at.
Also included here, the returned value from a copy operation now doesn't include org and site components of the paths any more, as these are not in the caller space:
So:
is now
Related Issues
Contributes to: #169