Custom file metadata not persisted across user-defined functions in AWS S3 Multipart

I’m experiencing an issue setting custom metadata fields on a file using setFileMeta, and I’m not sure whether this behavior is a bug or an intentional constraint.

The uploader I’m using is AWS S3 Multipart. I’m only familiar with AWS S3 and AWS S3 Multipart, so I don’t know whether the behavior occurs in other uploaders. I suspect it is likelier in AWS S3 Multipart, though, because that uploader has more function options; from a skim of the docs, the other uploaders appear to have fewer.

Here’s the scenario: In createMultipartUpload, I return a URL for the file/multipart upload from our server to the client and set it as a metadata field with setFileMeta, in order to minimize the routing logic encoded directly in the client code. (Our application already does something similar for non-multipart uploads.) I expected that URL to be available in the file.meta object in subsequent function calls such as signPart, but it is not present on the file object passed to signPart (and presumably might not be present in other functions, although signPart is where the specific error occurs).

The behavior and its cause, as far as I can determine from debugging in my browser and reading the source code, are as follows:

  1. The createMultipartUpload option I set calls uppy.setFileMeta with my custom metadata field, and the custom metadata field is set in the store for that file.
  2. In uploadFile, getUploadId returns the uploadId and key returned by my custom createMultipartUpload function, as expected.
  3. uploadFile then calls uploadChunk on each chunk, which in turn calls #fetchSignature, which wraps my custom signPart function.
  4. The file argument passed to uploadChunk, though, is the same object that was passed to uploadFile originally because it has never been refetched from the store with getFile, and therefore does not have the custom metadata that I set in createMultipartUpload.
  5. The file object in the store does have the custom metadata I set, so if I call getFile in signPart, I will be able to access the custom metadata. However, it’s easy to imagine a future scenario where this might not be the case if the uploader calls setFileState at some point with the stale object metadata, although the current code does guard against this, for example, in setS3MultipartState.
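To make the steps above concrete, here is a minimal sketch of the stale-reference effect. It does not use Uppy itself: MiniStore is a hypothetical stand-in that assumes the store replaces a file object on update rather than mutating it, which is consistent with the behavior described above. The file id, the name field, and the uploadUrl value are made up for illustration.

```typescript
// Simplified stand-in for the store; an assumption for illustration, not Uppy's code.
interface FileLike {
  id: string;
  meta: Record<string, unknown>;
}

class MiniStore {
  private files: Record<string, FileLike> = {};

  addFile(file: FileLike): void {
    this.files = { ...this.files, [file.id]: file };
  }

  getFile(id: string): FileLike {
    return this.files[id];
  }

  // Builds a new file object instead of mutating the existing one,
  // so references obtained earlier never see the new metadata.
  setFileMeta(id: string, data: Record<string, unknown>): void {
    const current = this.files[id];
    this.files = {
      ...this.files,
      [id]: { ...current, meta: { ...current.meta, ...data } },
    };
  }
}

const store = new MiniStore();
store.addFile({ id: "file-1", meta: { name: "video.mp4" } });

// The uploader obtains the file once, before any user-defined function runs.
const fileAtUploadStart = store.getFile("file-1");

// The user-defined createMultipartUpload records a server-provided URL (hypothetical value).
store.setFileMeta("file-1", { uploadUrl: "/hypothetical/uploads/abc" });

// signPart later receives the object captured at upload start...
const staleUrl = fileAtUploadStart.meta.uploadUrl;
// ...whereas a fresh read from the store sees the new field.
const freshUrl = store.getFile("file-1").meta.uploadUrl;

console.log(staleUrl); // prints: undefined
console.log(freshUrl); // prints: /hypothetical/uploads/abc
```

The last two lines mirror points 4 and 5: the object captured before the metadata update lacks the field, while a fresh read by id returns it.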

So my question is, is this behavior by design, or is it a bug?

More detailed questions if the behavior is by design:

  1. Is this an appropriate use of setFileMeta and file.meta?
  2. If this is an intentional limitation of the AWS S3 Multipart uploader, can that limitation be documented explicitly with guidance to call getFile in subsequent functions?

Thanks for any guidance you can offer on this subject.

Hi there! Thanks for the report. I think it does qualify as a bug, and I agree it needs to be either fixed or, if there are good reasons behind it (maybe performance), properly documented. I’ve opened issue #4533 on transloadit/uppy, “File objects passed to AWS S3 Multipart contains outdated state information”, to track this.

Thanks so much! Appreciate the answer and the new GitHub issue.

Knowing that it’s a bug and is (likely) a relatively simple fix, if it should be fixed in code, I would potentially be willing to submit a PR. The main thing that would be useful to know to contribute a fix is if there are any particular characteristics you’d like to see in the fix (for example, implementing a wrapper for such function options, refreshing the file when functions are called by calling them with getFile(file.id) instead of file, or simply inserting a file = getFile(file.id) after each function is called). Or if you think there’s a reason, like performance, that this issue should be documented as a constraint rather than fixed via code, then I would leave that to you/other maintainers to document.
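As a rough illustration of the first option (a wrapper around the function options), here is a minimal sketch. The helper name withFreshFile, the Map-based stand-in store, and the simplified signPart signature are all assumptions made for this example; none of it is taken from the Uppy codebase or from the eventual PR.

```typescript
interface FileLike {
  id: string;
  meta: Record<string, unknown>;
}

type GetFile = (id: string) => FileLike;

// Hypothetical helper: wraps a user-defined option so that it always
// receives the current file from the store, looked up by id.
function withFreshFile<Args extends unknown[], R>(
  getFile: GetFile,
  fn: (file: FileLike, ...args: Args) => R,
): (file: FileLike, ...args: Args) => R {
  return (file, ...args) => fn(getFile(file.id), ...args);
}

// Minimal stand-in store for the demo (not Uppy's store).
const files = new Map<string, FileLike>();
files.set("file-1", { id: "file-1", meta: {} });

const getFile: GetFile = (id) => {
  const file = files.get(id);
  if (!file) throw new Error(`Unknown file: ${id}`);
  return file;
};

// Reference captured before the metadata is set.
const staleFile = getFile("file-1");

// Simulate a metadata update that replaces the stored object.
files.set("file-1", {
  ...staleFile,
  meta: { ...staleFile.meta, uploadUrl: "/hypothetical/uploads/abc" },
});

// A user-defined signPart-like function (signature simplified for the sketch).
const signPart = (file: FileLike, partNumber: number): string =>
  `${String(file.meta.uploadUrl)}/parts/${partNumber}`;

const wrappedSignPart = withFreshFile(getFile, signPart);

const unwrappedResult = signPart(staleFile, 1);
const wrappedResult = wrappedSignPart(staleFile, 1);

console.log(unwrappedResult); // prints: undefined/parts/1
console.log(wrappedResult); // prints: /hypothetical/uploads/abc/parts/1
```

The other two options amount to the same lookup done inline: passing getFile(file.id) at each call site, or reassigning file = getFile(file.id) after each user-defined function returns.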

It seems to me that all those options are equivalent, so you could use whichever you find the most readable. Looking forward to your PR then :slight_smile:

I put a PR up here: Refresh file before calling user-defined functions in AWS S3 Multipart.