Tusd + Azure: changeFileInfo.ID with folder support

I was able to use changeFileInfo.ID in the pre-create hook to change the name of the uploaded file.

I tried adding a folder structure into the ID and it did not work. e.g.
changeFileInfo.ID = "foldername/filename.ext";

Should this work? Can support be added for it? Is there a way for me to do it on my own?

I know the generally accepted way to implement a custom file name is via the post-create hook and to move/rename the file directly in the storage. In Azure, this requires a copy + delete of the file … there is no “move” or “rename” method. The copy can be an expensive operation for large files and we’d like to avoid this solution if possible.

There is no intentional limitation that should prevent this from working in tusd. What do you mean exactly by “it did not work”? Did an error appear? Please provide more details.

These are the tusd logs:

2023-10-24 10:59:26 2023/10/24 14:59:26.086512 level=INFO event=RequestIncoming method=OPTIONS path="" requestId=""
2023-10-24 10:59:26 2023/10/24 14:59:26.086591 level=INFO event=ResponseOutgoing method=OPTIONS path="" requestId="" status=200 body=""
2023-10-24 10:59:26 2023/10/24 14:59:26.090195 level=INFO event=RequestIncoming method=POST path="" requestId=""
2023-10-24 10:59:26 2023/10/24 14:59:26.090325 level=DEBUG event=HookInvocationStart type=pre-create id=""
2023-10-24 10:59:27 2023/10/24 14:59:27.203461 level=DEBUG event=HookInvocationFinish type=pre-create id=""
2023-10-24 10:59:27 2023/10/24 14:59:27.210913 level=INFO event=UploadCreated method=POST path="" requestId="" id=foldername/filename.jpg id=foldername/filename.jpg size=246026 url=http://localhost:44333/files/foldername/filename.jpg
2023-10-24 10:59:27 2023/10/24 14:59:27.210941 level=INFO event=ResponseOutgoing method=POST path="" requestId="" id=foldername/filename.jpg status=200 body=""
2023-10-24 10:59:27 2023/10/24 14:59:27.219365 level=INFO event=RequestIncoming method=OPTIONS path=foldername/filename.jpg requestId=""
2023-10-24 10:59:27 2023/10/24 14:59:27.219410 level=INFO event=ResponseOutgoing method=OPTIONS path=foldername/filename.jpg requestId="" status=200 body=""
2023-10-24 10:59:27 2023/10/24 14:59:27.221134 level=INFO event=RequestIncoming method=PATCH path=foldername/filename.jpg requestId=""

This is the error in Uppy:

tus: unexpected response while uploading chunk, originated from request (method: PATCH, url: http://localhost:44333/files/foldername/filename.jpg, response code: 404, response text: 404 page not found\n, request id: n/a)

The initial POST to tusd works fine, and returns 200 and the correct Location response header.

Request:
POST /files HTTP/1.1
...

Response:
HTTP/1.1 200 OK
...
Location: http://localhost:44333/files/foldername/filename.jpg
Tus-Resumable: 1.0.0

But the following PATCH request fails:

Request:
PATCH /files/foldername/filename.jpg HTTP/1.1
...
Content-Length: 246026
Content-Type: application/offset+octet-stream
Host: localhost:44333
...
Tus-Resumable: 1.0.0
Upload-Offset: 0
...

Response:
HTTP/1.1 404 Not Found
...
Tus-Resumable: 1.0.0
...

And my Azurite log looks like:

127.0.0.1 - - [24/Oct/2023:14:59:27 +0000] "PUT /uploads/foldername/filename.jpg.info?timeout=61 HTTP/1.1" 201 -

I think the problem lies in pkg/handler/unrouted_handler.go

reExtractFileID  = regexp.MustCompile(`([^/]+)\/?$`)
...
result := reExtractFileID.FindStringSubmatch(url)

This regex will only capture the last part of the path … the file name.
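To make the behaviour concrete, here is a minimal Go sketch that runs the pattern quoted above against the PATCH path from the logs. Only the regex comes from tusd; `lastSegment` is a hypothetical helper written for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as quoted above from pkg/handler/unrouted_handler.go.
var reExtractFileID = regexp.MustCompile(`([^/]+)\/?$`)

// lastSegment is a hypothetical helper (not part of tusd). It returns the
// first capture group, or "" when the pattern does not match.
func lastSegment(url string) string {
	result := reExtractFileID.FindStringSubmatch(url)
	if len(result) != 2 {
		return ""
	}
	return result[1]
}

func main() {
	// The folder part of the ID is dropped, so the lookup uses the wrong ID.
	fmt.Println(lastSegment("/files/foldername/filename.jpg")) // filename.jpg
	// A trailing slash is tolerated by the optional \/? at the end.
	fmt.Println(lastSegment("/files/filename.jpg/")) // filename.jpg
}
```

Since the upload was stored as foldername/filename.jpg but the handler looks for filename.jpg, a 404 on the PATCH request is the expected outcome.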

What we’d want to do is strip the base url, ending in a slash, from the beginning of url, and strip any trailing slashes, to allow for file ids that contain a slash.
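A rough Go sketch of that idea follows. It is not tusd's implementation: `extractID` and its `basePath` parameter are hypothetical, and it assumes the base path is known and ends in a slash (for example "/files/"). It also ignores request routing entirely.

```go
package main

import (
	"fmt"
	"strings"
)

// extractID is a hypothetical helper, not tusd code. It strips basePath from
// the front of urlPath and any trailing slashes from the end, so the returned
// ID may itself contain slashes.
func extractID(urlPath, basePath string) (string, bool) {
	if !strings.HasPrefix(urlPath, basePath) {
		return "", false
	}
	id := strings.TrimRight(strings.TrimPrefix(urlPath, basePath), "/")
	if id == "" {
		return "", false
	}
	return id, true
}

func main() {
	id, ok := extractID("/files/foldername/filename.jpg", "/files/")
	fmt.Println(id, ok) // foldername/filename.jpg true

	_, ok = extractID("/files/", "/files/")
	fmt.Println(ok) // false: no ID after the base path
}
```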

You already found the underlying issue :slight_smile: tusd does not support slashes in the upload ID. I will have a look if we can fix that.

I started working on a patch, but this is more involved than just the regex as we need to adjust the request routing as well. We must be very careful as this could easily introduce regressions in some edge cases. See Allow upload ID to contain slashes by Acconut · Pull Request #1020 · tus/tusd · GitHub


this is more involved than just the regex as we need to adjust the request routing as well. We must be very careful as this could easily introduce regressions in some edge cases.

Yeah, I am not surprised this is more complicated … forward slash handling can definitely be finicky in web applications. This does seem like something that would be of value for the different stores.

I’ve never worked with Go before so not sure how much help I can be at this point.

When do you think you might get around to merging a fix? And then issuing the next release? A very rough guesstimate is acceptable :sweat_smile:

Thanks for your help @marius !

Interestingly, the same question popped up on GitHub today: Allow using '/' when setting ChangeFileInfo.ID to enable "hierarchival" views in Azure Blob Storage · Issue #1021 · tus/tusd · GitHub

I can let you know once a patched version is available for testing.

Sorry, I cannot provide any estimate right now.


Not sure if this is related, but I am running into another error with changing the file ID to a different value.

if (input.Type == "pre-create")
{
    response.ChangeFileInfo = new UploadHookResponseChangeFileInfoDto();
    response.ChangeFileInfo.ID = input.Event.Upload.MetaData["name"];
...

The value of filename is Screenshot 2023-10-30 at 4.34.21 PM.png

This is what I see in the Azurite logs:

127.0.0.1 - - [31/Oct/2023:21:25:21 +0000] "PUT /uploads/Screenshot%202023-10-30%20at%204.34.21%E2%80%AFPM.png.info?timeout=61 HTTP/1.1" 201 -
127.0.0.1 - - [31/Oct/2023:21:25:21 +0000] "GET /uploads/Screenshot%202023-10-30%20at%204.34.21%C3%A2%C2%80%C2%AFPM.png.info?timeout=61 HTTP/1.1" 404 -

The Upload-Metadata header looks like:

Upload-Metadata: relativePath bnVsbA==,name U2NyZWVuc2hvdCAyMDIzLTEwLTMwIGF0IDQuMzQuMjHigK9QTS5wbmc=,type aW1hZ2UvcG5n,organizationId Y2I3MDkxNTAtZGQ0MC0xOTU1LThjZTItM2EwZTk5YzMyMzdi,filetype aW1hZ2UvcG5n,filename U2NyZWVuc2hvdCAyMDIzLTEwLTMwIGF0IDQuMzQuMjHigK9QTS5wbmc=
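The values in that header are Base64-encoded, so the unusual character is not visible at a glance. A small Go sketch, using only the standard library, decodes the `name` value copied from the header above; `decodeMetadataValue` is a hypothetical helper name.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeMetadataValue is a hypothetical helper that decodes one
// Base64-encoded Upload-Metadata value into a string.
func decodeMetadataValue(v string) (string, error) {
	b, err := base64.StdEncoding.DecodeString(v)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// The "name" value copied from the Upload-Metadata header above.
	name, err := decodeMetadataValue("U2NyZWVuc2hvdCAyMDIzLTEwLTMwIGF0IDQuMzQuMjHigK9QTS5wbmc=")
	if err != nil {
		panic(err)
	}
	// %+q escapes every non-ASCII character, which makes U+202F visible.
	fmt.Printf("%+q\n", name)
}
```

The decoded name contains U+202F between "21" and "PM", not an ordinary space.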

It seems like tusd accepts the custom file ID correctly and saves the info file correctly in Azurite … but when it immediately tries to request the info file, the value is escaped incorrectly.

The problem seems to be a “narrow no-break space” character (U+202F) in the file name (this file name came directly from my Mac). URL-encoded, it looks like %E2%80%AF.
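The two Azurite log lines are consistent with a double encoding: the UTF-8 bytes E2 80 AF being read as three Latin-1 characters and then encoded as UTF-8 a second time. The Go sketch below only reproduces that byte pattern; where in the request path it actually happens is not established here, and `reencodeAsLatin1` is a hypothetical name.

```go
package main

import (
	"fmt"
	"net/url"
)

// reencodeAsLatin1 treats each byte of s as a Latin-1 code point and encodes
// the result as UTF-8 again. This mimics a common double-encoding mistake.
func reencodeAsLatin1(s string) string {
	runes := make([]rune, 0, len(s))
	for i := 0; i < len(s); i++ {
		runes = append(runes, rune(s[i]))
	}
	return string(runes)
}

func main() {
	nnbsp := "\u202f" // narrow no-break space

	// Matches the PUT line in the Azurite log.
	fmt.Println(url.PathEscape(nnbsp)) // %E2%80%AF

	// Matches the GET line that returned 404.
	fmt.Println(url.PathEscape(reencodeAsLatin1(nnbsp))) // %C3%A2%C2%80%C2%AF
}
```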

If I want to set the file ID to the filename, should I only allow certain characters?

I removed all non-ascii characters and the upload worked:

var filename =
    ShortId.Generate() + "-" +
    Regex.Replace(input.Event.Upload.MetaData["filename"], @"[^\u0000-\u007F]+", string.Empty);

response.ChangeFileInfo = new UploadHookResponseChangeFileInfoDto
{
    ID = filename,
};

So I am not stuck, but it might be worth noting allowed characters in the file ID.

You should only use URL-safe characters in the upload ID because it will end up in the URL. tusd should probably include a safety check for this, I agree.

Tracking this at Disallow non-URL-safe character in `changeFileInfo.ID` · Issue #1030 · tus/tusd · GitHub, thanks for bringing this up.
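Until such a check exists, a hook can restrict the ID itself. The Go sketch below is one possible approach, not tusd's rule: it assumes "URL-safe" means the RFC 3986 unreserved characters (letters, digits, `.`, `_`, `~`, `-`), and `sanitizeUploadID` is a hypothetical helper. Which characters tusd will end up allowing is not stated in this thread.

```go
package main

import (
	"fmt"
	"regexp"
)

// unsafeChars matches any run of characters outside the RFC 3986 "unreserved"
// set. Treating this set as "URL-safe" is an assumption made for this sketch.
var unsafeChars = regexp.MustCompile(`[^A-Za-z0-9._~-]+`)

// sanitizeUploadID is a hypothetical helper that replaces each run of
// unsafe characters with a single hyphen.
func sanitizeUploadID(name string) string {
	return unsafeChars.ReplaceAllString(name, "-")
}

func main() {
	fmt.Println(sanitizeUploadID("Screenshot 2023-10-30 at 4.34.21\u202fPM.png"))
	// Screenshot-2023-10-30-at-4.34.21-PM.png
}
```

Different file names can map to the same sanitized ID, so adding a unique prefix, as the ShortId.Generate() call above does, is still advisable.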