Companion is in a state with my tus server where it’s repeatedly making 0-byte Content-Length PATCH requests for multiple uploads. I’m currently watching it attempt a single file upload this way for over 3 hours.
Looking for advice on either a Companion or tus config for aborting these requests. Also, any information on how this happens in the first place?
We are running Companion 4.1.1 from the Transloadit-provided Docker image on EC2.
Hi. Do you have any way to reproduce the problem? e.g. any URL that you can share that will cause Companion to start sending 0 length PATCH requests? Or do you know if there’s any reason why the URL server could have stopped sending data just before the request has finished?
I believe there could be 2 separate problems here:
For some reason, tus-js-client keeps sending 0-length PATCH requests. I would think this is a bug (unless it is an intentional “hack” implemented to keep the connection between the client and tusd alive, but looking at the tus-js-client source code, I cannot find such functionality.)
Companion doesn’t have any timeout when reading the URL. This is because got doesn’t have a timeout by default. Normally this is OK because most servers will time out a request after a while, but for some reason your server is not doing that. I think we should introduce a timeout option in Companion. I will create a PR.
No. I’ve had rare instances of it happening in the past, but today we saw about 20–30 uploads from multiple customers where the same thing happened. It definitely wouldn’t be the same URL; however, Companion doesn’t log the source URL and we don’t capture it elsewhere.
One other semi-related concern I have about Companion is that a user can close a browser window and Companion will continue the upload in the background. It’s likely that is what is happening here. Customers are closing the browser rather than canceling via Uppy, which would trigger a tus DELETE. We do see that happening from time to time on hanging uploads, and it resolves the issue for the customer.
BTW did you see the beginning of the companion log, does it stream the file from source to tus, or download it first? e.g. before the uploader.total.progress events, do you see this?
controller.get.provider.size need to download the whole file first
uploader.download fully downloading file
also are you running your companion with COMPANION_STREAMING_UPLOAD=true?
Edit: I can see your file is only about 100 KB. Because the default chunk size for Companion with tus-js-client is 50 MB for “streaming uploads”, a 100 KB file would be uploaded in a single PATCH, and you would get either 0% progress or 100%, nothing in between. This means that your upload was most likely fully downloaded (successfully) first, and after the file was downloaded, it got uploaded to tus (using a File-based stream with a known length, so no chunkSize needed). So something went wrong while the downloaded file was being uploaded to tus, causing it to get stuck in this 0-length PATCH request loop at 99.95%.
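For reference, the arithmetic behind that, using the chunk sizes mentioned in this thread:

```javascript
// Rough check: how many PATCH requests a file needs for a given chunk size.
function patchCount(fileSize, chunkSize) {
  return Math.max(1, Math.ceil(fileSize / chunkSize));
}

// A ~100 KB file fits in one chunk at both the 50 MB default and a 5 MB
// COMPANION_CHUNK_SIZE, so progress jumps straight from 0% to 100%.
patchCount(100 * 1024, 50 * 1024 * 1024); // → 1
patchCount(100 * 1024, 5 * 1024 * 1024);  // → 1
```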
We also set COMPANION_CHUNK_SIZE to 5242880 (5 MB), and yes, I agree, it’s strange that it would stream part of this and not all of it. Every instance I saw was for URL uploads, so I wonder if the source URL was reporting a length different from the actual total length.
Unless you have any evidence that counters it, I think your file has been fully downloaded first, before being uploaded to tus. Then it means that the server hasn’t reported an incorrect length because the file has been downloaded first. Background: even when COMPANION_STREAMING_UPLOAD=true, companion will still download the file to hard drive first if the URL is “chunked”, meaning it doesn’t report content-length to companion. This is because some upload destinations need to know the expected total file length first, so we have to download the file first to know the length. So I can only think of some bug happening while the file is being uploaded to tus, either tus-js-client or tusd.
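A minimal sketch of that decision, assuming the behaviour described above (this is not Companion’s actual code; the function name is made up for illustration):

```javascript
// Hypothetical decision logic: does Companion download the file to disk
// before uploading, rather than streaming it straight through?
function mustDownloadFirst(contentLength, streamingUpload) {
  // With streaming uploads disabled, it always downloads first.
  if (!streamingUpload) return true;
  // A "chunked" source reports no Content-Length, but some destinations
  // need the total size up front, so it downloads first anyway.
  return contentLength == null;
}

mustDownloadFirst(null, true);   // → true: chunked URL, even with streaming on
mustDownloadFirst(102400, true); // → false: known length can be streamed
```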
getSize is ultimately the result of either a HEAD request that can return a Content-Length or a GET request (which I assume fetches the whole file)
I can’t tell whether or not the entire contents of the file have been downloaded before the tus server receives requests. I see from tus logs that the full size of the file is recorded as a result of the POST, and in the case of these hanging uploads, the upload stalls just short of the reported size.
I do wonder if in this case the URL is reporting a Content-Length different from the actual body length. For example, the HEAD request returns a length of 1000 but the body size is actually 990. It would result in a tus PATCH with Content-Length=990 and Upload-Length=1000 – would it then keep making subsequent PATCH requests, since the upload isn’t considered complete?
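To make that suspicion concrete, here is a toy simulation of the loop (an assumption about the mechanism, not confirmed tus-js-client behaviour; the function is made up for illustration):

```javascript
// Toy model: the client believes `uploadLength` bytes exist (from the
// advertised Content-Length), but the source only ever yielded `bodyBytes`.
function simulatePatches(uploadLength, bodyBytes, maxRequests = 5) {
  const patchSizes = [];
  let offset = 0;
  for (let i = 0; i < maxRequests && offset < uploadLength; i++) {
    // Each PATCH can only carry the bytes that actually exist.
    const available = Math.max(0, bodyBytes - offset);
    patchSizes.push(available);
    offset += available; // offset stalls at bodyBytes, short of uploadLength
  }
  return patchSizes;
}

simulatePatches(1000, 990); // → [990, 0, 0, 0, 0]
```

With no request cap, the offset never reaches the Upload-Length, so such a client would retry 0-byte PATCHes indefinitely, which would match the behaviour in the logs.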
I thought about that, but the fact that the tus chunk size is much larger than the file size means that tus wouldn’t start sending PATCH requests until the stream has ended (all bytes received). I did try to set up a Node.js HTTP server script where I try to feed it more bytes than the reported Content-Length, but it doesn’t seem to make a difference. So I’m not sure how to reproduce this.