Upload completed, filesize matches, but SHA-256 hash mismatch — video file corrupted

Hi all,

I’m using tusd behind a Cloudflare proxy. Uploads seem to complete — the file size exactly matches the original file on the client. However, after upload:

  • The resulting file has a different SHA-256 hash
  • The uploaded video is corrupted

Here is an excerpt from the tusd logs during the problematic upload:

1750237625900	[tusd] 2025/06/18 09:07:05.900871 event="ResponseOutgoing" status="500" method="PATCH" path="/api/v1/tus/upload/e006bd335374fbf6c26c3e492e827683" error="read tcp: connection reset by peer" requestId="" 
1750237625900	[tusd] 2025/06/18 09:07:05.900837 event="ChunkWriteComplete" id="e006bd335374fbf6c26c3e492e827683" bytesWritten="8527872" 
1750237625900	[tusd] 2025/06/18 09:07:05.900815 event="BodyReadError" id="e006bd335374fbf6c26c3e492e827683" error="read tcp 10.93.30.126:3001->10.93.30.27:62392: read: connection reset by peer" 
1750237622826	[tusd] 2025/06/18 09:07:02.826004 event="ChunkWriteStart" id="e006bd335374fbf6c26c3e492e827683" maxSize="16777216" offset="2349973504"

My questions:

  1. How could the final file have the correct size but still be corrupted / wrong hash?
  2. Does tusd flush incomplete chunk writes even on a BodyReadError?
  3. What can I do to ensure data integrity if a PATCH request is interrupted?
  4. Is there a way to prevent tusd from finalizing the upload if there’s a read error or TCP reset mid-chunk?
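
For reference, this is roughly how we compute the hash we compare on both ends (a minimal Go sketch; the path is a placeholder for the file tusd wrote):

// hashcheck.go: compute the SHA-256 of a file so the client-side and
// server-side digests can be compared. The path below is a placeholder.
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"log"
	"os"
)

func sha256File(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

func main() {
	sum, err := sha256File("/srv/tusd-data/e006bd335374fbf6c26c3e492e827683")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(sum)
}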

Some context:

  • We run multiple tusd instances behind the proxy, with sticky sessions enabled on the load balancer
  • All instances write to a shared storage backend (a network disk)

Any ideas or suggestions?

Thanks!

Hello there,

the TCP reset is not necessarily a problem. It just indicates that the upload got interrupted at some point but resumed properly afterwards (that’s what resumable uploads are for). The presence of such errors in tusd’s log usually doesn’t indicate a problem, as the upload procedure will recover from them.

Regarding the mismatching checksums, I haven’t experienced that myself. Is this a frequent problem for you? Is it somewhat reproducible?

You mentioned the use of a shared storage backend. Is it possible that data got corrupted there? If it’s a shared network disk and sticky sessions don’t work properly, storage accesses from the different instances could collide with each other.

The tus protocol has methods for exchanging checksums, so the client and/or server can verify the integrity of the uploaded data, but tusd does not currently implement them. We hope to improve support for this in the future.
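
For reference, the checksum extension works by the client sending an Upload-Checksum header (algorithm name plus the Base64-encoded digest of the request body) with each PATCH. A rough Go sketch of such a request is below; the upload URL is a placeholder, and tusd would currently not verify the header:

package main

import (
	"bytes"
	"crypto/sha1"
	"encoding/base64"
	"log"
	"net/http"
)

func main() {
	chunk := []byte("...chunk bytes...")
	sum := sha1.Sum(chunk)

	// PATCH carrying the tus checksum extension header.
	// The upload URL is a placeholder.
	req, err := http.NewRequest(http.MethodPatch,
		"https://tus.example.com/api/v1/tus/upload/UPLOAD_ID", bytes.NewReader(chunk))
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Tus-Resumable", "1.0.0")
	req.Header.Set("Content-Type", "application/offset+octet-stream")
	req.Header.Set("Upload-Offset", "0") // offset of this chunk within the upload
	// Algorithm name, a space, then the Base64-encoded digest of this request body.
	req.Header.Set("Upload-Checksum",
		"sha1 "+base64.StdEncoding.EncodeToString(sum[:]))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	log.Println(resp.Status)
}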

Hope that helps!

Hi, thanks for the clarification.

I’m not entirely sure whether this is due to a sticky session issue or a shared backend conflict — we don’t currently log which tusd instance is handling each request, so I can’t confirm if requests are routed inconsistently.
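
One thing I’m considering is tagging every response with the hostname of the instance that served it, so misrouted requests become visible. Rough sketch below, assuming the tusd handler is embedded in a Go service as a plain http.Handler (the X-Instance-Id header name is just something I made up; if you run the standalone tusd binary, the same idea belongs in the reverse proxy):

package main

import (
	"log"
	"net/http"
	"os"
)

// withInstanceID tags every response with this instance's hostname and logs
// which instance handled each request, so inconsistent routing shows up.
func withInstanceID(next http.Handler) http.Handler {
	host, _ := os.Hostname()
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("X-Instance-Id", host)
		log.Printf("instance=%s method=%s path=%s", host, r.Method, r.URL.Path)
		next.ServeHTTP(w, r)
	})
}

func main() {
	// Placeholder for the real tusd handler mounted in our service.
	tusHandler := http.NotFoundHandler()
	http.Handle("/api/v1/tus/upload/", withInstanceID(tusHandler))
	log.Fatal(http.ListenAndServe(":3001", nil))
}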

However, I did grep through our logs and found repeated mismatched offset errors for the same upload ID:

grep "error=" full_log.log
1750239546208	[tusd] 2025/06/18 09:39:06.208923 event="ResponseOutgoing" status="409" method="PATCH" path="/api/v1/tus/upload/e006bd335374fbf6c26c3e492e827683" error="mismatched offset" requestId="" 
1750238685207	[tusd] 2025/06/18 09:24:45.207121 event="ResponseOutgoing" status="409" method="PATCH" path="/api/v1/tus/upload/e006bd335374fbf6c26c3e492e827683" error="mismatched offset" requestId="" 
1750237645615	[tusd] 2025/06/18 09:07:25.615546 event="ResponseOutgoing" status="409" method="PATCH" path="/api/v1/tus/upload/e006bd335374fbf6c26c3e492e827683" error="mismatched offset" requestId="" 
1750237633831	[tusd] 2025/06/18 09:07:13.831126 event="ResponseOutgoing" status="409" method="PATCH" path="/api/v1/tus/upload/e006bd335374fbf6c26c3e492e827683" error="mismatched offset" requestId="" 
1750237625900	[tusd] 2025/06/18 09:07:05.900871 event="ResponseOutgoing" status="500" method="PATCH" path="/api/v1/tus/upload/e006bd335374fbf6c26c3e492e827683" error="read tcp: connection reset by peer" requestId="" 
1750237625900	[tusd] 2025/06/18 09:07:05.900815 event="BodyReadError" id="e006bd335374fbf6c26c3e492e827683" error="read tcp 10.93.30.126:3001->10.93.30.27:62392: read: connection reset by peer"

The first "read tcp: connection reset by peer" error seems explainable (possibly a client disconnect), but the subsequent 409 "mismatched offset" errors look suspicious: they occurred several minutes apart during the same upload session.

Could this suggest a race condition or desync between multiple tusd instances accessing the same upload resource (file) — possibly caused by inconsistent routing or access to a shared disk?

Do you have any recommendation for logging or guarding against this kind of situation?

Yes, if requests are routed to different instances and the tusd instances are not synchronized via a distributed locking mechanism, such issues can appear. I recommend reading Upload locks | tusd documentation, which explains this in detail.
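
To illustrate the idea only (this is not tusd’s actual locking code, and the paths are placeholders), an exclusive per-upload lock on the shared disk is what prevents two instances from writing to the same upload at the same time:

package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// acquireUploadLock creates a lock file next to the upload data on the shared
// disk. O_EXCL makes the create fail if another instance already holds the
// lock. Illustration only; tusd ships its own locker implementations.
func acquireUploadLock(dir, uploadID string) (release func() error, err error) {
	lockPath := filepath.Join(dir, uploadID+".lock")
	f, err := os.OpenFile(lockPath, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o600)
	if err != nil {
		if errors.Is(err, os.ErrExist) {
			return nil, fmt.Errorf("upload %s is locked by another instance", uploadID)
		}
		return nil, err
	}
	f.Close()
	return func() error { return os.Remove(lockPath) }, nil
}

func main() {
	release, err := acquireUploadLock("/mnt/shared/uploads", "e006bd335374fbf6c26c3e492e827683")
	if err != nil {
		fmt.Println("refusing to write:", err)
		return
	}
	defer release()
	// ... write the chunk while holding the lock ...
}

Note that naive lock files like this are fragile on some network filesystems, which is exactly why the documentation recommends a proper locking mechanism instead of rolling your own.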

Of course, a 409 can also be triggered if the client misbehaves and does not resume correctly after an interruption (for example, by not fetching the new offset with a HEAD request first).
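
For reference, a well-behaved resume looks roughly like this (Go sketch; the upload URL and local file name are placeholders): a HEAD to learn the current offset, then a PATCH starting exactly at that offset:

package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
	"strconv"
)

func main() {
	uploadURL := "https://tus.example.com/api/v1/tus/upload/UPLOAD_ID" // placeholder

	// 1. Ask the server where the upload currently stands.
	head, _ := http.NewRequest(http.MethodHead, uploadURL, nil)
	head.Header.Set("Tus-Resumable", "1.0.0")
	resp, err := http.DefaultClient.Do(head)
	if err != nil {
		log.Fatal(err)
	}
	resp.Body.Close()
	offset, err := strconv.ParseInt(resp.Header.Get("Upload-Offset"), 10, 64)
	if err != nil {
		log.Fatal("no Upload-Offset in HEAD response: ", err)
	}

	// 2. Resume the PATCH exactly at that offset.
	file, err := os.Open("video.mp4") // placeholder local file
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()
	if _, err := file.Seek(offset, 0); err != nil {
		log.Fatal(err)
	}

	patch, _ := http.NewRequest(http.MethodPatch, uploadURL, file)
	patch.Header.Set("Tus-Resumable", "1.0.0")
	patch.Header.Set("Content-Type", "application/offset+octet-stream")
	patch.Header.Set("Upload-Offset", strconv.FormatInt(offset, 10))

	resp2, err := http.DefaultClient.Do(patch)
	if err != nil {
		log.Fatal(err)
	}
	defer resp2.Body.Close()
	fmt.Println("PATCH status:", resp2.Status)
}

Crucially, the client must also wait for the previous PATCH to fully terminate before issuing the HEAD, otherwise two PATCH requests can end up writing to the same upload concurrently.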

Yes, that’s what I’m trying to confirm before moving forward. I want to make sure sticky session misrouting isn’t the root cause.

I’m planning to remove the shared backend altogether — so that each tusd instance has its own storage directory. That should technically eliminate the need for distributed locking, correct?

If the sticky sessions work properly and reliably, yes, then there is no need for a distributed lock. However, when cookie-based stickiness is used, it might not work with clients that ignore cookies (especially non-browser clients). If requests don’t get routed properly, you will see 404 errors where requests land on a tusd instance that doesn’t have the corresponding upload on its local disk.
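
For example, if some of your non-browser clients happen to be written in Go, they need an explicit cookie jar before cookie-based stickiness can work at all (sketch; the upload URL is a placeholder):

package main

import (
	"log"
	"net/http"
	"net/http/cookiejar"
)

func main() {
	// Without a jar, Go's http.Client drops the sticky-session cookie set by
	// the load balancer, so every request may land on a different tusd instance.
	jar, err := cookiejar.New(nil)
	if err != nil {
		log.Fatal(err)
	}
	client := &http.Client{Jar: jar}

	req, _ := http.NewRequest(http.MethodHead,
		"https://tus.example.com/api/v1/tus/upload/UPLOAD_ID", nil)
	req.Header.Set("Tus-Resumable", "1.0.0")

	resp, err := client.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	resp.Body.Close()
	log.Println("Upload-Offset:", resp.Header.Get("Upload-Offset"))
}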

I found something interesting: two ChunkWriteStart events in a row without a ChunkWriteComplete in between. The client did a HEAD, saw the new offset, and started a new PATCH before the previous chunk had finished writing: a classic race condition.

1750237644875	[tusd] 2025/06/18 09:07:24.875680 event="ChunkWriteComplete" id="e006bd335374fbf6c26c3e492e827683" bytesWritten="16777216" 
1750237634858	[tusd] 2025/06/18 09:07:14.858293 event="ChunkWriteStart" id="e006bd335374fbf6c26c3e492e827683" maxSize="16777216" offset="2374115328" 
1750237633070	[tusd] 2025/06/18 09:07:13.070585 event="ChunkWriteComplete" id="e006bd335374fbf6c26c3e492e827683" bytesWritten="16777216" 
1750237625900	[tusd] 2025/06/18 09:07:05.900837 event="ChunkWriteComplete" id="e006bd335374fbf6c26c3e492e827683" bytesWritten="8527872" 
1750237622826	[tusd] 2025/06/18 09:07:02.826004 event="ChunkWriteStart" id="e006bd335374fbf6c26c3e492e827683" maxSize="16777216" offset="2349973504" 
1750237621063	[tusd] 2025/06/18 09:07:01.063648 event="ChunkWriteStart" id="e006bd335374fbf6c26c3e492e827683" maxSize="16777216" offset="2348810240" 
1750237620305	[tusd] 2025/06/18 09:07:00.305307 event="ChunkWriteComplete" id="e006bd335374fbf6c26c3e492e827683" bytesWritten="16777216" 
1750237610368	[tusd] 2025/06/18 09:06:50.368276 event="ChunkWriteStart" id="e006bd335374fbf6c26c3e492e827683" maxSize="16777216" offset="2332033024"

I think we need to lock the upload resource even when using sticky sessions.
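
Something like an in-memory per-upload lock would at least make the second PATCH wait (or get rejected) instead of writing concurrently. Conceptual sketch only, not tusd’s implementation:

package main

import (
	"fmt"
	"sync"
)

// uploadLocks hands out one mutex per upload ID within a single instance, so
// a second PATCH for the same upload cannot write at the same time as the first.
type uploadLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func newUploadLocks() *uploadLocks {
	return &uploadLocks{locks: make(map[string]*sync.Mutex)}
}

func (u *uploadLocks) lockFor(id string) *sync.Mutex {
	u.mu.Lock()
	defer u.mu.Unlock()
	if _, ok := u.locks[id]; !ok {
		u.locks[id] = &sync.Mutex{}
	}
	return u.locks[id]
}

func main() {
	locks := newUploadLocks()
	id := "e006bd335374fbf6c26c3e492e827683"

	l := locks.lockFor(id)
	if l.TryLock() { // requires Go 1.18+
		defer l.Unlock()
		fmt.Println("writing chunk for", id)
	} else {
		fmt.Println("another PATCH is already writing to", id)
	}
}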

P.S.: thanks so much for your quick responses. :heart_eyes:

Yes, I think in that case upload locks would definitely be advisable. Please let us know if this improves or solves your problem, so we can adjust the advice we give to people in similar situations.