Why checksum? It is not necessary


#1

Why have checksums in tus? There are already checksums on the IP layer. That means you do not have to add them to tus (no bytes will ever be corrupted).

/ Magnus


#2

That’s a good question! Some situations where it could be useful are highlighted in this discussion: https://github.com/tus/tus-resumable-upload-protocol/issues/7. The Google engineers there in particular were strong advocates for having checksums at a higher level.

A few quotes for convenience:

File corruption is insidious, and sneaks in everywhere. I wish we had an end-to-end default checksum algorithm we could use

worst case noisy lines you might see a 2+ bit error (all single bit errors will be caught by the TCP checksum) once in 10**12 bytes (1 terabyte transferred) on a raw socket, so we added an extra 32-bit checksum per binary message (~100-1000 bytes, multiple messages per TCP packet, and they can cross packet boundaries).
We figured this pushed the error rate out to something close to 10**40 which is my unofficial “ain’t never gonna happen” figure (c.f. age of the universe in seconds).

checksums should also proof useful for catching bugs in the software [implementations]
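
To make the per-message checksum idea from the quotes a bit more concrete, here is a minimal Python sketch. The framing (4-byte length, payload, CRC-32 trailer) and the names `frame` and `unframe` are my own assumptions for illustration. This is not the format the quoted engineers used, and it is not part of tus.

```python
import struct
import zlib


def frame(payload: bytes) -> bytes:
    """Wrap a payload as: 4-byte big-endian length, payload, 4-byte CRC-32.

    Hypothetical framing, chosen only to illustrate a per-message checksum.
    """
    header = struct.pack(">I", len(payload))
    crc = zlib.crc32(header + payload)  # checksum covers length and payload
    return header + payload + struct.pack(">I", crc)


def unframe(message: bytes) -> bytes:
    """Return the payload, or raise ValueError if the message looks corrupted."""
    if len(message) < 8:
        raise ValueError("message too short")
    (length,) = struct.unpack(">I", message[:4])
    if len(message) != 4 + length + 4:
        raise ValueError("length mismatch")
    body = message[: 4 + length]
    (expected,) = struct.unpack(">I", message[4 + length:])
    if zlib.crc32(body) != expected:
        raise ValueError("checksum mismatch")
    return body[4:]
```

A receiver would call `unframe` on each message and ask for a retransmit when it raises, instead of silently accepting bytes that slipped past the TCP checksum.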

One situation that I can see it fixing is someone resuming an upload of a file that has changed in the meantime. There are a few ways that can be resolved; however, without checksums one would never know the data had changed, and the file could end up corrupted as the upload resumes.
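
As a rough sketch of how a client could guard against that, the Python below hashes the first `offset` bytes of the local file and compares the result with a digest of what the server already has. Everything here is an assumption for illustration: the names `prefix_digest` and `can_resume` are made up, SHA-256 is just one possible choice, and how the server would report its digest is not something this thread defines.

```python
import hashlib


def prefix_digest(path: str, offset: int, chunk_size: int = 65536) -> str:
    """SHA-256 hex digest of the first `offset` bytes of the file at `path`."""
    digest = hashlib.sha256()
    remaining = offset
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                # The local file is now shorter than what was already uploaded.
                raise ValueError("file is shorter than the uploaded offset")
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def can_resume(path: str, offset: int, server_digest: str) -> bool:
    """True if the already-uploaded prefix still matches the local file.

    `server_digest` is assumed to be a SHA-256 hex digest of the bytes the
    server has stored so far (hypothetical; obtained by some agreed means).
    """
    try:
        return prefix_digest(path, offset) == server_digest
    except ValueError:
        return False
```

If `can_resume` returns False, the safe option is to discard the partial upload and start over rather than append to data that no longer matches the file.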