From the client's perspective, the upload request runs for a very long time, the upload-progress percentage stalls at around 50-70%, and the tusd service eventually returns a 500.
The file I am attempting to upload is 221 MB. From what I can tell, tusd seems to be building the file up in memory instead of streaming it to Azure Storage.
This is what my Dockerfile looks like:
FROM tusproject/tusd:v2.0
ARG AZURE_STORAGE_ACCOUNT
ARG AZURE_STORAGE_KEY
ENV AZURE_STORAGE_ACCOUNT=${AZURE_STORAGE_ACCOUNT}
ENV AZURE_STORAGE_KEY=${AZURE_STORAGE_KEY}
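For completeness, the container is started roughly along these lines. This is a sketch: the container name, port, and any extra flags are assumptions and were not shown in the original Dockerfile.

```dockerfile
# Hypothetical continuation of the Dockerfile above.
# The tusproject/tusd image uses tusd as its entrypoint, so CMD supplies flags.
# "uploads" is an assumed Azure Blob Storage container name.
CMD ["-azure-storage", "uploads", "-expose-pprof"]
```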
I’m currently running it on an Azure App Service at the Standard S1 size, which provides 1.75 GB of memory.
I uploaded a single 221 MB file, and memory usage spiked to 320 MB during the upload. Is this expected for Azure uploads? What happens if multiple uploads run at the same time?
The memory usage is indeed troubling and not expected. We are not using Azure storage in production, so I don’t have any operational experience with it.
Currently, our tusd servers use file-based storage directly, and since deploying tusd v2 we have seen rare spikes of higher memory usage (“rare” relative to the number of uploads these servers are handling).
This makes me wonder whether this is a storage-independent issue. Could you try using the filestore and see if the same memory usage pattern appears as with Azure? That would help track down the source of this.
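To compare the two backends, the same upload can be run against each store. This is a sketch of the two invocations; the container name, upload directory, and port are assumptions:

```shell
# Azure backend (credentials come from the environment, as in the Dockerfile)
tusd -azure-storage uploads -expose-pprof

# Local filestore backend, writing completed and partial uploads to ./data
tusd -upload-dir ./data -expose-pprof
```

If the filestore run shows the same memory spike for the 221 MB file, the problem is storage-independent; if it stays flat, the azurestore path is the place to look.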
I tried enabling -expose-pprof, and I see the following at /debug/pprof:
/debug/pprof/
Set debug=1 as a query parameter to export in legacy text format
Types of profiles available:
Count Profile
42 allocs
0 block
0 cmdline
14 goroutine
42 heap
0 mutex
0 profile
8 threadcreate
0 trace
full goroutine stack dump
Profile Descriptions:
allocs: A sampling of all past memory allocations
block: Stack traces that led to blocking on synchronization primitives
cmdline: The command line invocation of the current program
goroutine: Stack traces of all current goroutines. Use debug=2 as a query parameter to export in the same format as an unrecovered panic.
heap: A sampling of memory allocations of live objects. You can specify the gc GET parameter to run GC before taking the heap sample.
mutex: Stack traces of holders of contended mutexes
profile: CPU profile. You can specify the duration in the seconds GET parameter. After you get the profile file, use the go tool pprof command to investigate the profile.
threadcreate: Stack traces that led to the creation of new OS threads
trace: A trace of execution of the current program. You can specify the duration in the seconds GET parameter. After you get the trace file, use the go tool trace command to investigate the trace.
However, all of the links on that page return a 404, except for cmdline, profile, and trace.