Ollama Systemd Unit File Sample
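Ollama systemd override (annotated)
Note: systemd does not support trailing comments. Text after a value is parsed as part of the value, so keep each # comment on its own line when copying these settings.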
Resource limits
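# Hard RAM cap: the OOM killer is invoked in the unit's cgroup if usage exceeds this
MemoryMax=16G
# Soft limit: above this, memory is reclaimed aggressively and the process is throttled
MemoryHigh=14G
# CPU time limit equal to about 4 full cores (100% = one core)
CPUQuota=400%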
File descriptors
# Raise the open-file limit to prevent "too many open files" errors
LimitNOFILE=1048576
Restart / stability
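# Restart only after an unclean exit (crash, non-zero exit code, signal, timeout)
Restart=on-failure
# Wait 3 seconds before each restart
RestartSec=3
# Allow at most 5 starts per 60 seconds to avoid an endless restart loop.
# These two settings are documented for the [Unit] section, not [Service].
StartLimitIntervalSec=60
StartLimitBurst=5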
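Because the start-rate limit is documented under [Unit] in current systemd releases, a drop-in would likely split the restart settings across two sections. A minimal sketch using the values above:

```ini
[Unit]
# Stop retrying if the service is started more than 5 times within 60 seconds
StartLimitIntervalSec=60
StartLimitBurst=5

[Service]
# Restart after an unclean exit, waiting 3 seconds between attempts
Restart=on-failure
RestartSec=3
```

With RestartSec=3, a service that crashes immediately on every start reaches the fifth start after roughly 12 to 15 seconds, well inside the 60-second window, so the limit does take effect and the unit ends up in a failed state instead of looping.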
IO / disk behavior
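# Best-effort I/O class at priority 4 (range 0-7, lower number = higher priority).
# These values match the systemd defaults, so the lines only make the setting explicit.
IOSchedulingClass=best-effort
IOSchedulingPriority=4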
Optional CPU pinning
# Bind the service to CPU cores 0 through 3
CPUAffinity=0 1 2 3
Security (light sandboxing)
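# Block privilege escalation through setuid/setgid binaries
NoNewPrivileges=true
# Give the service private /tmp and /var/tmp directories
PrivateTmp=true
# Mount /usr, /boot, /efi, and /etc read-only; a model directory under /usr then needs a ReadWritePaths= entry
ProtectSystem=full
# Make /home, /root, and /run/user inaccessible; do not store models there
ProtectHome=true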
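ProtectSystem=full makes /usr read-only for the service, which matters if the model store lives there. As a sketch, assuming the models are kept under /usr/share/ollama (an assumption; check the service user's home directory or the OLLAMA_MODELS setting on the host), a write exception could look like this:

```ini
[Service]
ProtectSystem=full
# Assumed model directory; replace with the path actually used on this host
ReadWritePaths=/usr/share/ollama
```

Without such an exception, pulling or updating models would be expected to fail with a read-only file system error.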
Logging
# Send stdout and stderr to the systemd journal
StandardOutput=journal
StandardError=journal
# 0 disables journal rate limiting for this unit
LogRateLimitIntervalSec=0
Ollama tuning
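# Number of requests each loaded model handles concurrently
Environment="OLLAMA_NUM_PARALLEL=2"
# Keep at most one model loaded at a time to avoid VRAM exhaustion
Environment="OLLAMA_MAX_LOADED_MODELS=1"
# Unload a model after 5 minutes of idle time
Environment="OLLAMA_KEEP_ALIVE=5m"
# Listen on the loopback interface only
Environment="OLLAMA_HOST=127.0.0.1"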
Minimal working example
[Service]
MemoryMax=16G
CPUQuota=400%
LimitNOFILE=1048576
Restart=on-failure
RestartSec=3
Environment="OLLAMA_NUM_PARALLEL=2"
Environment="OLLAMA_MAX_LOADED_MODELS=1"
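A drop-in like the one above is usually saved in the unit's override directory and picked up after a daemon reload. The sketch below assumes the service is named ollama.service and that the file lives at /etc/systemd/system/ollama.service.d/override.conf; both names are assumptions, so adjust them to the installation. The commands appear as comments so the block stays a valid unit fragment.

```ini
# /etc/systemd/system/ollama.service.d/override.conf  (assumed path)
# Create or edit:  sudo systemctl edit ollama.service
# Apply:           sudo systemctl daemon-reload && sudo systemctl restart ollama.service
# Verify:          systemctl show ollama.service -p MemoryMax -p LimitNOFILE

[Service]
MemoryMax=16G
CPUQuota=400%
LimitNOFILE=1048576
Restart=on-failure
RestartSec=3
Environment="OLLAMA_NUM_PARALLEL=2"
Environment="OLLAMA_MAX_LOADED_MODELS=1"
```

Checking the values with systemctl show after the restart confirms that the drop-in was parsed; settings that systemd could not parse are reported as warnings in the journal.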