Ollama Systemd Unit File Sample

From Coolscript
Revision as of 16:29, 30 April 2026 by Admin

Ollama systemd override (annotated)

Resource limits

# Hard RAM cap: the OOM killer is invoked in the unit's cgroup if exceeded
MemoryMax=16G
# Soft limit: memory use above this is throttled and reclaimed before the hard cap
MemoryHigh=14G
# Limit to roughly 4 CPU cores' worth of CPU time
CPUQuota=400%
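systemd parses the K, M, G and T suffixes with base 1024, and `systemctl show` reports memory limits in bytes. The helper below is a small sketch for converting a configured value so the two can be compared; the function name `systemd_size_to_bytes` is hypothetical and not part of systemd or Ollama.

```shell
# Convert a systemd memory size such as 16G to bytes (base 1024).
# Hypothetical helper for illustration; handles whole numbers only.
systemd_size_to_bytes() {
    value="$1"
    num="${value%[KMGT]}"   # strip one trailing unit suffix, if present
    case "$value" in
        *K) echo $((num * 1024)) ;;
        *M) echo $((num * 1024 * 1024)) ;;
        *G) echo $((num * 1024 * 1024 * 1024)) ;;
        *T) echo $((num * 1024 * 1024 * 1024 * 1024)) ;;
        *)  echo "$num" ;;
    esac
}
```

Assuming the unit is named ollama, the result for 16G should match the value shown by `systemctl show ollama -p MemoryMax` once the override is active.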

File descriptors

# Raise the open-file limit to prevent "too many open files" errors
LimitNOFILE=1048576

Restart / stability

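# Restart only on unclean exits (non-zero exit code, signal, timeout)
Restart=on-failure
# Wait 3 seconds before restarting
RestartSec=3
# Allow at most 5 starts within 60 seconds to avoid infinite restart loops.
# These two settings belong in the [Unit] section, not [Service].
StartLimitIntervalSec=60
StartLimitBurst=5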

IO / disk behavior

# Use the best-effort I/O class; priority ranges from 0 (highest) to 7 (lowest)
IOSchedulingClass=best-effort
IOSchedulingPriority=4

Optional CPU pinning

# Bind to specific cores (here CPUs 0 through 3)
CPUAffinity=0 1 2 3
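Before pinning, it can help to confirm that every listed core index exists on the machine. The following is a minimal sketch; `affinity_fits` is a hypothetical helper, and it assumes `nproc` (from coreutils) reflects the CPUs available to the service, which may not hold inside a restricted container or cpuset.

```shell
# Return success if every core index in the list is below the CPU count.
# Hypothetical helper; the CPU count defaults to the output of nproc.
affinity_fits() {
    cores="$1"
    ncpu="${2:-$(nproc)}"
    for c in $cores; do
        [ "$c" -lt "$ncpu" ] || return 1
    done
    return 0
}

# Example: affinity_fits "0 1 2 3" && echo "CPUAffinity=0 1 2 3 is valid here"
```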

Security (light sandboxing)

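# Prevent the service and its children from gaining new privileges
NoNewPrivileges=true
# Give the service its own private /tmp and /var/tmp
PrivateTmp=true
# Mount /usr, /boot, /efi and /etc read-only for the service
ProtectSystem=full
# Make /home, /root and /run/user inaccessible to the service
ProtectHome=true
# If the model directory lives under one of these paths, add a ReadWritePaths= entry for it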

Logging

StandardOutput=journal
StandardError=journal
# 0 disables journal rate limiting for this unit
LogRateLimitIntervalSec=0

Ollama tuning

# Number of concurrent requests
Environment="OLLAMA_NUM_PARALLEL=2"
# Keep one model loaded at a time to avoid VRAM exhaustion
Environment="OLLAMA_MAX_LOADED_MODELS=1"
# Unload a model after 5 minutes idle
Environment="OLLAMA_KEEP_ALIVE=5m"

# Bind to the loopback interface only
Environment="OLLAMA_HOST=127.0.0.1"
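When OLLAMA_HOST carries no port, Ollama listens on its default port 11434. As a rough sketch for checking the bind address after a restart, the hypothetical helper `ollama_base_url` builds the URL to query; it assumes an IPv4 address or hostname, not a bracketed IPv6 literal.

```shell
# Build the base URL for an OLLAMA_HOST value, adding the default
# port 11434 when none is given. Hypothetical helper; IPv4/hostname only.
ollama_base_url() {
    host="${1:-127.0.0.1}"
    case "$host" in
        *:*) echo "http://$host" ;;
        *)   echo "http://$host:11434" ;;
    esac
}

# Example check against the running service:
#   curl -s "$(ollama_base_url 127.0.0.1)/api/version"
```

A successful response on 127.0.0.1 combined with a refused connection on the machine's external address suggests the loopback-only binding took effect.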

Minimal working example

[Service]
MemoryMax=16G
CPUQuota=400%
LimitNOFILE=1048576
Restart=on-failure
RestartSec=3
Environment="OLLAMA_NUM_PARALLEL=2"
Environment="OLLAMA_MAX_LOADED_MODELS=1"
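One way to install the minimal example is as a drop-in file, sketched below. The function name `write_ollama_override` is hypothetical, and the default path assumes the service unit is named ollama.service; adjust both if the installation differs.

```shell
# Write the minimal override into a systemd drop-in directory.
# Hypothetical helper; the default path assumes a unit named ollama.service.
write_ollama_override() {
    dropin_dir="${1:-/etc/systemd/system/ollama.service.d}"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
MemoryMax=16G
CPUQuota=400%
LimitNOFILE=1048576
Restart=on-failure
RestartSec=3
Environment="OLLAMA_NUM_PARALLEL=2"
Environment="OLLAMA_MAX_LOADED_MODELS=1"
EOF
}

# Typical use, as root:
#   write_ollama_override
#   systemctl daemon-reload
#   systemctl restart ollama
```

Running `systemctl edit ollama` achieves the same result interactively and reloads the unit files on save.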