Grafana Alloy, Loki and SELinux

From Coolscript
Revision as of 14:24, 20 April 2026 by Admin

This article summarizes how to run Grafana Alloy on an SELinux-enabled Linux system, how to troubleshoot startup and access problems, and what to check when Alloy sends metrics to Mimir or logs to Loki.

Scope

The main SELinux topics for Alloy are:

  • execution of the Alloy binary
  • outbound network access to Mimir and Loki
  • reading local log files for Loki
  • reading the systemd journal for Loki
  • file ownership and runtime state when Alloy runs as a restricted user

Key Lessons

  • The Alloy binary at `/usr/local/bin/alloy` must have a valid SELinux label such as `bin_t`.
  • Do not use `SELinuxContext=` in the systemd unit.
  • If cAdvisor is local, scrape `127.0.0.1:8080` instead of a hostname that may not resolve.
  • If Alloy runs as a restricted user, it must be able to read the config and write its WAL.
  • For Loki, SELinux usually matters more for reading log sources than for sending data to the Loki endpoint.

Installation Basics

Binary

Install Alloy here:

/usr/local/bin/alloy

Check the SELinux label:

 ls -lZ /usr/local/bin/alloy
 

Expected type:

bin_t

If the label is wrong, fix it:

 restorecon -v /usr/local/bin/alloy
 

If `restorecon` does not leave the binary labeled `bin_t`, define the file context explicitly and relabel:

 semanage fcontext -a -t bin_t '/usr/local/bin/alloy'
 restorecon -v /usr/local/bin/alloy
 

Config

Typical config location:

/etc/alloy/config.alloy

Runtime Data

Prefer runtime state under:

/var/lib/alloy

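Avoid keeping mutable runtime data under `/etc/alloy` when possible. Alloy's default storage path, `data-alloy`, is relative to the working directory, so with `WorkingDirectory=/etc/alloy` the WAL ends up in `/etc/alloy/data-alloy`. Passing `--storage.path=/var/lib/alloy` to `alloy run` should place it under `/var/lib/alloy` instead.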

Systemd Unit

Use a normal unit file. Do not set `SELinuxContext=`.

Example:

 [Unit]
 Description=Grafana Alloy Service
 After=network.target

 [Service]
 Type=simple
 ExecStart=/usr/local/bin/alloy run /etc/alloy/config.alloy
 WorkingDirectory=/etc/alloy
 Restart=always
 RestartSec=5
 User=root
 LimitNOFILE=65535
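 # "syslog" is deprecated as an output target in current systemd; use the journal.
 StandardOutput=journal
 StandardError=journal
 SyslogIdentifier=alloy
 # Each Environment= entry must be a KEY=VALUE assignment.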
 Environment="APP_NAME=test"
 Environment="OrgID=anonymous"
 Environment="PATH=/usr/local/bin:/usr/bin:/bin"

 [Install]
 WantedBy=multi-user.target
 

Prometheus Scrape Example

If cAdvisor runs locally, use:

 prometheus.scrape "pan_cadvisor" {
   targets    = [{ __address__ = "127.0.0.1:8080", client = sys.env("CLIENT") }]
   forward_to = [prometheus.remote_write.hosting.receiver]
 }
 

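This avoids hostname resolution problems such as failing to resolve `cadvisor`. Note that `sys.env("CLIENT")` reads the `CLIENT` environment variable, which the example unit above does not define; add a matching `Environment=` line if you use this label.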

Common SELinux Problems

203/EXEC on startup

Symptom:

Main process exited, code=exited, status=203/EXEC

This usually means systemd could not execute the binary. A common cause is a wrong SELinux label on `/usr/local/bin/alloy`.

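Example of a bad label, typical when the binary was moved with `mv` from root's home directory, because `mv` preserves the original label: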

admin_home_t

Check:

 ls -lZ /usr/local/bin/alloy
 

Fix:

 restorecon -v /usr/local/bin/alloy
 

Scrape fails with connect: permission denied

Symptom:

Get "http://127.0.0.1:8080/metrics": dial tcp 127.0.0.1:8080: connect: permission denied

If `curl` works from the shell but Alloy cannot connect, SELinux may be blocking the Alloy process.

Check denials:

 ausearch -m AVC -ts recent
 

If needed, generate and install a local policy:

 ausearch -m AVC -c alloy --raw | audit2allow -M alloy_local
 semodule -i alloy_local.pp
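
The denials that `ausearch` prints can be long. As a reading aid before generating any policy, the following minimal sketch condenses each AVC record into the denied permission, the source and target types, and the object class. The function name `avc_summary` is an assumption for illustration and is not part of any SELinux tool; it relies only on the `scontext=`, `tcontext=`, and `tclass=` fields that AVC records carry.

```shell
# Illustrative helper (not an SELinux tool): read audit records on stdin and
# print one line per denial: permissions, source type, target type, class.
avc_summary() {
  awk '
    /avc: +denied/ {
      perms = ""; src = ""; tgt = ""; cls = ""
      # The denied permissions are listed between "{" and "}".
      s = index($0, "{"); e = index($0, "}")
      if (s > 0 && e > s) {
        perms = substr($0, s + 1, e - s - 1)
        gsub(/^ +| +$/, "", perms)   # trim surrounding spaces
        gsub(/ +/, ",", perms)       # join multiple permissions with commas
      }
      for (i = 1; i <= NF; i++) {
        # A context is user:role:type:level, so the type is the third part.
        if ($i ~ /^scontext=/) { split($i, a, ":"); src = a[3] }
        if ($i ~ /^tcontext=/) { split($i, a, ":"); tgt = a[3] }
        if ($i ~ /^tclass=/)   { cls = substr($i, 8) }
      }
      print perms, src, tgt, cls
    }
  '
}
```

One possible use is `ausearch -m AVC -ts recent | avc_summary | sort | uniq -c`, which groups repeated denials so that the same rule is not reviewed many times.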
 

WAL permission denied

Symptom:

open data-alloy/prometheus.remote_write.hosting/wal/00000657: permission denied

This typically happens after switching Alloy from `root` to a restricted user while the WAL directory is still owned by `root`.

Immediate fix:

 chown -R alloy-user:alloy-user /etc/alloy/data-alloy
 chmod -R u+rwX /etc/alloy/data-alloy
 restorecon -Rv /etc/alloy/data-alloy
 

Also ensure config readability:

 chown root:alloy-user /etc/alloy
 chmod 750 /etc/alloy
 chmod 640 /etc/alloy/config.alloy
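
Before restarting the service, it can help to confirm that the ownership change reached every file in the WAL directory. The sketch below is one way to do that with `find`; the function name `not_owned_by` is an assumption for illustration.

```shell
# Illustrative helper: list everything under a directory that is NOT owned
# by the given user. Empty output means the ownership change was complete.
not_owned_by() {
  # $1 = user name or numeric UID, $2 = directory to scan
  find "$2" ! -user "$1" -print
}
```

For example, `not_owned_by alloy-user /etc/alloy/data-alloy` should print nothing once the `chown -R` above has been applied.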
 

Running Alloy as a Restricted User

Create the user:

 useradd --system --no-create-home --shell /sbin/nologin alloy-user
 id alloy-user
 

Set the service to:

 User=alloy-user
 Group=alloy-user
 

Ensure:

  • `/usr/local/bin/alloy` is still labeled `bin_t`
  • `/etc/alloy` is readable by `alloy-user`
  • the WAL and runtime directories are writable by `alloy-user`

Preferred runtime directory:

 mkdir -p /var/lib/alloy
 chown -R alloy-user:alloy-user /var/lib/alloy
 chmod 750 /var/lib/alloy
 restorecon -Rv /var/lib/alloy
 

Returning Alloy to root

Either set:

 User=root
 Group=root
 

or remove both lines entirely, since systemd defaults to `root`.

If runtime directories were reassigned, give them back to root if needed:

 chown -R root:root /etc/alloy/data-alloy
 chown -R root:root /var/lib/alloy
 

Then reload and restart:

 systemctl daemon-reload
 systemctl restart alloy.service
 systemctl status alloy.service
 

Loki and SELinux

Sending data to Loki

For the Loki endpoint, SELinux checks the outbound network connection from the Alloy process.

If Alloy is running in `unconfined_service_t`, SELinux is usually not the main blocker for outbound HTTPS to Loki.

Check the process context:

 ps -eZ | grep alloy
 

Example output:

system_u:system_r:unconfined_service_t:s0 539778 ? 00:00:16 alloy

Important part:

unconfined_service_t

This means the process is not tightly confined. In that case, failures to send to Loki are more likely to be caused by:

  • DNS problems
  • TLS or certificate issues
  • authentication problems
  • proxy or firewall rules
  • a wrong Loki URL
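
Several of these causes can be checked from the shell without touching SELinux. The sketch below extracts the host name from a push URL so that it can be fed to a resolver check. The helper name `url_host` and the variable `LOKI_URL` are assumptions for illustration, and the helper does not handle bracketed IPv6 literals.

```shell
# Illustrative helper: print the host part of a URL.
# The body runs in a subshell so that "rest" does not leak into the caller.
url_host() (
  rest=${1#*://}                # drop the scheme, e.g. "https://"
  rest=${rest%%/*}              # drop the path
  rest=${rest##*@}              # drop optional user:password@
  printf '%s\n' "${rest%%:*}"   # drop an optional :port
)
```

With `LOKI_URL` set to the endpoint from the Alloy configuration, `getent hosts "$(url_host "$LOKI_URL")"` tests DNS resolution, and `curl -sv -o /dev/null "$LOKI_URL"` shows the TLS handshake and the HTTP status returned by the endpoint or a proxy in front of it.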

Check for actual SELinux denials before changing policy:

 ausearch -m AVC -ts recent
 

If a denial shows `name_connect` (the permission checked when a process opens an outbound TCP connection), a local policy module may be required.

Reading plain log files for Loki

This is the most common SELinux issue when adding Loki.

Alloy must be able to:

  • traverse the parent directories
  • open the file
  • read the file

Check labels:

 ls -lZ /path/to/log-directory
 ls -lZ /path/to/logfile
 matchpathcon /path/to/logfile
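
Because every parent directory must be traversable, it is worth inspecting the whole chain and not only the file and its immediate directory. The following sketch lists all ancestor directories of an absolute path; the function name `list_ancestors` is an assumption for illustration.

```shell
# Illustrative helper: print every ancestor directory of an absolute path,
# from "/" down to the direct parent, one per line.
# The body runs in a subshell so that its variables do not leak.
list_ancestors() (
  p=$(dirname "$1")
  out=""
  while [ "$p" != "/" ] && [ "$p" != "." ]; do
    # Prepend, so that the final order is top-down.
    out="$p
$out"
    p=$(dirname "$p")
  done
  printf '/\n%s' "$out"
)
```

A possible use is `list_ancestors /path/to/logfile | while IFS= read -r dir; do ls -ldZ "$dir"; done`, which shows the mode, owner, and SELinux label of each directory on the way to the file.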
 

If the file should behave like a regular log file, `var_log_t` is a common type.

Quick test for one file:

 chcon -t var_log_t /path/to/logfile
 

Important:

  • `chcon` changes the current live label immediately
  • `chcon` does not record the change in policy, so it is not a persistent fix
  • `restorecon` or a full filesystem relabel will revert it

Preferred long-term fix for a custom log directory:

 semanage fcontext -a -t var_log_t '/path/to/log-directory(/.*)?'
 restorecon -Rv /path/to/log-directory
 

Reading the systemd journal

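If Alloy reads the journal, verify both OS permissions and SELinux access. With persistent storage the journal lives in `/var/log/journal`; with volatile storage it is in `/run/log/journal`.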

Check:

 id alloy-user
 ls -lZ /var/log/journal
 ausearch -m AVC -ts recent
 

If Alloy runs as a restricted user, journal group access may be needed:

 usermod -aG systemd-journal alloy-user
 

Restart the service after changing group membership.
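
A small check can confirm that the membership is in place before the restart. The sketch below tests whether a group appears in a space-separated group list such as the output of `id -nG`; the function name `has_group` is an assumption for illustration.

```shell
# Illustrative helper: succeed if group $2 appears in the space-separated
# group list $1 (for example the output of "id -nG alloy-user").
has_group() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *)        return 1 ;;
  esac
}
```

For example, `has_group "$(id -nG alloy-user)" systemd-journal && echo ok` prints `ok` once `usermod` has taken effect.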

Reading container logs

If logs come from paths such as `/var/lib/docker/containers` or `/var/log/containers`, SELinux may be stricter because container paths often use special SELinux types.

Check:

 ls -lZ /var/lib/docker/containers
 ls -lZ /var/log/containers
 ausearch -m AVC -ts recent
 

In such cases, a local policy module may be required.

What `ps -eZ | grep alloy` tells you

Example:

system_u:system_r:unconfined_service_t:s0 539778 ? 00:00:16 alloy

Meaning:

  • `system_u` = SELinux user
  • `system_r` = SELinux role
  • `unconfined_service_t` = SELinux type or process domain
  • `s0` = SELinux level

The most important field is the type. If Alloy runs as `unconfined_service_t`, SELinux is generally less restrictive for this process than for a tightly confined domain.
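
The field layout described above can be sketched as a small parser. The function name `context_field` is an assumption for illustration. Because the level may itself contain colons (for example `s0:c1,c2`), everything after the third colon is treated as the level.

```shell
# Illustrative helper: print one field of an SELinux context string.
# $1 = context (user:role:type:level), $2 = user | role | type | level
context_field() {
  case "$2" in
    user)  printf '%s\n' "$1" | cut -d: -f1 ;;
    role)  printf '%s\n' "$1" | cut -d: -f2 ;;
    type)  printf '%s\n' "$1" | cut -d: -f3 ;;
    level) printf '%s\n' "$1" | cut -d: -f4- ;;
    *)     echo "unknown field: $2" >&2; return 1 ;;
  esac
}
```

For example, `context_field "system_u:system_r:unconfined_service_t:s0" type` prints `unconfined_service_t`.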

Useful SELinux Commands

ls -lZ

Shows the SELinux label of a file or directory.

 ls -lZ /usr/local/bin/alloy
 

restorecon

Restores the default label according to policy.

 restorecon -v /usr/local/bin/alloy
 restorecon -Rv /var/lib/alloy
 

semanage

Defines persistent SELinux mappings, such as file contexts and port types.

 semanage fcontext -a -t bin_t '/usr/local/bin/alloy'
 semanage fcontext -a -t var_log_t '/path/to/log-directory(/.*)?'
 semanage port -l | grep 8080
 

semodule

Installs and lists policy modules.

 semodule -i alloy_local.pp
 semodule -l | grep alloy
 

ausearch

Searches the audit log for SELinux denials.

 ausearch -m AVC -ts recent
 ausearch -m AVC -c alloy --raw
 

getenforce and setenforce

Show or temporarily change SELinux mode.

 getenforce
 setenforce 0
 setenforce 1
 

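Use `setenforce 0` only as a temporary test: it puts the whole system into permissive mode until `setenforce 1` is run or the system reboots into its configured mode.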

matchpathcon

Shows the expected SELinux label for a path.

 matchpathcon /usr/local/bin/alloy
 matchpathcon /path/to/logfile
 

audit2allow

Builds a local policy module from recorded denials.

 ausearch -m AVC -c alloy --raw | audit2allow -M alloy_local
 semodule -i alloy_local.pp
 

Review generated policy before using it in production.

Recommended Troubleshooting Flow

  1. Verify SELinux mode:
    • `getenforce`
    • `sestatus`
  2. Verify the Alloy binary label:
    • `ls -lZ /usr/local/bin/alloy`
    • `matchpathcon /usr/local/bin/alloy`
  3. Fix labels if needed:
    • `restorecon -v /usr/local/bin/alloy`
    • or use `semanage fcontext` plus `restorecon`
  4. Check the running process context:
    • `ps -eZ | grep alloy`
  5. Reproduce the problem and inspect denials:
    • `ausearch -m AVC -ts recent`
  6. For Loki file collection, inspect the file and directory labels.
  7. For journald collection, check journal permissions and group membership.
  8. If required, generate and install a local policy module.
  9. Restart Alloy and verify logs.
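
The read-only steps of this flow can be collected into a single report. The following is a minimal sketch that uses only commands already shown in this article; the function names `run_step` and `alloy_selinux_report` are assumptions for illustration. It changes nothing on the system and keeps going when a tool is missing or a command fails.

```shell
# Illustrative helper: print a header, then run a command and show its
# output. A missing tool or a non-zero exit status is reported, not fatal.
run_step() (
  title=$1
  shift
  printf '== %s ==\n' "$title"
  if command -v "$1" >/dev/null 2>&1; then
    "$@" 2>&1 || printf '(exit status %s)\n' "$?"
  else
    printf '(command not found: %s)\n' "$1"
  fi
)

# Read-only checks from steps 1, 2, 4 and 5 of the flow above.
alloy_selinux_report() {
  run_step "SELinux mode"          getenforce
  run_step "Binary label"          ls -lZ /usr/local/bin/alloy
  run_step "Expected binary label" matchpathcon /usr/local/bin/alloy
  run_step "Process context"       sh -c 'ps -eZ | grep alloy'
  run_step "Recent denials"        ausearch -m AVC -ts recent
}
```

Running `alloy_selinux_report` as root right after reproducing the problem gives one block of output to compare against the expected values in this article.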

Final Recommendations

  • Keep `/usr/local/bin/alloy` labeled as `bin_t`.
  • Do not use `SELinuxContext=` in the systemd unit.
  • Prefer `/var/lib/alloy` for runtime state.
  • Use `127.0.0.1:8080` for a local cAdvisor target.
  • For Loki, first focus on SELinux access to the log source, not only the Loki endpoint.
  • Use `chcon` only for quick tests.
  • Use `semanage fcontext` plus `restorecon` for persistent label fixes.
  • Use `ausearch -m AVC -ts recent` before changing SELinux policy.