SELinux Without Fear: Custom Policies for Critical Services
From audit2allow forensics to versioned policy modules running in production, without falling into permanent permissive mode.
Every time a service breaks on RHEL or Rocky, the on-call reflex is the same: setenforce 0, problem solved, ticket closed. Six months later the entire cluster runs in permissive mode, nobody remembers why, and the compliance report becomes science fiction. The Basilisk OffSec team has spent the last two years taking down environments like that in authorized red team engagements, and the conclusion is blunt: with SELinux disabled, the path from a PHP-FPM RCE to root takes under 90 seconds. This article does not sell magic. It shows the workflow we use to ship versioned .pp modules to production without breaking deploys.
Before writing any policy you have to read what already exists. The seinfo -t command lists roughly 5,000 types on stock RHEL 9, and sesearch --allow -s httpd_t shows exactly what Apache can touch. We start every investigation with ausearch -m AVC -ts recent while the service runs with only its own domain in temporary permissive mode, never the whole system. Use semanage permissive -a httpd_t to isolate the domain under investigation, and semanage permissive -d httpd_t to return it to enforcing once the AVCs are captured. That pattern has saved entire audits, as we describe in Linux Server Hardening: Applying CIS Benchmark Without Breaking Production, where CIS demands global enforcing but allows documented exceptions per specific domain.
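The per-domain investigation above can be sketched as a small shell wrapper. This is a minimal sketch, not our production tooling: the names run, investigate_domain and the DRY_RUN variable are hypothetical, and the reproduction command is whatever triggers the failure in your service.

```shell
# run: print the command when DRY_RUN is set, otherwise execute it.
# (Hypothetical helper; printing loses the original quoting.)
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# investigate_domain DOMAIN [COMMAND...]
# Puts only DOMAIN in permissive mode, runs the optional reproduction
# command, dumps recent AVCs, then returns DOMAIN to enforcing.
investigate_domain() {
    inv_domain=$1
    shift
    run semanage permissive -a "$inv_domain" || return 1
    if [ "$#" -gt 0 ]; then
        run "$@"    # reproduce the failure, e.g. restart the service
    fi
    # ausearch exits non-zero when nothing matches; that is not an error here.
    run ausearch -m AVC -ts recent || true
    # Remove the permissive exception whatever the steps above returned.
    run semanage permissive -d "$inv_domain"
}
```

Calling it with DRY_RUN=1 first, for example investigate_domain httpd_t systemctl restart httpd, prints the four commands without touching the host, which makes the sequence easy to review before running it as root.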
audit2allow is a double-edged sword. Running ausearch -m AVC | audit2allow -M mymodule generates a mymodule.te, plus a compiled mymodule.pp, that loads and works, but it frequently grants absurd permissions like allow httpd_t shadow_t:file read. Our internal checklist requires every .te to go through manual review before semodule -i. Hunt for rules touching shadow_t, etc_t, kernel_t or self:capability sys_admin: those are red flags. In a recent investigation, partially described in DFIR on Linux: Live Triage with UAC and Velociraptor, a blindly generated module had opened access to /etc/sudoers and nobody noticed for eight months.
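Part of that review can be automated as a first pass. The sketch below assumes the one-rule-per-line allow statements that audit2allow emits; the function names flag_risky_rules and review_gate are hypothetical, and the pattern list only covers the red flags named above. It does not replace the manual review, and it will not catch the same access granted through a refpolicy interface.

```shell
# flag_risky_rules FILE.te
# Prints, with line numbers, every allow rule that mentions one of the
# red-flag types or the sys_admin capability as a whole word.
# Exit status is 0 when at least one rule is flagged.
flag_risky_rules() {
    grep -nE '^[[:space:]]*allow[[:space:]]' "$1" |
        grep -wE 'shadow_t|etc_t|kernel_t|sys_admin'
}

# review_gate FILE.te
# Returns non-zero when flagged rules exist, so a CI job can block the
# merge until someone justifies each rule.
review_gate() {
    if gate_hits=$(flag_risky_rules "$1"); then
        printf 'needs manual justification:\n%s\n' "$gate_hits" >&2
        return 1
    fi
}
```

A pipeline step such as review_gate mymodule.te fails the build when a flagged rule is present and prints the offending lines to stderr.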
For new services we prefer writing policy from scratch using the refpolicy macro language. A typical module has three files: myservice.te with the rules, myservice.fc with file contexts and myservice.if with interfaces for other domains. Running make -f /usr/share/selinux/devel/Makefile builds the .pp, which you install with semodule -i. We version those three files in git alongside our Ansible code, and every PR runs checkmodule -M -m and sediff against the baseline. That process connects to what we show in Supply Chain Security: Sigstore Signing and Real SBOMs in CI/CD, where the .pp ends up signed with Sigstore before reaching the nodes.
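To make the three-file layout concrete, here is a minimal scaffold written as shell functions. It is a sketch under assumptions: scaffold_module and build_module are hypothetical names, the binary path /usr/local/bin/NAME is a placeholder, and the skeleton only declares a domain and an entry point type with the refpolicy init_daemon_domain interface. A real module needs rules for everything the service actually does.

```shell
# scaffold_module DIR NAME
# Writes skeleton NAME.te, NAME.fc and NAME.if files into DIR.
scaffold_module() {
    mod_dir=$1
    mod_name=$2
    mkdir -p "$mod_dir" || return 1

    # Type enforcement: declare the domain and its entry point type, and
    # let init start the executable in that domain.
    cat > "$mod_dir/$mod_name.te" <<EOF
policy_module($mod_name, 1.0.0)

type ${mod_name}_t;
type ${mod_name}_exec_t;
init_daemon_domain(${mod_name}_t, ${mod_name}_exec_t)
EOF

    # File contexts: label the (placeholder) binary as the entry point.
    cat > "$mod_dir/$mod_name.fc" <<EOF
/usr/local/bin/$mod_name    --    gen_context(system_u:object_r:${mod_name}_exec_t,s0)
EOF

    # Interfaces for other domains: only the summary header for now.
    cat > "$mod_dir/$mod_name.if" <<EOF
## <summary>Policy for $mod_name.</summary>
EOF
}

# build_module DIR NAME
# Compiles DIR/NAME.pp with the devel Makefile mentioned above.
build_module() {
    make -C "$1" -f /usr/share/selinux/devel/Makefile "$2.pp"
}
```

Running scaffold_module ./policy myservice followed by build_module ./policy myservice should leave a myservice.pp next to the sources. Because this skeleton uses refpolicy macros, it is built through the devel Makefile, which expands them, rather than being passed to checkmodule directly.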
Services opening sockets on non-standard ports are the most common case of silent breakage. Postgres on 5433, for example, needs semanage port -a -t postgresql_port_t -p tcp 5433, not a new policy. Nginx serving files outside /var/www needs semanage fcontext -a -t httpd_sys_content_t "/srv/app(/.*)?" followed by restorecon -Rv /srv/app. Eighty percent of the cases we see are file context and port labeling, not new policy. The team that understands that distinction saves hours, and the subject comes back when we discuss internal pivoting in Advanced Nmap: NSE Scripts for Internal Recon in a Simulated Corporate Lab.
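Since labeling fixes are this common, it can help to keep them in a small versioned manifest and render the commands from it. The manifest format below (kind, type, target, optional protocol, separated by whitespace) is an assumption made for this sketch, as is the function name render_labels. The function only prints commands, so the output can be reviewed before anything runs.

```shell
# render_labels: read a label manifest on stdin, print semanage/restorecon
# commands on stdout. Manifest lines (hypothetical format):
#   port  SELINUX_TYPE  PORT  PROTO
#   path  SELINUX_TYPE  DIRECTORY
# Blank lines and lines starting with # are ignored.
render_labels() {
    while read -r kind setype target proto; do
        case $kind in
            ''|'#'*)
                ;;
            port)
                printf 'semanage port -a -t %s -p %s %s\n' \
                    "$setype" "$proto" "$target"
                ;;
            path)
                # Label the directory and everything beneath it, then
                # apply the new context to files that already exist.
                printf 'semanage fcontext -a -t %s "%s(/.*)?"\n' \
                    "$setype" "$target"
                printf 'restorecon -Rv %s\n' "$target"
                ;;
            *)
                printf 'unknown manifest entry: %s\n' "$kind" >&2
                return 1
                ;;
        esac
    done
}
```

A manifest with the two lines port postgresql_port_t 5433 tcp and path httpd_sys_content_t /srv/app renders the same three commands quoted in the paragraph above.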
Production maintenance demands observability. We configure setroubleshoot-server in silent mode forwarding AVCs to the SIEM via journald, with Sigma rules tuned for unexpected denials. When a deploy breaks, the alert arrives before the user complains. We also run sealert -a /var/log/audit/audit.log weekly in staging to catch policy drift. That continuous feedback loop is exactly what we defend in Purple Team in Practice: Building a Red vs Blue Feedback Loop: red team finds the path, blue team writes the policy, and the policy goes into the pipeline.
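For a quick look at drift without the full sealert report, denials can be counted per source domain and object class straight from the audit log. This is a rough sketch, not the Sigma rules mentioned above: summarize_avc is a hypothetical name, and it assumes the standard AVC record layout with scontext= and tclass= fields.

```shell
# summarize_avc [FILE...]
# Counts AVC denials grouped by source domain and target class, most
# frequent first. Reads stdin when no file is given.
summarize_avc() {
    awk '
        /avc:/ && /denied/ {
            src = ""; cls = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^scontext=/) {
                    # scontext=user:role:type:level -> keep the type
                    split($i, ctx, ":")
                    src = ctx[3]
                }
                if ($i ~ /^tclass=/) {
                    cls = substr($i, 8)
                }
            }
            if (src != "") count[src " " cls]++
        }
        END {
            for (key in count) print count[key], key
        }
    ' "$@" | sort -rn
}
```

Running summarize_avc /var/log/audit/audit.log in staging after each deploy and comparing the result with the previous run is one cheap way to notice a domain that suddenly starts generating denials.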
Practical takeaway: never run setenforce 0 in production for more than 15 minutes. Use semanage permissive -a domain_t to isolate the problem, capture AVCs with ausearch, review audit2allow output manually, commit the .te to git, sign the .pp and deploy through Ansible. If you cannot justify every allow rule in code review, the policy is not ready. SELinux is not an obstacle; it is the last perimeter after the attacker is already inside.