Structured Network ACL Design and Management
The Problem
Firewall policy has a natural entropy gradient. When the organization is small, someone adds a rule to allow the new SaaS integration, tickets a team member to open port 5432 from the staging subnet, and leaves a comment that says “temporary — remove after test.” Three years later that rule is still there. No one knows what broke when someone removed a similar rule in 2023, so they stopped removing rules. The ruleset grows to 400 entries. The last engineer who understood it left the company.
This is ACL sprawl, and it is the norm rather than the exception in environments more than two years old. The concrete security consequences:
- Implicit permit-any rules buried mid-list that allow broad traffic nobody intended to permit long-term.
- Shadowed rules that are syntactically valid but logically unreachable because an earlier, broader rule already matches the same traffic. The intent of the shadowed rule — perhaps a deny — is silently bypassed.
- Abandoned access paths from decommissioned systems: rules allowing traffic from IPs that no longer exist, or to ports on services that were retired. An attacker who gains that IP via DHCP or reassignment inherits the access.
- Missing return-traffic rules in stateless ACL environments that produce asymmetric routing or partial connectivity, leading to compensating rules that are broader than necessary.
- Compliance failure: when an auditor asks “show me the business justification for this permit rule,” the answer is silence, and the finding is a gap.
The fix is not a one-time cleanup. It is a structural approach to ACL design — one where rules are owned, documented, ordered, automated, and audited continuously.
Design Principles
Deny by Default
Every zone boundary starts with an implicit deny-all. Only explicitly permitted traffic flows. This is the baseline. Every router ACL and stateful firewall should end with an explicit deny all or drop rule (not just rely on an implicit one at the platform level) so the deny is visible, auditable, and logged.
Document the Business Justification
Every permit rule must answer three questions:
- What is the source (system, owner, subnet)?
- What is the destination (system, owner, port/protocol)?
- Why does this access exist (change ticket, business process)?
This information lives in the rule comment or in the IaC commit message that created it. A rule with no answer to these three questions is a candidate for removal.
Owner Assignment
Every permit rule has a named owner: a team, a service, or an individual. Ownership triggers review obligations. When a service is decommissioned, its owner must revoke all rules associated with it. When an owner leaves, rules are transferred or reviewed.
Time-Bound Temporary Rules
“Temporary” rules must not rely on human memory for removal. Every temporary rule is created with an automatic expiry mechanism — a systemd timer, a scheduler job, or a pipeline step. If the access is still needed at expiry, it must be re-justified and re-created, not left in place.
Stateful vs Stateless ACLs
Stateful (nftables/iptables conntrack)
Modern host and container firewalls use connection tracking. A stateful firewall remembers the state of a connection — SYN sent, established, related — and allows return traffic automatically without explicit rules.
# nftables stateful policy — return traffic handled by conntrack.
nft add table inet filter
nft add chain inet filter input { type filter hook input priority 0 \; policy drop \; }
# Accept established and related connections before any other check.
nft add rule inet filter input ct state established,related accept
# Accept new inbound on port 443 from any source.
nft add rule inet filter input tcp dport 443 ct state new accept
# Drop everything else (explicit).
nft add rule inet filter input drop
The ct state established,related accept rule at the top of the chain means the administrator writes only rules for new connections. Return packets are automatically accepted. This drastically reduces rule count and eliminates a class of asymmetric-policy bugs.
Stateless (Router ACLs, eBPF XDP, some hardware platforms)
Router ACLs on Cisco IOS, Juniper, and some hardware platforms are stateless: each packet is evaluated independently. A permit for outbound TCP 443 does NOT automatically permit the inbound SYN-ACK response. Return traffic rules must be explicit, and this is where mistakes happen.
Safe stateless return-traffic pattern: permit established TCP (ACK bit set) inbound rather than opening the full port bidirectionally:
! Cisco IOS ACL — stateless, requires explicit return-traffic rules.
ip access-list extended OUTBOUND_INTERNET
permit tcp 10.10.0.0 0.0.255.255 any eq 443 ! outbound HTTPS
permit tcp 10.10.0.0 0.0.255.255 any eq 80
deny ip any any log
ip access-list extended INBOUND_INTERNET
permit tcp any 10.10.0.0 0.0.255.255 established ! return traffic only (ACK set)
deny ip any any log
The established keyword matches packets with the ACK or RST flag set — valid return traffic from connections the internal host originated. It does not permit new inbound connections, which is the correct behaviour.
When stateless is necessary: hardware-accelerated forwarding paths (TCAM-based line cards), XDP programs that run before the kernel network stack, and environments where stateful tracking is not available at the required throughput. In these cases, the stateless pattern above is the minimum safe approach.
Zone-Based Firewall Model
Applying per-host ACLs at scale is operationally unsustainable. The zone-based model assigns hosts to trust zones and defines policy between zones. Rules are written at the zone level, not between individual IPs.
Standard zones for most environments:
| Zone | Trust | Examples |
|---|---|---|
| internet | 0 | External clients, CDN, SaaS integrations |
| dmz | 1 | Web servers, API gateways, reverse proxies |
| app | 2 | Backend application servers |
| data | 3 | Databases, caches, message queues |
| management | 4 | Bastion hosts, monitoring collectors, log aggregators |
| internal | 2 | Workstations, internal tools |
Traffic flows only along explicitly permitted zone-to-zone paths:
# zone-policy.yaml — defines permitted zone-to-zone flows.
# All unlisted combinations are implicitly denied.
zone_policies:
- from: internet
to: dmz
permit:
- proto: tcp
ports: [80, 443]
comment: "TICKET-1201: Public web and API traffic"
- from: dmz
to: app
permit:
- proto: tcp
ports: [8080, 8443]
comment: "TICKET-1201: Reverse proxy to app servers"
- from: app
to: data
permit:
- proto: tcp
ports: [5432]
comment: "TICKET-1205: App servers to PostgreSQL"
- proto: tcp
ports: [6379]
comment: "TICKET-1206: App servers to Redis"
- from: management
to: "*"
permit:
- proto: tcp
ports: [22]
comment: "TICKET-1100: SSH access from management zone only"
- proto: tcp
ports: [9090, 9100]
comment: "TICKET-1102: Prometheus scrape"
The audit benefit is significant. When a pen tester asks “can the DMZ reach the data tier directly?”, the answer is deterministic from the zone policy file — no zone-to-zone entry for dmz→data exists, so the answer is no, and you can point to the file.
Naming Conventions and Rule Metadata
Readable, consistent rule names are the first line of defense against sprawl. A rule named rule_47 tells an auditor nothing. A rule named PERMIT_WEBAPP_TO_POSTGRES_5432 is self-documenting.
Naming pattern: {ACTION}_{SOURCE_ZONE_OR_GROUP}_{DEST_ZONE_OR_GROUP}_{PORT_OR_SERVICE}
Examples:
PERMIT_DMZ_TO_APP_8443DENY_INTERNET_TO_DATA_ALLPERMIT_MGMT_TO_ALL_SSHPERMIT_MONITORING_TO_NODEEXPORTER_9100
nftables supports named sets and comments directly in the ruleset:
# Define named sets for object groups.
nft add set inet filter WEBAPP_SERVERS { type ipv4_addr \; elements = { 10.10.2.10, 10.10.2.11, 10.10.2.12 } \; }
nft add set inet filter POSTGRES_HOSTS { type ipv4_addr \; elements = { 10.10.3.20 } \; }
# Named rule with comment embedding the ticket number.
nft add rule inet filter forward \
ip saddr @WEBAPP_SERVERS \
ip daddr @POSTGRES_HOSTS \
tcp dport 5432 \
ct state new \
comment \"TICKET-1205: webapp to postgres\" \
accept
Object groups (named sets) reduce rule count dramatically: instead of one rule per source IP, a single rule references a set. Adding a new web server means updating the set definition, not adding a new firewall rule. Fewer rules means fewer auditing opportunities for errors to hide.
Rule Ordering and Performance
In stateless ACLs evaluated top-to-bottom, rule ordering is a performance and correctness concern:
-
Most-frequently-matched rules first. High-volume flows (established return traffic, common service ports) should be at the top of the list. In hardware TCAM, all rules evaluate in parallel so this matters less, but in software ACL evaluation it directly affects throughput.
-
More-specific rules before less-specific rules. A
/32host rule must appear before a/24subnet rule covering the same host, otherwise the specific rule is shadowed. -
Explicit deny before the implicit deny. The explicit deny at the end is logged; the implicit deny is not. You want to see traffic hitting the deny rule in your logs.
# Correct ordering in nftables: specific before general, established first.
nft add rule inet filter input position 0 ct state established,related accept # Always first.
nft add rule inet filter input ip saddr 192.0.2.5 tcp dport 22 accept # Specific host SSH.
nft add rule inet filter input ip saddr 10.0.0.0/8 tcp dport 22 drop # Block rest of /8.
nft add rule inet filter input tcp dport 443 accept # General HTTPS permit.
nft add rule inet filter input drop # Explicit final deny.
Automated ACL Auditing
Manual review of 400 firewall rules finds nothing. Automated auditing finds shadowed rules, permit-any entries, and rules without comments every time it runs.
Finding Shadowed Rules
A rule is shadowed when an earlier rule already matches all traffic the later rule would match, making the later rule unreachable. This is critical when the shadowed rule is a deny being overridden by a broader permit.
#!/usr/bin/env python3
# shadow-check.py — detect shadowed rules in a flat ACL list.
# Simplified for illustration; production versions handle protocol/mask logic.
import ipaddress
def parse_rule(line):
"""Parse 'permit/deny proto src dst dport' format."""
parts = line.strip().split()
if len(parts) < 5:
return None
return {
"action": parts[0],
"proto": parts[1],
"src": ipaddress.ip_network(parts[2], strict=False),
"dst": ipaddress.ip_network(parts[3], strict=False),
"dport": parts[4],
}
def is_subset(candidate, broader):
"""Return True if candidate is fully covered by broader rule."""
return (
broader["proto"] in (candidate["proto"], "any")
and candidate["src"].subnet_of(broader["src"])
and candidate["dst"].subnet_of(broader["dst"])
and broader["dport"] in (candidate["dport"], "any")
)
rules = []
with open("acl.txt") as f:
for line in f:
r = parse_rule(line)
if r:
rules.append((line.strip(), r))
for i, (line_i, rule_i) in enumerate(rules):
for line_j, rule_j in rules[:i]:
if is_subset(rule_i, rule_j):
print(f"SHADOWED: '{line_i}' is covered by earlier rule '{line_j}'")
Finding Permit-Any Rules
# nftables: dump ruleset and search for overly broad permits.
nft list ruleset | grep -E 'accept' | grep -v 'ct state established' | grep -v 'dport'
# iptables: find rules with no dport restriction that accept.
iptables -L -n -v | awk '/ACCEPT/ && !/dpt:/ && !/state/'
Commercial and Open-Source Tools
- fwaudit (open source): parses iptables/nftables rulesets, identifies shadowed rules, redundant rules, and permit-any entries.
- Tufin SecureTrack / Firewall Analyzer (ManageEngine): commercial platforms with change tracking, compliance mapping, and rule-use statistics. Rule-use statistics identify rules that have matched zero packets in 90 days — candidates for removal.
- algosec FireFlow: workflow-based change management with automated policy analysis.
Infrastructure as Code for ACL Management
ACL changes that bypass version control are the primary source of configuration drift. Every firewall change must go through a code review, exactly like application code.
nftables via Ansible
# ansible/roles/firewall/tasks/main.yml
---
- name: Deploy nftables configuration
template:
src: nftables.conf.j2
dest: /etc/nftables.conf
owner: root
group: root
mode: "0640"
notify: reload nftables
- name: Validate nftables config before apply
command: nft -c -f /etc/nftables.conf
changed_when: false
check_mode: no
{# ansible/roles/firewall/templates/nftables.conf.j2 #}
#!/usr/sbin/nft -f
# Managed by Ansible — do not edit manually.
# Last deployed: {{ ansible_date_time.iso8601 }}
flush ruleset
table inet filter {
set WEBAPP_SERVERS {
type ipv4_addr
elements = { {% for host in webapp_servers %}{{ host }}{% if not loop.last %}, {% endif %}{% endfor %} }
}
chain input {
type filter hook input priority 0; policy drop;
ct state established,related accept comment "Return traffic"
ct state invalid drop comment "Drop invalid conntrack state"
{% for rule in firewall_rules.inbound %}
{{ rule.proto }} dport {{ rule.port }} {{ rule.action }} comment "{{ rule.ticket }}: {{ rule.description }}"
{% endfor %}
drop
}
chain forward {
type filter hook forward priority 0; policy drop;
ct state established,related accept
{% for rule in firewall_rules.forward %}
ip saddr {{ rule.src }} ip daddr {{ rule.dst }} {{ rule.proto }} dport {{ rule.port }} {{ rule.action }} comment "{{ rule.ticket }}: {{ rule.description }}"
{% endfor %}
drop
}
}
GitOps Pipeline for Firewall Policy
Every ACL change goes through a pull request:
- Engineer creates PR with rule change in
firewall_rules.yml - CI pipeline runs
nft -c -fsyntax validation and the shadow-check script - CI runs policy compliance check against the CIS benchmark rules
- Security team reviews the PR; ticket number in the commit message links to the change request
- Merge triggers Ansible apply via the CD pipeline
- Post-apply, the pipeline runs a connectivity test to verify intended paths and runs
nft list rulesetto confirm the deployed config matches the rendered template
# .github/workflows/firewall-check.yml
name: Firewall Policy Validation
on:
pull_request:
paths:
- 'ansible/roles/firewall/**'
jobs:
validate:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Render nftables config
run: ansible -i localhost, -c local -m template ...
- name: Syntax check
run: nft -c -f rendered/nftables.conf
- name: Shadow rule check
run: python3 scripts/shadow-check.py rendered/acl-flat.txt
- name: Check for permit-any rules
run: |
if grep -E 'accept$' rendered/nftables.conf | grep -v 'established\|dport\|icmp'; then
echo "ERROR: Unrestricted accept rule found"
exit 1
fi
Temporary Access Management
Temporary rules require a removal mechanism at creation time. Two patterns work in practice:
systemd Timer for Rule Expiry
# Create a temporary rule and schedule its removal.
RULE_ID="temp-$(date +%s)"
EXPIRY="2026-05-14 18:00:00"
# Add the rule with a named handle for later removal.
nft add rule inet filter input \
ip saddr 203.0.113.45 tcp dport 22 accept \
comment "TICKET-9901: contractor SSH access, expires ${EXPIRY}"
# Capture the rule handle.
HANDLE=$(nft -a list chain inet filter input | grep "TICKET-9901" | awk '{print $NF}')
# Create a one-shot systemd timer to remove the rule.
cat > /etc/systemd/system/remove-${RULE_ID}.service <<EOF
[Unit]
Description=Remove temporary firewall rule TICKET-9901
[Service]
Type=oneshot
ExecStart=/usr/sbin/nft delete rule inet filter input handle ${HANDLE}
ExecStartPost=/bin/rm /etc/systemd/system/remove-${RULE_ID}.service
ExecStartPost=/bin/rm /etc/systemd/system/remove-${RULE_ID}.timer
EOF
systemd-run --on-calendar="${EXPIRY}" \
--unit="remove-${RULE_ID}" \
/usr/sbin/nft delete rule inet filter input handle ${HANDLE}
Pipeline-Enforced Expiry via IaC
In the IaC model, temporary rules carry an expires field. A daily pipeline job removes rules whose expiry date has passed:
# firewall_rules.yml
forward:
- src: 10.10.0.0/24
dst: 10.10.3.20
proto: tcp
port: 5432
action: accept
ticket: TICKET-1205
description: "webapp to postgres"
owner: platform-team
expires: null # Permanent.
- src: 203.0.113.45/32
dst: 10.10.1.5
proto: tcp
port: 22
action: accept
ticket: TICKET-9901
description: "contractor SSH — penetration test engagement"
owner: security-team
expires: "2026-05-14" # Pipeline removes this after this date.
#!/usr/bin/env python3
# check-expiry.py — run in CI daily; fails if expired rules exist.
import yaml
from datetime import date
with open("firewall_rules.yml") as f:
rules = yaml.safe_load(f)
today = date.today()
expired = []
for rule in rules.get("forward", []) + rules.get("inbound", []):
if rule.get("expires") and date.fromisoformat(rule["expires"]) < today:
expired.append(rule)
if expired:
for r in expired:
print(f"EXPIRED RULE: {r['ticket']} — {r['description']} (expired {r['expires']})")
raise SystemExit(1)
Compliance Audit Readiness
CIS Firewall Benchmarks
The CIS Benchmarks for firewall policy include requirements directly mapped to the practices above:
| CIS Control | Practice |
|---|---|
| Ensure default deny policy | Explicit drop at end of every chain |
| Ensure rules have descriptions | Comments with ticket numbers in every rule |
| Ensure no permit-any rules | Automated check in CI pipeline |
| Ensure firewall changes are change-controlled | GitOps PR-based workflow |
| Ensure logging on deny rules | log statement on final drop rule |
# Add logging to the final deny rule.
nft add rule inet filter input \
limit rate 10/minute \
log prefix "nft:input:drop: " flags all \
drop
The rate limit on the log statement prevents log flooding from port scans.
Generating Audit Evidence
# Export current ruleset as point-in-time evidence.
DATE=$(date +%Y%m%d)
nft list ruleset > /var/log/firewall-audit/ruleset-${DATE}.txt
git -C /etc/firewall log --oneline --since="90 days ago" > /var/log/firewall-audit/changes-${DATE}.txt
# Cross-reference deployed rules against the IaC source.
ansible -i inventory/ firewall-hosts -m command \
-a "nft list ruleset" > /var/log/firewall-audit/deployed-${DATE}.txt
diff <(nft list ruleset) <(ansible-playbook render-only.yml --check 2>&1 | grep -A1 "nftables") \
&& echo "PASS: deployed config matches IaC source" \
|| echo "FAIL: drift detected — investigate before audit"
For PCI-DSS Section 1 and SOC 2 CC6.6 requirements, the combination of:
- Version-controlled rule history with PR comments linking to tickets
- Automated expiry of temporary rules
- Owner field on every rule
- Daily shadow-rule and permit-any checks in CI
provides direct evidence that firewall policy is actively managed, not just deployed and forgotten.
Implementation Checklist
- [ ] Define security zones; document trust levels and permitted zone-to-zone flows
- [ ] Audit existing ruleset: run shadow-check, find permit-any, identify rules older than one year with no ticket reference
- [ ] Add owner and ticket fields to all surviving rules; schedule review for any without justification
- [ ] Migrate ACL definitions into version control; establish PR-based change workflow
- [ ] Add syntax validation and shadow-check to CI pipeline
- [ ] Convert temporary rules to time-bounded entries with expiry pipeline enforcement
- [ ] Enable logging on the final deny rule with rate limiting
- [ ] Schedule quarterly rule-use audits: any rule with zero matches in 90 days is a removal candidate
- [ ] Map ruleset to applicable CIS controls and generate evidence package for next audit
ACL sprawl is the result of treating firewall rules as operational state rather than configuration code. The practices here — zone models, IaC, automated auditing, and expiry enforcement — make firewall policy subject to the same engineering discipline as the software it protects.