CoreDNS Security Hardening: Rebinding Protection, Plugin Configuration, and DNSSEC Forwarding
Problem
CoreDNS is the default DNS server in Kubernetes clusters, handling service discovery, pod name resolution, and external DNS forwarding. Every pod in the cluster sends DNS queries to CoreDNS. A misconfigured or compromised CoreDNS can redirect all cluster DNS traffic, enabling man-in-the-middle attacks on service-to-service communication.
Common security weaknesses:
- DNS rebinding via wildcard responses. CoreDNS’s default configuration forwards unknown queries to the host resolver. An attacker who controls an external domain can configure it to return internal cluster IP addresses in DNS responses, enabling a browser-based DNS rebinding attack that bypasses same-origin policy.
- No DNSSEC validation on forwarded queries. CoreDNS forwards external queries to upstream resolvers (typically
8.8.8.8or the node’s/etc/resolv.conf). Without DNSSEC validation, a forged DNS response from an on-path attacker can redirect cluster traffic to attacker-controlled endpoints. - Overpermissive Corefile plugins. The CoreDNS Corefile configures plugins. Plugins like
file(serve arbitrary zone files),auto(auto-load zone files from a directory), andtransfer(allow zone transfers) can be misconfigured to expose cluster DNS data to external parties. - Health and metrics endpoints exposed to pods. CoreDNS exposes
/healthand/readyHTTP endpoints plus Prometheus metrics on port 9153. Without network policy restrictions, any pod can query CoreDNS metrics, which reveal cluster service topology. - Excessive privileges for CoreDNS pods. CoreDNS pods in many clusters run with
NET_BIND_SERVICEcapability and no other restrictions — acceptable. But some distributions grant broader capabilities or disable seccomp, creating unnecessary attack surface. - No query logging or anomaly detection. Malware commonly uses DNS for command-and-control and data exfiltration (DNS tunnelling). Without query logging, exfiltration via CoreDNS is undetected.
Target systems: CoreDNS 1.11+ (Kubernetes 1.28+); Corefile plugin configuration; NodeLocal DNSCache for performance and isolation; NetworkPolicy for CoreDNS access control.
Threat Model
- Adversary 1 — DNS rebinding attack: A malicious pod resolves an attacker-controlled domain that returns a private IP address (e.g.,
10.96.0.1— the Kubernetes API server). The browser (or application) then sends HTTP requests to the Kubernetes API, bypassing same-origin policy. Without rebinding protection, CoreDNS returns private IPs for external domains. - Adversary 2 — DNS response poisoning: An on-path attacker between CoreDNS and the upstream resolver forges a DNS response, redirecting
api.external-service.comto an attacker IP. Without DNSSEC validation, CoreDNS caches and serves the poisoned response to all pods. - Adversary 3 — DNS tunnelling for data exfiltration: Malware on a compromised pod encodes exfiltrated data in DNS query subdomains (
exfildata.c2.attacker.com). CoreDNS forwards the query to the attacker’s authoritative DNS server, delivering the data. Without query monitoring, this is invisible. - Adversary 4 — CoreDNS zone transfer. The
transferplugin is misconfigured to allow transfers from any IP. An attacker performs a zone transfer and receives a complete map of all internal DNS names, revealing service topology. - Adversary 5 — Metric endpoint reconnaissance. Any pod queries CoreDNS metrics on port 9153 and receives a complete list of recently queried DNS names — revealing which external services the cluster depends on.
- Access level: Adversaries 1, 3, and 5 only need to run a pod in the cluster. Adversary 2 is on-path between CoreDNS and the upstream resolver. Adversary 4 needs network access to CoreDNS.
- Objective: Redirect service traffic, exfiltrate data, enumerate service topology, bypass network controls.
- Blast radius: Successful DNS poisoning for a high-value internal service affects every pod that queries that name — potentially all inter-service communication.
Configuration
Step 1: Corefile Hardening
# /etc/coredns/Corefile — hardened configuration.
# Cluster-internal DNS zone.
cluster.local:53 {
errors
health {
lameduck 5s
}
ready
kubernetes cluster.local in-addr.arpa ip6.arpa {
pods insecure # insecure: trust pod IP for PTR; use verified for stricter mode.
fallthrough in-addr.arpa ip6.arpa
ttl 30
}
prometheus :9153 # Metrics — restrict access via NetworkPolicy.
forward . /etc/resolv.conf {
max_concurrent 1000
}
cache 30 {
# Prevent negative caching poisoning.
denial 9984 30 # Cache NXDOMAIN for 30s max.
success 9984 30 # Cache successful responses for 30s max.
}
loop
reload
loadbalance
}
# External DNS forwarding with rebinding protection.
.:53 {
errors
# Rebinding protection: deny responses that return private IPs for external domains.
# The acl plugin blocks queries that would resolve to private ranges.
acl {
answer name .* ip 10.0.0.0/8 deny # Block private RFC1918 responses.
answer name .* ip 172.16.0.0/12 deny
answer name .* ip 192.168.0.0/16 deny
answer name .* ip 100.64.0.0/10 deny # CGNAT range.
allow
}
# Forward external queries to trusted resolvers with DNSSEC validation.
forward . tls://1.1.1.1 tls://1.0.0.1 {
tls_servername cloudflare-dns.com # Verify TLS certificate.
max_concurrent 1000
health_check 5s
expire 10s
}
cache 300 {
denial 9984 30
success 9984 300
}
# Log queries for DNS tunnelling detection.
log . {
class denial error
}
loop
loadbalance
}
Step 2: DNS-over-TLS for Upstream Forwarding
Plain UDP forwarding to upstream resolvers is unencrypted and susceptible to response forgery. Use DNS-over-TLS:
# CoreDNS forward with TLS to Cloudflare (1.1.1.1) and Google (8.8.8.8).
forward . tls://1.1.1.1 tls://8.8.8.8 {
tls_servername cloudflare-dns.com # Verify server certificate.
# This prevents on-path DNS response poisoning between CoreDNS and the resolver.
max_concurrent 1000
expire 30s
health_check 10s
}
# Apply via Kubernetes ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns
namespace: kube-system
data:
Corefile: |
.:53 {
forward . tls://1.1.1.1 tls://1.0.0.1 {
tls_servername cloudflare-dns.com
max_concurrent 1000
}
cache 300
errors
log
}
cluster.local:53 {
kubernetes cluster.local in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
}
cache 30
errors
}
Step 3: NetworkPolicy for CoreDNS
Restrict which pods can query CoreDNS and who can reach its metrics:
# Allow all pods to query CoreDNS on UDP/TCP 53.
# Restrict metrics (9153) to monitoring namespace only.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: coredns-access
namespace: kube-system
spec:
podSelector:
matchLabels:
k8s-app: kube-dns
policyTypes:
- Ingress
ingress:
# DNS queries from all pods (required for cluster function).
- ports:
- port: 53
protocol: UDP
- port: 53
protocol: TCP
# Metrics: only from monitoring namespace.
- from:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: monitoring
ports:
- port: 9153
protocol: TCP
# Health/ready probes: only from kube-system (kubelet).
- from:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: kube-system
ports:
- port: 8080
protocol: TCP
Step 4: CoreDNS Pod Security
# Harden the CoreDNS Deployment (patch via strategic merge).
apiVersion: apps/v1
kind: Deployment
metadata:
name: coredns
namespace: kube-system
spec:
template:
spec:
securityContext:
seccompProfile:
type: RuntimeDefault
containers:
- name: coredns
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
capabilities:
drop: ["ALL"]
add: ["NET_BIND_SERVICE"] # Required to bind port 53.
resources:
limits:
cpu: "200m"
memory: "170Mi"
requests:
cpu: "100m"
memory: "70Mi"
# Liveness probe using health endpoint.
livenessProbe:
httpGet:
path: /health
port: 8080
scheme: HTTP
initialDelaySeconds: 60
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 5
Step 5: NodeLocal DNSCache
NodeLocal DNSCache runs a DNS cache on every node, reducing CoreDNS load and improving query isolation — compromised pods on one node cannot observe DNS queries from pods on other nodes:
# Deploy NodeLocal DNSCache (DaemonSet).
# Each node runs a local cache on a link-local address (169.254.20.10).
# Pods query the local cache; cache queries CoreDNS for misses.
# Apply from the official manifest.
kubectl apply -f https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/dns/nodelocaldns/nodelocaldns.yaml
# Configure kubelet to use NodeLocal DNSCache.
# --cluster-dns=169.254.20.10 in kubelet config.
Step 6: Query Logging and DNS Tunnelling Detection
# Corefile: enable query logging for anomaly detection.
.:53 {
log . {
class all # Log all queries (high volume; consider sampling in large clusters).
}
# ...
}
# dns_tunnel_detector.py — detect DNS tunnelling patterns in CoreDNS logs.
import re
from collections import defaultdict
from datetime import datetime
# CoreDNS log format:
# [INFO] 10.244.0.5:12345 - 1234 "A IN exfil.c2.attacker.com. udp 45 false 512" NOERROR 0 0.001234s
LONG_SUBDOMAIN = re.compile(r'[a-f0-9]{20,}\.') # Long hex strings (common in tunnels).
def analyse_dns_log(log_line: str) -> bool:
"""Returns True if the query looks like DNS tunnelling."""
# Extract query name.
match = re.search(r'"[A-Z]+ IN ([^\s]+)', log_line)
if not match:
return False
qname = match.group(1).rstrip('.')
labels = qname.split('.')
# Indicators:
# 1. Very long subdomain label (data encoding).
if any(len(label) > 40 for label in labels):
return True
# 2. High-entropy subdomain (random-looking base32/hex).
if LONG_SUBDOMAIN.search(qname):
return True
# 3. Unusually high query count for same domain (frequency analysis).
# (Tracked separately in a counter.)
return False
Step 7: Restrict External DNS Plugins
Disable CoreDNS plugins not needed in production:
# Do NOT include these plugins unless explicitly required:
# - file: serves arbitrary zone files from disk (lateral movement if writable).
# - auto: auto-loads zone files from a directory (same risk).
# - transfer: allows zone transfers (topology disclosure).
# - rewrite: can redirect cluster DNS internally (misuse potential).
# - template: generates responses from Go templates (injection risk if misconfigured).
# Minimal safe plugin set for a standard Kubernetes cluster:
# errors, health, ready, kubernetes, prometheus, forward, cache, loop, reload, loadbalance
Step 8: Telemetry
coredns_dns_requests_total{server, zone, proto, type} counter
coredns_dns_responses_total{server, zone, rcode} counter
coredns_dns_request_duration_seconds{server, zone, type} histogram
coredns_forward_requests_total{to} counter
coredns_forward_healthcheck_failures_total{to} counter
coredns_cache_hits_total{server, type} counter
coredns_dns_do_requests_total{server} counter # DNSSEC queries
Alert on:
coredns_forward_healthcheck_failures_totalnon-zero — upstream resolver unreachable; external DNS resolution failing.coredns_dns_responses_total{rcode="SERVFAIL"}spike — CoreDNS failing to resolve; may indicate upstream issues or misconfiguration after a Corefile change.- Query log shows long hex-encoded subdomain labels — potential DNS tunnelling; investigate source pod.
- CoreDNS pod restart — unexpected restarts may indicate OOM from cache exhaustion or a Corefile syntax error after a reload.
coredns_dns_requests_totalsudden spike from a single source IP — DNS-based DDoS or rapid service discovery scanning.
Expected Behaviour
| Signal | Default CoreDNS | Hardened CoreDNS |
|---|---|---|
| DNS rebinding attack | Private IP returned for external domain | ACL plugin blocks private IPs in external responses |
| Upstream response poisoning | Unencrypted UDP; forged response accepted | DNS-over-TLS to upstream; MITM cannot forge responses |
| DNS tunnelling | Queries forwarded silently | Query logging enables detection; high-entropy domains flagged |
| Metrics exposure | Any pod reads cluster service topology | NetworkPolicy restricts metrics to monitoring namespace |
| Zone transfer | Transfer plugin allows read if misconfigured | Transfer plugin excluded from Corefile |
Trade-offs
| Aspect | Benefit | Cost | Mitigation |
|---|---|---|---|
| DNS-over-TLS upstream | Prevents response poisoning | Slightly higher latency (~10ms) than UDP; TLS handshake | NodeLocal DNSCache reduces TLS handshake frequency |
| ACL rebinding protection | Prevents private IP in external responses | May break split-horizon DNS where internal names resolve via external lookup | Explicitly allowlist internal resolver domains; use kubernetes plugin for cluster names |
| Full query logging | DNS tunnelling detection | High log volume in large clusters | Log only external forwarded queries; use sampling |
| NodeLocal DNSCache | Reduces cross-node DNS query observation | Additional DaemonSet to manage | Standard add-on; well-maintained |
Failure Modes
| Failure | Symptom | Detection | Recovery |
|---|---|---|---|
| Corefile syntax error after reload | CoreDNS crashes or refuses to reload | Pod restarts; DNS resolution fails cluster-wide | coredns -conf /etc/coredns/Corefile -plugins to validate before applying; roll back ConfigMap |
| Upstream TLS resolver unreachable | External DNS fails; internal still works | coredns_forward_healthcheck_failures_total; external service failures |
Add fallback resolver: forward . tls://1.1.1.1 tls://8.8.8.8 (multiple upstreams) |
| ACL blocks legitimate split-horizon | Internal service queried via external name fails | Service discovery failures for specific domains | Add except clause for internal domains in ACL; use CoreDNS file plugin for internal zones |
| NodeLocal DNSCache DaemonSet OOM | DNS fails on specific nodes | Node-level DNS failure; pod scheduling on that node fails DNS | Increase memory limits for NodeLocal DNSCache DaemonSet |