newspaint

Documenting Problems That Were Difficult To Find The Answer To

Hairpin for LXC Containers Using IPTables

Only recently, when getting involved with Kubernetes, did I discover the term “hairpinning”. It describes something I’d always wanted to do but did not know how.

Let’s say you have a host with two LXC containers running on it: one is your e-mail server, the other is a web server:

[Figure: Example Network That Needs Hairpinning]

You might have an IPTables configuration file like the following (if you’re using Ubuntu and have the netfilter-persistent and iptables-persistent packages installed):

###############################################################################
# iptables IPv4 rules for reload on start-up
#
# To Reload:   
#   service netfilter-persistent restart
#
# To Test:
#   iptables-restore -t </etc/iptables/rules.v4
###############################################################################

*nat
:PREROUTING ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]

# predefine chains
:nat_incoming_tcp - [0:0]

# redirect incoming TCP traffic to chain to NAT to LXC container
-A PREROUTING -i eth+ -p tcp -j nat_incoming_tcp

# NAT for LXC container "email" at 10.0.5.14
-A nat_incoming_tcp -p tcp --dport 25 -j DNAT --to-destination 10.0.5.14:25 -m comment --comment "SMTP"

# NAT for LXC container "web" at 10.0.5.15
-A nat_incoming_tcp -p tcp --dport 80 -j DNAT --to-destination 10.0.5.15:80 -m comment --comment "HTTP"

COMMIT

So far so good – if traffic from the Internet hits your host on 203.0.113.44 port 25 it will be DNAT’d (destination NAT’d) to the email LXC container. And if traffic from the Internet hits your host on 203.0.113.44 port 80 it will be DNAT’d to the web container.

Now, what happens if your web server wants to send an e-mail to you? You could program your web server to send all e-mails to 10.0.5.14, the IP address of the email LXC container.

However, what if your web server wants to send you an e-mail at your external Internet IP address (203.0.113.44)? It might want to do this because it automatically looked up your domain name and was told to contact your external IP address. It’s a problem. The container sends the packet to its default gateway, the host, over the lxcbr0 interface. The PREROUTING rule above only matches packets arriving on eth+, so this packet is never DNAT’d to the email container. Depending on the host’s routing and filtering it might be handed to the host itself, sent out towards the Internet by the server’s default gateway (eth0), or gobbled as a Martian packet. In none of these cases does it reach the mail server.

The answer is “hairpinning”. The following diagram illustrates the packet flow from one LXC container to another via the external IP address:

[Figure: Using Hairpin to DNAT Internal Packet to External IP]

By adding some IPTables rules to the NAT table we can ensure that packets from the containers destined for the external IP address are not only DNAT’d but also masqueraded, so that replies follow a healthy path. Note that masquerading can only occur at the POSTROUTING step, and by then DNAT has already replaced the external destination address. So we have to mark the packets coming in from lxcbr0 destined for the external IP address in the PREROUTING step, and match on that mark in POSTROUTING.

You might ask: why do hairpinned packets need to be masqueraded as they go back through the lxcbr0 interface? Consider the “from” address of a packet that gets hairpinned. A packet from the web server would initially have a “from” address of 10.0.5.15 and a “to” address of 203.0.113.44. After the hairpin (DNAT only) the packet would still have a “from” address of 10.0.5.15, but a “to” address of 10.0.5.14. The mail server replies “to” 10.0.5.15 “from” 10.0.5.14. This packet is delivered straight back to the web server, but the web server doesn’t know anything about a packet “from” 10.0.5.14: it is expecting a reply from the external IP address of 203.0.113.44! Thus we need to masquerade, which makes the request appear to come from the host and forces the reply to return to the host’s lxcbr0 interface, so the host can use connection tracking to rewrite the addresses and return the reply back from where it came.
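The argument above can be followed step by step in a toy model. The sketch below is not how netfilter is implemented; it only tracks the “from” and “to” addresses of one request and its reply. The host’s address on lxcbr0 (`10.0.5.1`) and the names `Packet` and `reply_seen_by_web` are assumptions made up for this illustration.

```python
from dataclasses import dataclass, replace

EXTERNAL_IP = "203.0.113.44"
MAIL_IP = "10.0.5.14"
WEB_IP = "10.0.5.15"
HOST_BRIDGE_IP = "10.0.5.1"  # assumption: the host's own address on lxcbr0


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


def reply_seen_by_web(request: Packet, masquerade: bool) -> Packet:
    """Follow one hairpinned request and return the reply the web server receives."""
    # PREROUTING: DNAT rewrites the destination to the email container.
    forwarded = replace(request, dst=MAIL_IP)
    # POSTROUTING: MASQUERADE (if enabled) rewrites the source to the host.
    if masquerade:
        forwarded = replace(forwarded, src=HOST_BRIDGE_IP)
    # The mail server answers whatever source address it saw.
    reply = Packet(src=forwarded.dst, dst=forwarded.src)
    # Only a reply addressed to the host goes back through connection
    # tracking, which undoes both translations.
    if reply.dst == HOST_BRIDGE_IP:
        reply = Packet(src=request.dst, dst=request.src)
    return reply


request = Packet(src=WEB_IP, dst=EXTERNAL_IP)
print(reply_seen_by_web(request, masquerade=False))
print(reply_seen_by_web(request, masquerade=True))
```

Without masquerading the reply arrives “from” 10.0.5.14, which the web server never contacted. With masquerading it arrives “from” 203.0.113.44, as expected.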

# Hairpin
# http://ipset.netfilter.org/iptables-extensions.man.html
-A PREROUTING -i lxcbr+ -p tcp -d 203.0.113.44 -j MARK --set-mark 0x200/0x200
-A PREROUTING -i lxcbr+ -p tcp -d 203.0.113.44 -j nat_incoming_tcp
-A POSTROUTING -o lxcbr+ -m mark --mark 0x200 -j MASQUERADE
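
A side note on the mark values in these rules. As I read the iptables-extensions manual linked above, `--set-mark value/mask` clears the bits in the mask and ORs in the value, while `--mark value` without a mask compares the whole mark. The following sketch models just that arithmetic; the function names are made up for the illustration.

```python
FULL_MASK = 0xFFFFFFFF  # used when no /mask is given


def set_mark(mark: int, value: int, mask: int = FULL_MASK) -> int:
    """Model of `-j MARK --set-mark value/mask`: clear the masked bits, OR in value."""
    return (mark & ~mask & FULL_MASK) | value


def mark_matches(mark: int, value: int, mask: int = FULL_MASK) -> bool:
    """Model of `-m mark --mark value/mask`: AND with the mask, then compare."""
    return (mark & mask) == value


# A packet with no previous mark: the rules as written match.
clean = set_mark(0x0, 0x200, 0x200)
print(hex(clean), mark_matches(clean, 0x200))

# A packet some other rule had already marked with 0x1.
premarked = set_mark(0x1, 0x200, 0x200)
print(hex(premarked), mark_matches(premarked, 0x200), mark_matches(premarked, 0x200, 0x200))
```

If nothing else on the host sets packet marks, the rules work as written. If other rules do set marks, matching with `--mark 0x200/0x200` in the POSTROUTING rule may be the safer choice.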

Note that because we had a custom chain (“nat_incoming_tcp”) to DNAT packets destined for web or email from the Internet – we can re-use this exact chain for traffic coming in from the LXC bridge interface (lxcbr0) as well.
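
To make the chain re-use concrete, here is a rough model of which new TCP connections end up in nat_incoming_tcp. It is a simplification that ignores everything except the incoming interface, the destination address and the destination port. The function names are hypothetical.

```python
from typing import Optional

EXTERNAL_IP = "203.0.113.44"

# Shared chain: destination port -> DNAT target, as in nat_incoming_tcp.
NAT_INCOMING_TCP = {25: "10.0.5.14:25", 80: "10.0.5.15:80"}


def iface_matches(pattern: str, name: str) -> bool:
    """iptables treats a trailing '+' in an interface name as a prefix wildcard."""
    if pattern.endswith("+"):
        return name.startswith(pattern[:-1])
    return name == pattern


def prerouting(in_iface: str, dst_ip: str, dport: int) -> Optional[str]:
    """Return the DNAT target for a new TCP connection, or None if it is left alone."""
    # -A PREROUTING -i eth+ -p tcp -j nat_incoming_tcp
    from_internet = iface_matches("eth+", in_iface)
    # -A PREROUTING -i lxcbr+ -p tcp -d 203.0.113.44 -j nat_incoming_tcp
    hairpin = iface_matches("lxcbr+", in_iface) and dst_ip == EXTERNAL_IP
    if from_internet or hairpin:
        return NAT_INCOMING_TCP.get(dport)
    return None


print(prerouting("eth0", EXTERNAL_IP, 25))    # from the Internet
print(prerouting("lxcbr0", EXTERNAL_IP, 25))  # hairpinned from a container
print(prerouting("lxcbr0", "10.0.5.14", 25))  # container to container, direct
```

The first two calls resolve to the same target because both PREROUTING rules jump to the one shared chain. The third is left alone because its destination is not the external address.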

The rules all together look like:

###############################################################################
# iptables IPv4 rules for reload on start-up
#
# To Reload:   
#   service netfilter-persistent restart
#
# To Test:
#   iptables-restore -t </etc/iptables/rules.v4
###############################################################################

*nat
:PREROUTING ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]

# predefine chains
:nat_incoming_tcp - [0:0]

# redirect Internet TCP traffic to chain to NAT to appropriate LXC container
-A PREROUTING -i eth+ -p tcp -j nat_incoming_tcp

##################
# Hairpin from LXC
##################
# http://ipset.netfilter.org/iptables-extensions.man.html
-A PREROUTING -i lxcbr+ -p tcp -d 203.0.113.44 -j MARK --set-mark 0x200/0x200
-A PREROUTING -i lxcbr+ -p tcp -d 203.0.113.44 -j nat_incoming_tcp
-A POSTROUTING -o lxcbr+ -m mark --mark 0x200 -j MASQUERADE

###############
# LXC NAT rules
###############
# NAT for LXC container "email" at 10.0.5.14
-A nat_incoming_tcp -p tcp --dport 25 -j DNAT --to-destination 10.0.5.14:25 -m comment --comment "SMTP"

# NAT for LXC container "web" at 10.0.5.15
-A nat_incoming_tcp -p tcp --dport 80 -j DNAT --to-destination 10.0.5.15:80 -m comment --comment "HTTP"

COMMIT
