iptables Journey of Discovery

I bought a Qotom Q20332G9-S10 fanless server, which I hope will eventually replace my older Dell PowerEdge R530, manufactured sometime between August and December of 2014.

This started me on a journey of discovery that I could scarcely have imagined six months ago.


Hero’s Journey Diagram

Dell PowerEdge R530s have four built-in gig ethernet ports. I used one ethernet port to connect to the Internet, courtesy of Centurylink fiber. I’m not getting the full benefit of the fiber’s speed, because Centurylink uses PPP-over-ethernet.

My house is fairly new. It has CAT-5 ethernet cabling to maybe 5 or 6 rooms of the house. Initially, I put WiFi access points in two rooms, using 10.0.0.0/24 and 10.0.10.0/24, thereby consuming two more ethernet ports.

I wrote an elaborate systemd service file that assigned addresses and subnets to ethernet ports, set up iptables masquerading so that traffic from all 3 subnets would be NATted when routed to the Internet, and clamped the TCP MSS to the path MTU.
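Why clamp the MSS at all? PPP-over-ethernet eats 8 bytes of every 1500-byte ethernet frame, so full-size TCP segments no longer fit. The following is a minimal sketch of the arithmetic, with the kind of iptables rules involved shown as comments. It is not the contents of the actual service file: WAN_IF is a placeholder for whichever interface faces the ISP, and mss_for_mtu is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only. WAN_IF is a placeholder for the ISP-facing interface.

# Largest TCP payload that fits in one IPv4 packet for a given MTU:
# subtract the 20-byte IPv4 header and the 20-byte TCP header (no options).
mss_for_mtu() {
    echo $(( $1 - 20 - 20 ))
}

eth_mtu=1500
pppoe_overhead=8    # 6-byte PPPoE header + 2-byte PPP protocol field
pppoe_mtu=$(( eth_mtu - pppoe_overhead ))
echo "PPPoE MTU: $pppoe_mtu, TCP MSS: $(mss_for_mtu "$pppoe_mtu")"

# The rules themselves need root, so they are left as comments here:
#   iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o "$WAN_IF" -j MASQUERADE
#   iptables -t mangle -A FORWARD -o "$WAN_IF" -p tcp --tcp-flags SYN,RST SYN \
#       -j TCPMSS --clamp-mss-to-pmtu
```

With --clamp-mss-to-pmtu the kernel works out the value itself, so the arithmetic is only there to show why a LAN host's default MSS of 1460 is too big for the PPPoE link.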

I finally figured out which cable in the furnace room ran to an upstairs bedroom. I added my last spare WiFi access point on the 10.0.20.0/24 network. That’s all four built-in ethernet ports on the Dell PowerEdge R530.

I must have executed iptables-save somewhere in here. The file /etc/iptables/iptables.rules had rules for NATting the 10.0.0.0/24, 10.0.10.0/24, and 10.0.20.0/24 subnets. I commented out the iptables commands in the elaborate systemd unit file, relying on the iptables.service unit to set up NAT and so forth on reboots, and then promptly forgot about doing so.

The older PC that the R530 replaced as my server had 2 short PCI Intel ethernet cards that fit in the R530’s 1U case. I scavenged them from the older PC, and put them in the R530, which now has a total of 6 ethernet ports. I put another WiFi AP in a third room, and set up one of the two PCI Intel ethernet cards on 10.0.30.0/24. That’s five of six ethernet ports earning their keep.

Every once in a while, my phone would claim that my WiFi “didn’t have internet access”. This was strange because my laptop could always dig fully-qualified domain names, and traceroute to well-known sites like yahoo.com.

On the other hand, it wasn’t so strange, because the WiFi access points are Linksys Velop proprietary mystery boxes. You have to use “The App” to manage them, and all labels are baby talk. Supposedly they do “mesh routing”, but other than a bunch of weird traffic on 192.168.0.0/24, I don’t see them doing anything special.

Periodically, my work laptop would refuse to set up the company VPN when it was attached via the wireless network, but not when plugged in to my network via CAT-5 cabling (10.0.0.0/24 subnet).

I blamed these seemingly unrelated problems on the large amount of DNS black-holing I’ve done. Advertising is all lies, and I want to avoid seeing any ads. Advertisers can go to hell.

This January, after overwriting the probably pirated Windows installation on the Qotom fanless server with Arch Linux, I connected it to the R530 Edge server via the second Intel PCI ethernet card. I took these further steps on the R530:

  1. Assigned 10.0.40.1/24 to that ethernet card.
  2. Fooled around with /etc/dhcpd.conf to give the Qotom an IPv4 address.
  3. Added ens5 to /etc/dnsmasq.conf to get DNS service on the 10.0.40.0/24 subnet.
  4. Added allow 10.0.40.0/24 to file /etc/chrony.conf to convince the Chrony NTP daemon to service requests from that subnet.
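The four steps above might correspond to fragments like the ones below. This is a sketch under assumptions, not a copy of the real files: ens5 is taken from step 3 and assumed to be the card from step 1, the DHCP range is made up, and print_qotom_fragments is a hypothetical helper that only prints the text so nothing under /etc gets touched.

```shell
#!/bin/sh
# Prints example fragments for review; it does not modify any file.
# Assumptions: ens5 is the card facing the Qotom, and the DHCP range
# 10.0.40.100-10.0.40.150 is an invented example.
print_qotom_fragments() {
cat <<'EOF'
# step 1, run as root
ip addr add 10.0.40.1/24 dev ens5

# step 2, /etc/dhcpd.conf
subnet 10.0.40.0 netmask 255.255.255.0 {
    range 10.0.40.100 10.0.40.150;
    option routers 10.0.40.1;
    option domain-name-servers 10.0.40.1;
}

# step 3, /etc/dnsmasq.conf
interface=ens5

# step 4, /etc/chrony.conf
allow 10.0.40.0/24
EOF
}

print_qotom_fragments
```

Note what is missing from the list: nothing here tells iptables about the new subnet.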

Everything was awesome, until I tried my weekly pacman update on the Qotom. pacman appeared to hang at “Synchronizing package databases…”

After some futzing around, I discovered that the Qotom server could access IP addresses on my LAN (5 broadcast segments, 10.0.0.0/24, 10.0.10.0/24, 10.0.20.0/24, 10.0.30.0/24, 10.0.40.0/24), but couldn’t access anything outside my LAN. ping google.com didn’t work, traceroute google.com didn’t work, but dig google.com did. My laptop, which I thought was on the same network but wirelessly, could do everything. WTF?

The first thing I tried, tcpdump -i eno3.201, showed me that packets from 10.0.30.0/24 and 10.0.40.0/24 addresses were getting spewed out to the Centurylink ONT. This is embarrassing. You’re not supposed to route 10.0.0.0/8 IP addresses on the public Internet, although it’s rumored that Iran uses 10.0.0.0/8 country-wide.
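When reading tcpdump output on the ISP-facing interface, any source address in the RFC 1918 private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) is a packet that escaped NAT. A small sketch of that check follows. is_rfc1918 is a hypothetical helper, the sample addresses are invented, and the pattern match is a rough string test on dotted quads rather than a real address parser.

```shell
#!/bin/sh
# Returns 0 if the dotted-quad address falls in an RFC 1918 private range.
# Rough string matching only; it does not validate the address.
is_rfc1918() {
    case $1 in
        10.*|192.168.*) return 0 ;;
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
        *) return 1 ;;
    esac
}

# Invented sample source addresses, as they might appear in a capture.
for addr in 10.0.30.17 10.0.40.2 8.8.8.8; do
    if is_rfc1918 "$addr"; then
        echo "$addr: private, should have been NATted before leaving"
    else
        echo "$addr: public"
    fi
done
```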

There it is. Packets with non-routable source addresses are not getting NATted. But why? Remember, at this point I’d forgotten that iptables.service was setting up NAT for only three subnets. The elaborate systemd service file had all the iptables commands commented out. I had a vague idea that maybe running iptables saved the rules somewhere, which doesn’t entirely make sense, but it was all I had.

Running systemctl with no arguments lists the units systemd knows about. One of them is iptables.service. Running systemctl status iptables.service led me to the file /etc/iptables/iptables.rules.

# Generated by iptables-save v1.8.7 on Tue Jan 26 07:16:40 2021

I hadn’t added 10.0.30.0/24 or 10.0.40.0/24 to the set of saved rules. The R530 server has some Centurylink bullshit router as its default route, so everything from those two subnets was going out the fiber to Centurylink’s terrible network with its private source address intact. Hopefully, Centurylink is still capable of filtering bogons.
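This kind of omission can be checked mechanically. Below is a minimal sketch that reports which subnets have no POSTROUTING rule in a saved rules file. missing_nat_subnets is a hypothetical helper, and it assumes the rules are written the way iptables-save writes them, with a literal "-s SUBNET" on each "-A POSTROUTING" line.

```shell
#!/bin/sh
# Print every subnet that has no "-A POSTROUTING ... -s SUBNET ..." line
# in the given rules file. Assumes iptables-save style formatting.
missing_nat_subnets() {
    rules_file=$1
    shift
    for subnet in "$@"; do
        if ! grep -F -e "-s $subnet " "$rules_file" | grep -q -e '-A POSTROUTING'; then
            printf '%s\n' "$subnet"
        fi
    done
}

# Example invocation (reading the real file may need root):
#   missing_nat_subnets /etc/iptables/iptables.rules \
#       10.0.0.0/24 10.0.10.0/24 10.0.20.0/24 10.0.30.0/24 10.0.40.0/24
```

It only looks for the text of a rule. It does not prove the rule does the right thing, for example that it names the correct outgoing interface.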

Adding iptables rules for 10.0.30.0/24 and 10.0.40.0/24 to /etc/iptables/iptables.rules and running sudo systemctl restart iptables.service fixed it: the Qotom candidate server hardware could do pacman -Syu and reach well-known FQDNs in the Outer World. My laptop typically connected to the Velop on the 10.0.20.0/24 subnet, so it never had a problem - IP addresses on that subnet always got NATted correctly. We’ll have to see if “internet is not accessible” happens again, but I suspect it won’t. The two “unrelated” problems had a common solution.
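One way to avoid forgetting a subnet next time is to generate the nat section from a single list. The sketch below prints a nat table section in the format iptables-restore reads. It is an illustration, not the actual rules file: emit_nat_section is a hypothetical helper, and the outgoing interface name passed to it (eno3.201 here, borrowed from the tcpdump command above) is an assumption that may need to be the PPP interface instead.

```shell
#!/bin/sh
# Print a nat table section in iptables-restore format, with one
# MASQUERADE rule per subnet. First argument: the ISP-facing interface.
emit_nat_section() {
    wan_if=$1
    shift
    printf '%s\n' '*nat'
    printf '%s\n' ':PREROUTING ACCEPT [0:0]'
    printf '%s\n' ':INPUT ACCEPT [0:0]'
    printf '%s\n' ':OUTPUT ACCEPT [0:0]'
    printf '%s\n' ':POSTROUTING ACCEPT [0:0]'
    for subnet in "$@"; do
        printf '%s\n' "-A POSTROUTING -s $subnet -o $wan_if -j MASQUERADE"
    done
    printf '%s\n' 'COMMIT'
}

# All five LAN subnets from this post; the interface name is a guess.
emit_nat_section eno3.201 \
    10.0.0.0/24 10.0.10.0/24 10.0.20.0/24 10.0.30.0/24 10.0.40.0/24
```

The output is meant to be reviewed and merged into /etc/iptables/iptables.rules by hand. Running iptables-restore --test on the edited file before restarting iptables.service should catch syntax errors without committing anything.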