Missing DHCP Replies

After I upgraded my WRT3200ACM router to OpenWrt v23.05.3, my laptop took minutes to get an IPv4 address when connected to my WRT3200ACM router’s WiFi. It only takes seconds to get an IPv4 address when connected to my other WiFi.

I ended up downgrading OpenWrt on the WRT3200ACM, but first I did a lot of work trying to track down the problem.

Here’s what a portion of my network looks like:

diagram of network for this post

I have the kea DHCP server running on the Qotom server. I have dhcpcd running on the laptop.

The WRT3200ACM runs OpenWrt, and has a bridged LAN, which means the radios and RJ-45 ethernet ports are one broadcast segment. If I cabled the Qotom server to one of the LAN RJ-45 ports, kea can reply to DCHP requests made by my laptop after it’s authenticated to the radio(s).

Running tcpdump -i wlp1s0 on my laptop showed that dhcpcd made numerous DHCP requests, but didn’t get any answers for quite a while. Below, tcpdump output. My laptop’s MAC address is e4:a4:71:30:21:86. kea-dhcp4 is running on 172.24.0.1.

 1	2024-04-17 15:34:26.350086 IP6 :: > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
 2	2024-04-17 15:34:26.525390 IP6 :: > ff02::1:ff6d:957b: ICMP6, neighbor solicitation, who has fe80::9402:3c1e:566d:957b, length 32
 3	2024-04-17 15:34:26.989267 IP6 :: > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
 4	2024-04-17 15:34:27.415887 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
 5	2024-04-17 15:34:27.565167 IP6 fe80::9402:3c1e:566d:957b > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
 6	2024-04-17 15:34:27.565887 IP6 fe80::9402:3c1e:566d:957b > ff02::2: ICMP6, router solicitation, length 16
 7	2024-04-17 15:34:27.917358 IP6 fe80::9402:3c1e:566d:957b > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
 8	2024-04-17 15:34:30.573376 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
 9	2024-04-17 15:34:31.567336 IP6 fe80::9402:3c1e:566d:957b > ff02::2: ICMP6, router solicitation, length 16
10	2024-04-17 15:34:32.440654 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
11	2024-04-17 15:34:35.570931 IP6 fe80::9402:3c1e:566d:957b > ff02::2: ICMP6, router solicitation, length 16
12	2024-04-17 15:34:35.974216 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
13	2024-04-17 15:34:39.574937 IP6 fe80::9402:3c1e:566d:957b > ff02::2: ICMP6, router solicitation, length 16
14	2024-04-17 15:34:43.917214 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
15	2024-04-17 15:34:43.937481 IP 172.24.0.1.67 > 172.24.0.7.68: BOOTP/DHCP, Reply, length 334
16	2024-04-17 15:34:43.938082 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
17	2024-04-17 15:34:43.946884 IP 172.24.0.1.67 > 172.24.0.7.68: BOOTP/DHCP, Reply, length 334
18	2024-04-17 15:34:43.947433 ARP, Request who-has 172.24.0.7 tell 0.0.0.0, length 28
19	2024-04-17 15:34:45.206090 ARP, Request who-has 172.24.0.7 tell 0.0.0.0, length 28
20	2024-04-17 15:34:47.072357 ARP, Request who-has 172.24.0.7 tell 0.0.0.0, length 28
21	2024-04-17 15:34:49.077026 ARP, Request who-has 172.24.0.7 tell 172.24.0.7, length 28
22	2024-04-17 15:34:51.079467 ARP, Request who-has 172.24.0.7 tell 172.24.0.7, length 28

Lines 4, 8, 10, 12 and 14 show my laptop sending out DHCPREQUEST messages, asking someone, anyone, for IPv4 address 172.24.0.7, which is apparently what that interface had before. dhcpcd (on my laptop) would have sent a DHCPDISCOVER message otherwise. The format above doesn’t show all the details of the packets.

The important aspect to notice is that dhcpcd on the laptop makes 5 DHCPREQUESTs to get a single DHCPOFFER. appearing on line 15, after the fifth DHCPREQUEST. What the hell is going on? That’s 16 seconds after the first DHCPREQUEST. kea not responding was my first guess.

The corresponding dhcpcd log from my laptop:

 1  Apr 17 15:34:26 hazard dhcpcd[321]: wlp1s0: carrier acquired
 2  Apr 17 15:34:26 hazard dhcpcd[321]: wlp1s0: connected to Access Point: bongrips4jesus42069
 3  Apr 17 15:34:26 hazard dhcpcd[321]: wlp1s0: IAID 71:30:21:86
 4  Apr 17 15:34:26 hazard dhcpcd[321]: wlp1s0: adding address fe80::9402:3c1e:566d:957b
 5  Apr 17 15:34:26 hazard dhcpcd[321]: wlp1s0: soliciting an IPv6 router
 6  Apr 17 15:34:27 hazard dhcpcd[321]: wlp1s0: rebinding lease of 172.24.0.7
 7  Apr 17 15:34:32 hazard dhcpcd[321]: wlp1s0: DHCP lease expired
 8  Apr 17 15:34:32 hazard dhcpcd[321]: wlp1s0: DHCP lease expired
 9  Apr 17 15:34:32 hazard dhcpcd[321]: wlp1s0: soliciting a DHCP lease
10  Apr 17 15:34:39 hazard dhcpcd[321]: wlp1s0: no IPv6 Routers available
11  Apr 17 15:34:43 hazard dhcpcd[321]: wlp1s0: offered 172.24.0.7 from 172.24.0.1
12  Apr 17 15:34:43 hazard dhcpcd[321]: wlp1s0: probing address 172.24.0.7/16
13  Apr 17 15:34:49 hazard dhcpcd[321]: wlp1s0: leased 172.24.0.7 for 3600 seconds
14  Apr 17 15:34:49 hazard dhcpcd[321]: wlp1s0: adding route to 172.24.0.0/16
15  Apr 17 15:34:49 hazard dhcpcd[321]: wlp1s0: adding default route via 172.24.0.1

Line 6 of the dhcpcd log chronologically corresponds to line 4 of the tcpdump decoding above. Line 11 of the dhcpcd log chronologically corresponds to line 17 of the tcpdump decoding. This confirms that dhcpcd doesn’t see any DHCPOFFER, DHCPACK, DHCPNAK or any other reply from kea.

Conveniently, I’m not documenting my first test. I had tcpdump -i enp4s0 -w some.file.pcap running on the Qotom server. Here’s the decoded PCAP:

 1	2024-04-17 15:34:26.361338 e4:a4:71:30:21:86 > ff:ff:ff:ff:ff:ff Null Unnumbered, xid, Flags [Response], length 6: 01 00
 2	2024-04-17 15:34:26.525843 IP6 :: > ff02::1:ff6d:957b: ICMP6, neighbor solicitation, who has fe80::9402:3c1e:566d:957b, length 32
 3	2024-04-17 15:34:26.989719 IP6 :: > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
 4	2024-04-17 15:34:27.416662 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
 5	2024-04-17 15:34:27.419969 IP 172.24.0.1.67 > 172.24.0.7.68: BOOTP/DHCP, Reply, length 334
 6	2024-04-17 15:34:27.566643 IP6 fe80::9402:3c1e:566d:957b > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
 7	2024-04-17 15:34:27.567030 IP6 fe80::9402:3c1e:566d:957b > ff02::2: ICMP6, router solicitation, length 16
 8	2024-04-17 15:34:27.917795 IP6 fe80::9402:3c1e:566d:957b > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
 9	2024-04-17 15:34:30.574167 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
10	2024-04-17 15:34:30.577152 IP 172.24.0.1.67 > 172.24.0.7.68: BOOTP/DHCP, Reply, length 334
11	2024-04-17 15:34:31.567694 IP6 fe80::9402:3c1e:566d:957b > ff02::2: ICMP6, router solicitation, length 16
12	2024-04-17 15:34:32.443312 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
13	2024-04-17 15:34:32.446095 IP 172.24.0.1.67 > 172.24.0.7.68: BOOTP/DHCP, Reply, length 334
14	2024-04-17 15:34:35.571498 IP6 fe80::9402:3c1e:566d:957b > ff02::2: ICMP6, router solicitation, length 16
15	2024-04-17 15:34:35.974960 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
16	2024-04-17 15:34:35.978001 IP 172.24.0.1.67 > 172.24.0.7.68: BOOTP/DHCP, Reply, length 334
17	2024-04-17 15:34:39.576770 IP6 fe80::9402:3c1e:566d:957b > ff02::2: ICMP6, router solicitation, length 16
18	2024-04-17 15:34:43.919164 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
19	2024-04-17 15:34:43.922934 IP 172.24.0.1.67 > 172.24.0.7.68: BOOTP/DHCP, Reply, length 334
20	2024-04-17 15:34:43.938673 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from e4:a4:71:30:21:86, length 300
21	2024-04-17 15:34:43.944427 IP 172.24.0.1.67 > 172.24.0.7.68: BOOTP/DHCP, Reply, length 334
22	2024-04-17 15:34:43.947755 ARP, Request who-has 172.24.0.7 tell 0.0.0.0, length 46
23	2024-04-17 15:34:45.207743 ARP, Request who-has 172.24.0.7 tell 0.0.0.0, length 46
24	2024-04-17 15:34:47.074012 ARP, Request who-has 172.24.0.7 tell 0.0.0.0, length 46
25	2024-04-17 15:34:49.078737 ARP, Request who-has 172.24.0.7 tell 172.24.0.7, length 46
26	2024-04-17 15:34:51.081142 ARP, Request who-has 172.24.0.7 tell 172.24.0.7, length 46

Line 4 in the server’s tcpdump chronologically matches line 4 in the laptop’s tcpdump. Laptop records 2024-04-17 15:34:27.415887 as the timestamp for the DHCPREQUEST, which the server records 2024-04-17 15:34:27.416662, 0.000775 seconds later. NTP for the win.

Lines 4, 9, 12, 15, 18 match the five DHCPREQUESTs appearing in the laptop’s tcpdump, lines 4, 8 , 10, 12 and 14. Lines 5, 10, 13, 16 and 19 show DHCPACK messages from kea on the Qotom server to 172.24.0.7, with the laptop’s MAC address. That’s the goofy, ugly part of DHCP: a UDP packet get sent to an IPv4 address and a MAC address where the computer with that MAC address doesn’t have the IPv4 address yet.

I don’t have a PCAP from running tcpdump on the OpenWrt 23.05.3 br-lan bridged network, but you’ll have to believe me that the DHCPACK messages from kea appear on the bridged network.

Other Things I tried

The same sort of “missing DHCPACK” problem showed up when I used other USB WiFi dongles, and when I used dhclient instead of dhcpcd on the laptop. I put kea on my production server to see if kea misbehaved in some subtle way. I downgraded my current laptop to

linux-zen-5.13.5.zen1-1
linux-zen-headers-5.13.5.zen1-1
linux-firmware-20210716.b7c134f-1

because an old laptop from my scrap heap with those package versions subjectively seemed to get an address faster. No luck, still took ages to get an address most of the time.

The dhtest program would exhibit the same problem between when wpa_supplicant connected to the WRT3200ACM’s radio, and when dhcpcd acquired an address. That is, it would send repeated DHCPDISCOVER packets and not get any kind of reply.

I should have turned off dhcpcd on my laptop, then run dhtest. It sends a DHCPDISCOVER every 5 seconds until it gets an answer.

Annoying aspects of the problem

  1. It didn’t occur every time I switched my laptop to the WRT3200ACM’s network
  2. It didn’t occur if I used a CAT-5 cable to connect my laptop to one of the WRT3200ACM’s LAN ports
  3. It took various lengths of time to get a DHCPACK from kea, from 0 to 389 seconds, with a mean of 71.3 seconds and a median of 25 seconds. The first quartile is 7 seconds, which means at least 25% of connections take a normal amount of time.
  4. I couldn’t definitively tell if other devices had the problem. This is a consequence of (1) and things like not having a DHCP log file for my Google Pixel 6a phone.

Others with the same problem

Found a reddit post where someone complains about packet loss and lower speed with OpenWrt versions greater than 19.07 on a WRT3200ACM router. The reddit post even says that cabled connections work perfectly.

A fair amount of packet loss could explain the missing DHCPACK/DHCPOFFER messages. DHCP is conducted using UDP, there’s no attempt at reliable delivery. The only impediment to adopting this theory is IPv6 and ARP packets. There’s an exact correspondence between IPv6 packets in the two PCAPs, and the ARP packets broadcast by dhcpcd when it finally receives a DHCPACK.

Only the DHCPACK packets are missing in the laptop’s PCAP.