Network goes down shortly after reboot with Debian Buster



I recently upgraded an existing Debian-based virtual machine (running under KVM) to Debian Buster (the latest stable Debian release as of this writing).

After the reboot everything seemed fine, but about 10 minutes later the machine was no longer reachable. After another reboot the same thing happened again.

I then turned on VNC (the machine hosting that VM is on a hosted infrastructure and only reachable via network, so I had to forward a VNC port to my local machine for testing) and discovered that the machine was running fine – just without a network connection. The interface was up but did not have an IPv4 address.

Investigating further, I discovered that the machine was coming up with the wrong time (one hour ahead of the current time, i.e. the clock was in the future), and that ntpd set the time back by one hour shortly after it started. From the log:

Jan 22 19:19:35 tux4 dhclient[351]: DHCPOFFER of 10.33.33.4 from 10.33.33.254
...
Jan 22 19:19:36 tux4 ntpd[391]: ntpd 4.2.8p12@1.3728-o (1): Starting

Jan 22 19:19:36 tux4 ntpd[420]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
Jan 22 18:19:47 tux4 ntpd[420]: receive: Unexpected origin timestamp 0xe1d310c2.72d51f2c does not match aorg 0000000000.00000000 from server@XX.XXX.XXX.XXX xmt 0xe1d302b3.0dae523c

Notice how the time changed to one hour earlier in the last log entry.
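
Such a jump can also be found mechanically. The following is only a rough sketch: it assumes the classic syslog timestamp format shown above, assumes the year 2020 (syslog does not record it; the value is taken from the leases file), and the function name find_backward_jumps is made up for this example. The log messages are shortened to the part that matters.

```python
from datetime import datetime

# Syslog timestamps carry no year; 2020 is an assumption taken from the leases file.
ASSUMED_YEAR = 2020

# Timestamps and process tags from the log excerpt; messages are cut off.
LOG_LINES = [
    "Jan 22 19:19:35 tux4 dhclient[351]:",
    "Jan 22 19:19:36 tux4 ntpd[391]:",
    "Jan 22 19:19:36 tux4 ntpd[420]:",
    "Jan 22 18:19:47 tux4 ntpd[420]:",
]

def parse_stamp(line):
    """Parse the leading 15-character syslog timestamp of a log line."""
    return datetime.strptime(f"{ASSUMED_YEAR} {line[:15]}", "%Y %b %d %H:%M:%S")

def find_backward_jumps(lines):
    """Return (line_index, seconds_back) for every line older than its predecessor."""
    jumps = []
    previous = None
    for index, line in enumerate(lines):
        current = parse_stamp(line)
        if previous is not None and current < previous:
            jumps.append((index, int((previous - current).total_seconds())))
        previous = current
    return jumps

jumps = find_backward_jumps(LOG_LINES)
for index, seconds_back in jumps:
    print(f"line {index}: clock went back by {seconds_back} s")
```

For the excerpt above this reports one jump of a bit under an hour, on the last line.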

So my next move was to check the DHCP leases file, /var/lib/dhcp/dhclient.eth0.leases, where I found:

renew 3 2020/01/22 18:39:54;
rebind 3 2020/01/22 18:44:51;
expire 3 2020/01/22 18:46:06;

So the lease had expired. Apparently the ISC DHCP client (dhclient) was able to remove the IP address from the interface when the lease expired, but did not renew the lease in time (the renewal would probably have happened one hour later, once the clock again reached the point from which it had been set back). It looks like the ISC DHCP client uses two different mechanisms for timing these events: the lease expiry is timed with a mechanism that still works when the clock is set back (see the leases file: the time there is the correct one, not one hour in the future), while the renewal is not. I haven't looked into the code to figure out why this is so.
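
To illustrate why a timer can be delayed by exactly the amount the clock was set back, here is a small simulation. It is not dhclient's code and makes no claim about how dhclient is implemented. It only shows the general difference between a deadline measured against wall-clock time (in Python, time.time()) and one measured against a monotonic clock (time.monotonic()). The FakeClock class, the function name and the numbers (a 300 second timer, a one-hour step after 10 seconds) are all made up for this example.

```python
class FakeClock:
    """Simulated clock pair: wall time can be stepped, monotonic time cannot."""

    def __init__(self, wall):
        self.wall = wall   # seconds, like time.time()
        self.mono = 0.0    # seconds since start, like time.monotonic()

    def advance(self, seconds):
        """Real time passes: both clocks move forward."""
        self.wall += seconds
        self.mono += seconds

    def step_wall(self, seconds):
        """An NTP-style step: only the wall clock changes."""
        self.wall += seconds


def elapsed_until_fired(use_wall_clock, delay=300, step_after=10, step=-3600):
    """Return how many real seconds pass before a timer set for `delay` fires."""
    clock = FakeClock(wall=1_000_000.0)
    read = (lambda: clock.wall) if use_wall_clock else (lambda: clock.mono)
    deadline = read() + delay   # deadline computed before the clock is stepped
    clock.advance(step_after)
    clock.step_wall(step)       # wall clock is set back one hour
    while read() < deadline:
        clock.advance(1)
    return clock.mono


print("monotonic deadline fires after", elapsed_until_fired(False), "s")
print("wall-clock deadline fires after", elapsed_until_fired(True), "s")
```

In this simulation the monotonic timer fires after the intended 300 seconds, while the wall-clock timer fires one hour late, which matches the guess above about when the renewal would eventually have happened.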

So what was the root cause of the wrong time? It turns out that I had always started this machine with its clock base set to localtime (CET is one hour ahead of UTC). So the time at boot must always have been wrong, but the problem above only appeared with Debian Buster.

Now I'm starting the machine with base UTC and everything works again as expected. I'm directly using KVM so the correct option turned out to be:

-rtc base=utc
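
For the fix to hold, the guest has to interpret its hardware clock the same way the host provides it. One way to check what the guest expects is sketched below. It assumes the guest follows the hwclock convention of recording UTC or LOCAL on the third line of /etc/adjtime, with UTC as the default when the file or the setting is missing. The function names are made up for this example.

```python
from pathlib import Path

def rtc_mode(adjtime_text):
    """Return "UTC" or "LOCAL" as recorded on the third line of an adjtime file."""
    lines = adjtime_text.splitlines()
    if len(lines) >= 3 and lines[2].strip() in ("UTC", "LOCAL"):
        return lines[2].strip()
    return "UTC"  # hwclock falls back to UTC when nothing is recorded

def guest_rtc_mode(path="/etc/adjtime"):
    """Read the adjtime file inside the guest; a missing file means UTC."""
    adjtime = Path(path)
    return rtc_mode(adjtime.read_text()) if adjtime.exists() else "UTC"

if __name__ == "__main__":
    print("guest expects the hardware clock in:", guest_rtc_mode())
```

If this prints UTC, the guest matches a host started with -rtc base=utc.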