I don’t think this can be that uncommon a scenario: a Windows Server 2008 R2 domain, with mainly HP printers. New domain controller added (at new site), this time running Windows Server 2012 R2; HP printers there too.
This was the position I found myself in earlier this year. On paper, there’s nothing unusual about this set-up. Adding new 2012 DCs and standard HP workgroup printers shouldn’t be a problem. That’s what we all thought.
Until the domain controller started becoming non-responsive.
Cue many, many hours on TechNet and various other similar sites, chasing down what I became increasingly sure must be some latent fundamental corruption in Active Directory (horrors!), revealed only by the introduction of the newer o/s. There were many intermediate hypotheses. At one point, we thought maybe it was because we were running a single DC (and it was lonely). Or that the DC was not powerful enough for its file serving and DFS replication duties. So I provisioned a second DC. Ultimately I failed all services over to that because the first DC was needing increasingly frequent reboots.
And then the second domain controller developed the same symptom.
Apart from the intermittent loss of replication and certain other domain duties, the most obvious symptom was that the domain controller could no longer initiate DNS queries from a command prompt. Regardless of which DNS server you queried. Observe:
C:\Users\rob>nslookup bbc.com
Server: UnKnown
Address: 192.168.1.1
*** UnKnown can’t find bbc.com: No response from server
C:\Users\rob>nslookup bbc.com 192.168.1.2
Server: UnKnown
Address: 192.168.1.2
*** UnKnown can’t find bbc.com: No response from server
C:\Users\rob>nslookup bbc.com 8.8.8.8
Server: UnKnown
Address: 8.8.8.8
*** UnKnown can't find bbc.com: No response from server
Bonkers, right? Half the time, restarting AD services (which in turn restarts file replication, Kerberos KDC, intersite messaging and DNS) brought things back to life. Half the time it didn’t, and a reboot was needed. Even more bonkers, querying the DNS server on the failing domain controller worked, from any other machine. DNS server was working, but the resolver wasn’t (so it seemed).
I couldn’t figure it out. Fed up, I turned to a different gremlin – something I’d coincidentally noticed in the System event log a couple of weeks back.
Event ID 4266, with the ominous message “A request to allocate an ephemeral port number from the global UDP port space has failed due to all such ports being in use.”
What the blazes is an ephemeral port? I’m just a lowly Enterprise Architect. Don’t come at me with your networking mumbo jumbo.
Oh wait, hang on a minute. Out of UDP ports? DNS, that’s UDP, right?
With the penny slowly dropping, I turned back to the command line. netstat -anob
lists all current TCP/IP connections, including the name of the executable (if any) associated to the connection. When I dumped this to a file I quickly noticed literally hundreds of lines like this:
TCP 0.0.0.0:64838 0.0.0.0:0 LISTENING 4244
[spoolsv.exe]
As it happened, this bit of research coincided with the domain controller being in its crippled state. So I restarted the Print Spooler service, experimentally. Lo and behold, the problem goes away. Now we’re getting somewhere.
Clearly something in the printer subsystem is grabbing lots of ports. Another bell rang – I recalled when installing printers on these new domain controllers that instead of TCP/IP ports, I ended up with WSD ports.
What on earth is a WSD port?! (Etc.)
So these WSD ports are a bit like the Bonjour service, enabling computers to discover services advertised on the network. Not at all relevant to a typical Active Directory managed workspace, where printers are deployed through Group Policy. WSD ports (technically monitors, not ports) are however the default for many new printer installations, in Windows 8 and Server 2012. And as far as I can tell, they have no place in an enterprise environment.
Anyway, to cut a long story short (no, I didn’t did I, this is still a long story, sorry!), I changed all the WSD ports to TCP/IP ports. The problem has gone away. Just like that.
I spent countless hours trying to fix these domain controllers. I’m now off to brick the windows* at Microsoft and HP corporate headquarters.
Hope this saves someone somewhere the same pain I experienced.
Peace out.
*Joke
Thanks!