SCCM 2012 Troubleshooting PXE – Jocha

PXE booting makes deploying OS images much simpler for end user technicians. There is a lot that can go wrong though, especially if you’re attempting to run it in a high security, heavily filtered network.

When a PXE failure occurs it helps to be very precise about the step at which it failed. The point at which a PXE build fails can tell us where to investigate.

Some possible causes of error in a PXE build are:

  • Workstation BIOS configuration and/or lack of RAM
  • Duplicate SMBIOS id (typically seen on older hardware)
  • DHCP Server configuration
  • Network filters / configuration
  • WDS service failure
  • PXE service point failure
  • Wrong collection membership in SCCM
  • WDS cached collection membership
  • Obsolete objects in SCCM
  • Network drivers for Vista/7 are available, but not for XP
  • Network drivers are not available for Vista/7

On the server side there’s one log file that will help you immensely. If you have set up the PXE Service Point on the site server it can be found at:

%ProgramFiles%\SMS_CCM\Logs\smspxe.log

Or, if you have configured another server as the PXE Service Point it will be found at:

SMS_CCM\Logs\smspxe.log

in the root of the drive SCCM is using.
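If you just want the interesting lines out of smspxe.log, a small script can filter them. This is a minimal sketch, not a log parser: it assumes nothing about the log's line format and only does a case-insensitive substring match. The keyword list and the function names are my own illustration, so adjust them to whatever you are hunting for.

```python
from typing import Iterable, Iterator, List, Sequence

# Hypothetical keywords, chosen for illustration only.
KEYWORDS = ("error", "failed", "does not have a boot image")


def matching_lines(lines: Iterable[str], keywords: Sequence[str] = KEYWORDS) -> Iterator[str]:
    """Yield every line that contains at least one keyword, ignoring case."""
    lowered = [k.lower() for k in keywords]
    for line in lines:
        text = line.lower()
        if any(k in text for k in lowered):
            yield line.rstrip("\r\n")


def scan_log(path: str, keywords: Sequence[str] = KEYWORDS) -> List[str]:
    """Read a log file and return the matching lines in file order."""
    # errors="replace" keeps the scan going if the file has odd bytes in it.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return list(matching_lines(handle, keywords))
```

Calling scan_log() with the full path to smspxe.log from one of the locations above returns the matching lines as a list.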

Using Trace32 to view this log file can give you real-time information on the PXE boot process. However, the first error we’ll look at won’t even show up in this log.

PXE-E51: No DHCP or proxyDHCP offers were received

The most common PXE error I see is PXE-E51. The first indication that something is wrong is when you see

DHCP...

and you get more than three or four dots.

The PXE process fails at this point with PXE-E51: No DHCP or proxyDHCP offers were received.

This error basically says that the machine can’t obtain an IP address. Possible reasons for this include

  1. Your DHCP server isn’t working
  2. If you use DHCP reservations you may have made a mistake entering the MAC address of this machine
  3. You don’t have a DHCP pool set up for this subnet, or the pool has no free addresses
  4. Your DHCP server is on a different subnet and you haven’t set up an IP forwarder or DHCP Relay agent
  5. The network cable or port is broken

Most of these problems are easy to check, or are easy for your networking people to check. Once the problem is fixed the PXE boot process works properly in most cases. Assuming your network is configured to allow PXE booting this error normally means one of two things – the cable is faulty or there’s no DHCP reservation/the DHCP pool is exhausted.
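To rule out the workstation itself, you can imitate the client's first step from another machine on the same subnet. The sketch below builds a DHCPDISCOVER that announces itself as a PXE client and waits for any reply. It makes several assumptions: the function names are mine, binding to UDP port 68 usually needs administrator/root rights and fails if a DHCP client already holds the port, and it only reports that something answered rather than validating the offer.

```python
import os
import socket
import struct

DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"


def build_pxe_discover(mac: bytes, xid: int) -> bytes:
    """Build a broadcast DHCPDISCOVER with vendor class 'PXEClient'."""
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,                   # op: BOOTREQUEST
        1,                   # htype: Ethernet
        6,                   # hlen: length of a MAC address
        0,                   # hops
        xid,                 # transaction id, echoed back by the server
        0,                   # secs
        0x8000,              # flags: ask the server to broadcast its reply
        b"", b"", b"", b"",  # ciaddr, yiaddr, siaddr, giaddr (all zero)
        mac,                 # chaddr, zero-padded to 16 bytes by struct
        b"",                 # sname
        b"",                 # file
    )
    options = b"".join([
        bytes([53, 1, 1]),              # message type: DISCOVER
        bytes([55, 4, 1, 3, 66, 67]),   # ask for mask, router, options 66 and 67
        bytes([60, 9]) + b"PXEClient",  # vendor class identifier
        bytes([255]),                   # end of options
    ])
    return header + DHCP_MAGIC_COOKIE + options


def wait_for_offer(mac: bytes, timeout: float = 5.0):
    """Broadcast a discover; return the replying server's IP, or None."""
    xid = int.from_bytes(os.urandom(4), "big")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.settimeout(timeout)
        sock.bind(("", 68))  # servers reply to the DHCP client port
        sock.sendto(build_pxe_discover(mac, xid), ("255.255.255.255", 67))
        try:
            while True:
                data, server = sock.recvfrom(4096)
                # op 2 is BOOTREPLY; bytes 4..8 hold the transaction id.
                if len(data) >= 240 and data[0] == 2 and data[4:8] == xid.to_bytes(4, "big"):
                    return server[0]
        except socket.timeout:
            return None
```

If wait_for_offer() returns None from a machine on the client's subnet, the problem is more likely the DHCP server, scope or relay than the workstation or its cable.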

This error highlights the need for precision in the error reports from your technicians. With so much that can go wrong during a PXE boot, it’s nice to have an error which is relatively easy to fix.

PXE-E32: TFTP Open Timeout

Assuming your client gets an IP address, there is still a large number of ways for it to fail before you even get an abortpxe.com message. PXE-E32 TFTP open timeout can be a frustrating message – but it does at least give you a clue where to look.

This error means that your client machine can’t access the TFTP daemon running on your PXE Service Point. Assuming your PXE Service Point is set up correctly (check the WDS service is running), the most common reason for this message is network filters/firewall settings. Fortunately, Microsoft provides a document which lists the ports that need to be open for the TFTP daemon to work. Read this document carefully: you need to open more than just ports 69 and 4011 to get this to work. The daemon listens on port 69 but responds from a randomly chosen high port. You’ll need to configure the network filter rules to allow this behavior before TFTP will work.
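The port behavior is easy to see for yourself. The following is a rough probe, not a TFTP client: it sends a single read request to port 69 and reports which source port the first reply came from. The function names are my own, and the probe never acknowledges the data, so the server simply times the transfer out.

```python
import socket
import struct


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """TFTP read request: opcode 1, then NUL-terminated filename and mode."""
    return (
        struct.pack("!H", 1)
        + filename.encode("ascii") + b"\x00"
        + mode.encode("ascii") + b"\x00"
    )


def probe_tftp(server: str, filename: str, timeout: float = 5.0):
    """Return (reply_port, opcode) for the first reply, or None if none arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (server, 69))
        try:
            data, (_addr, port) = sock.recvfrom(2048)
        except (socket.timeout, ConnectionResetError):
            return None  # filtered, wrong server, or daemon not running
        if len(data) < 2:
            return None
        # Opcode 3 is DATA, 5 is ERROR, 6 is an option acknowledgement.
        opcode = struct.unpack("!H", data[:2])[0]
        return port, opcode
```

For example, probe_tftp("192.0.2.10", r"smsboot\x86\wdsnbp.com") (the address is a placeholder) should return a port other than 69 when things work. A None result from the client's subnet but not from the server's own subnet points at filtering in between.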

You might also see this error if DHCP is misconfigured. If you have DHCP and the PXE Service Point on different servers then you’ll need to set option 66, the Boot Server Host Name. A small tip here – use the IP address of the PXE Service Point when troubleshooting this setting – this removes the possibility that it’s a DNS resolution issue. You can always set it back once you’re happy everything is working again.

PXE-E53: No boot filename received

Check option 67 on the DHCP server. It should be something like

smsboot\x86\wdsnbp.com

Also verify that the Task Sequence is deployed to the computer object, and the collection is working as intended.
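If you have a packet capture of the DHCP offer, you can check what the client was actually told. Here is a minimal sketch that walks the options area of a DHCP packet (the bytes after the magic cookie) and pulls out options 66 and 67. The function names are illustrative. Note one limitation: a server may also put the boot file name in the fixed "file" field of the packet header instead of option 67, and this sketch does not look there.

```python
def parse_dhcp_options(options: bytes) -> dict:
    """Walk the code/length/value triples in a DHCP options area."""
    found = {}
    i = 0
    while i < len(options):
        code = options[i]
        if code == 255:   # end marker
            break
        if code == 0:     # pad byte, has no length field
            i += 1
            continue
        if i + 1 >= len(options):
            break         # truncated option, stop quietly
        length = options[i + 1]
        found[code] = options[i + 2 : i + 2 + length]
        i += 2 + length
    return found


def boot_settings(options: bytes):
    """Return (option 66, option 67) as text, or None where an option is absent."""
    opts = parse_dhcp_options(options)

    def as_text(code):
        raw = opts.get(code, b"").rstrip(b"\x00")
        return raw.decode("ascii", "replace") or None

    return as_text(66), as_text(67)
```

A result of (None, None) for an offer from your DHCP server means neither option was sent in the options area.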

PXE-E55: ProxyDHCP service did not reply to request on port 4011

Related to the TFTP timeout problem, this suggests a firewall or routing issue. Check that the firewall settings allow UDP port 4011 through.

If the client and the PXE Service Point are on different subnets, check that the traffic is being forwarded from the client subnet to the PXE Service Point.

PXE-E3B: TFTP error file not found

At this point we know the client is getting service from DHCP and has managed to find the TFTP server and request the boot file. Two things to check here are

  1. Option 67 is configured correctly and pointing to a file that exists on the server
  2. The files are actually on the TFTP server

Check the SMSBoot folder in the reminst share on the PXE Service Point. There should be 3 folders in the SMSBoot folder – ia64, x64 and x86. Each folder should contain some boot files. If not, you have problems!
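This check is easy to script if you have several PXE Service Points to look after. The sketch below assumes only what is described above: an SMSBoot folder with ia64, x64 and x86 subfolders that should each contain files. It does not know which files belong there, so it only flags missing or empty folders. The function name is mine.

```python
from pathlib import Path

EXPECTED_ARCHES = ("ia64", "x64", "x86")


def missing_boot_files(smsboot_dir) -> dict:
    """Map each architecture folder that is absent or empty to a short reason."""
    problems = {}
    root = Path(smsboot_dir)
    for arch in EXPECTED_ARCHES:
        folder = root / arch
        if not folder.is_dir():
            problems[arch] = "folder missing"
        elif not any(entry.is_file() for entry in folder.iterdir()):
            problems[arch] = "no files"
    return problems
```

Point it at the SMSBoot folder, for example through the reminst share on the server in question. An empty dict means all three folders exist and contain at least one file.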

The missing boot files can be fixed in a number of ways. The easiest way is to just copy the correct files over from a working PXE Service Point. I would not recommend this though – the files are missing for a reason, and you should really fix the underlying cause.

This error can be caused by a number of things: updating drivers in the default OSD Boot Images, restarting the server hosting the PXE Service Point, or just a botched PXE Service Point install. The first thing you should try is clearing out temp files used by PXE.

  • Stop the WDS Service
  • Delete (or move) the folder
    %temp%\PXEBootFiles

  • Start the WDS Service
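The three steps above can be scripted. Treat this as a sketch under assumptions: it assumes the WDS service is registered under the name WDSServer, it must run elevated on the PXE Service Point, and it moves the folder aside rather than deleting it. %temp% expands differently for each account, so the folder's parent can be passed in explicitly. The function name and its parameters are my own.

```python
import os
import shutil
import subprocess
import time


def reset_pxe_temp(temp_dir=None, service="WDSServer", run=subprocess.run):
    """Stop WDS, move PXEBootFiles aside, start WDS. Returns the backup path or None."""
    base = temp_dir or os.environ.get("TEMP")
    if not base:
        raise ValueError("no temp directory given and TEMP is not set")
    folder = os.path.join(base, "PXEBootFiles")

    run(["net", "stop", service], check=True)
    moved_to = None
    try:
        if os.path.isdir(folder):
            # Keep a timestamped copy so the move can be undone.
            moved_to = f"{folder}.{time.strftime('%Y%m%d%H%M%S')}.bak"
            shutil.move(folder, moved_to)
    finally:
        # Always try to bring the service back, even if the move failed.
        run(["net", "start", service], check=True)
    return moved_to
```

The run parameter exists so you can pass a stand-in that only prints the commands, which makes for a safe dry run before touching a live server.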

If this doesn’t work it might be a more fundamental problem with the PXE Service Point. Remove the role from the server, restart the server hosting the PXE Service Point and add the role back.

Other pre-Windows PE errors

  • \Boot\BCD error

    Assuming you can get past abortpxe.com, there is another error you can see at this stage. After pressing the F12 key to PXE boot you can sometimes see


    Windows Boot Manager (Server IP: x.y.z.a)
    
    Windows failed to start. A recent hardware or software change might be the cause.
    
    File: \Boot\BCD
    Status: 0xc000000f Info: An error occurred while attempting to read the boot configuration data.

    The simple solution is to delete the computer object and recreate it, which should fix this problem. I’ve only ever seen this problem with SCCM 2007 SP2 when deploying Windows 7.

    This does look like a BCD error, but in the SCCM implementation of WDS there is no single boot.bcd file. Instead, the boot.bcd file is created on the fly in the

    RemoteInstall\SMSTemp

    folder with a name of year.month.day.hour.minute.number.number.guid.boot.bcd.

    If anyone knows the actual fix for this (without having to delete the computer object) please post in the comments!

  • Only using 32-bit boot images when you have 64-bit machines in your environment

    Again, this one seems a bit odd. If your workstation is 64-bit (and you’d be hard pressed to find a non-64-bit machine these days), then you need the 64-bit boot files available – even if you are only deploying 32-bit Windows, and are using a 32-bit boot image. The 64-bit boot files are extracted from the boot image and used during the initial PXE process, so if they’re missing, you won’t be able to PXE boot a 64-bit machine.

    If you’re getting this error, you’ll see something like this in smspxe.log

    The SMS PXE Service Point does not have a boot image matching the processor
    architecture of the PXE booting device.

Original post can be found here.