Monday, July 7, 2014

Upgrading ESXi 4.1 to 5.1 causes Emulex 10G NIC to disappear

I encountered this problem while upgrading IBM BladeCenter HS22 blades running IBM's custom version of ESXi 4.1 to 5.1. After the post-upgrade reboot, I noticed that vCenter could not contact the ESXi host. Each blade has two network cards, a Broadcom gigabit adapter and an Emulex OneConnect 10G adapter. I used the vSphere Client to log in directly to the management IP, which still worked because it runs on the Broadcom adapter. Upon checking the configuration, I realised that the Emulex OneConnect 10G adapter was gone!

As I couldn't remember the exact model, I SSHed into the ESXi host and ran this command:
vmkchdev -l | grep vmnic2

My vmnic2 is bound to the Emulex adapter, so change it to whichever vmnic number yours is bound to. The output will look something like this:
00:15:00.0 19a2:0700 10df:e630 vmkernel vmnic2

To interpret the codes, break them down like this:
VID = 19a2 (Vendor Id)
DID = 0700 (Device Id)
SVID = 10df (Sub-Vendor Id)
SDID = e630 (Sub-Device Id)
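
The breakdown above can also be scripted. This is only a minimal sketch: the function name parse_vmkchdev_line is made up for illustration, and it assumes the output line has the same five-column layout as the example above and that awk is available in the shell you run it from.

```shell
# Split one line of `vmkchdev -l` output into its PCI ID fields.
# Assumed layout (from the example above):
#   <address> <VID:DID> <SVID:SDID> <owner> <vmnic name>
parse_vmkchdev_line() {
    printf '%s\n' "$1" | awk '{
        split($2, id, ":")       # vendor ID and device ID
        split($3, sub_id, ":")   # sub-vendor ID and sub-device ID
        printf "VID=%s DID=%s SVID=%s SDID=%s\n", id[1], id[2], sub_id[1], sub_id[2]
    }'
}

parse_vmkchdev_line "00:15:00.0 19a2:0700 10df:e630 vmkernel vmnic2"
# prints: VID=19a2 DID=0700 SVID=10df SDID=e630
```

On the host itself you could feed it the live output instead, for example parse_vmkchdev_line "$(vmkchdev -l | grep vmnic2)".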

More info about other commands can be found here
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1031534

To figure out which device the codes correspond to, go to the VMware I/O compatibility page
http://www.vmware.com/resources/compatibility/search.php?deviceCategory=io

On the right, enter all the values in the additional criteria drop-down lists and click search. The result came back as OCe10102-FM-E Emulex OneConnect 10GbE.
It seems this device is supported on ESXi 5.1, so why was it not being detected? Surely it couldn't be the drivers, so I guessed it might have something to do with the compatibility of the existing firmware.

I searched for the latest firmware directly on the Emulex website (I didn't bother with IBM, since they came up with some weird logic about needing a valid support ticket before allowing you to download)
http://www.emulex.com/downloads/emulex/firmware-and-boot-code/oce10102/firmware-and-boot-code/

I downloaded the offline bootable Connect Flash ISO image and mounted it via the IBM Advanced Management Module (AMM). On reboot, I pressed F12 to boot from CD-ROM and went straight into a screen that detected the firmware version of my Emulex adapter. The upgrade was pretty straightforward; just make sure that if you are using the AMM remotely to mount the ISO, you set the timeout to "no timeout". You don't want the firmware to be flashing from an ISO mounted on your PC only for the AMM session to time out and disconnect the virtual drive... I dread to think of the consequences.

Once done, unmount the ISO and, one quick reboot later, ESXi 5.1 will happily detect the Emulex adapter again.

If it doesn't work for you, try upgrading the Emulex driver to the latest version. You never know what compatibility issues can arise from running a NIC with newer firmware and an older driver. The upgrade is pretty straightforward as well.
  1. Download the latest ESXi 5.x driver from the Emulex website. 
  2. Unzip and transfer the .vib file to the ESXi host's local datastore. You can use WinSCP or any other method that you are familiar with. I simply browsed the datastore using the vSphere Client and uploaded the driver .vib file directly to the local hard drive. 
  3. SSH into the ESXi host 
  4. Run this command (change the path to match your local datastore; the UUID can be found in the storage tab of the vSphere Client)
    esxcli software vib install -v /vmfs/volumes/4d502906-24871744-be9d-e41f13cdb3ee/driver/net-be2net-4.6.247.7-1OEM.500.0.0.472560.x86_64.vib
  5. After seeing a message saying the install was successful, reboot the ESXi host
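
Once the host is back up, one way to confirm the result is to check that the driver VIB is listed and that the NIC shows up again. This is only a minimal sketch: it assumes the driver is named be2net, as in the VIB file name above, and the function name verify_emulex_driver is made up for illustration. It only uses the standard esxcli software vib list and esxcli network nic list commands.

```shell
# Sketch: confirm the be2net driver VIB is installed and the NICs are visible.
# Run on the ESXi host after the reboot. "be2net" is an assumption taken
# from the VIB file name used in step 4.
verify_emulex_driver() {
    # Bail out early when not running on an ESXi host
    if ! command -v esxcli >/dev/null 2>&1; then
        echo "esxcli not found; run this on the ESXi host" >&2
        return 1
    fi
    # List installed VIBs and keep only the be2net entry
    esxcli software vib list | grep -i be2net
    # List physical NICs; the Emulex ports should appear with their driver
    esxcli network nic list
}
```

Call verify_emulex_driver from the SSH session and compare the reported VIB version with the one you just installed.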

Issue with Citrix 4.5 on Windows Server 2003 after applying Microsoft patches

The problem first surfaced right after I patched a bunch of servers in my office. Luckily, I had split the patching of my Citrix farm into several groups. It is never a good idea to patch your entire farm at the same time because when problems start to happen, you cannot isolate the issue.

One weekend, I was applying a whole chunk of patches (not a good idea either) due to the huge backlog from my predecessor. Come Monday, several users started complaining about getting a blank or grey screen when they logged in to their Citrix applications. I tried on my workstation but could not replicate the problem, so I dismissed it as a client-side issue. As the day went on, more and more users started complaining about the problem, and I decided to test it out using a different desktop.

These are my findings.

Affected client:
  • Citrix XenApp ICA listener 4.5

Affected OS:
  • Windows XP
  • Windows 7 Enterprise (some Windows 7 machines work; I am not sure of the exact reason)

Symptoms of the problem:
  • Citrix application opens a window that contains only a grey/blank screen or a light blue screen (depends on the colour of your server desktop)
  • Citrix application opens and automatically closes within a second
  • Citrix launcher loads but application does not open

The issue only happened to applications hosted on Citrix Presentation Servers that had been patched, so clearly something in the patches caused this.

Solution:
  1. Uninstall Microsoft patch KB2922229, released in April 2014, on all affected Citrix Presentation Servers
  2. Restart each server after the uninstall
  3. Some servers might still be affected after uninstalling the patch. Go to the Citrix server policy, then uncheck and re-check the option "User client's local time". Click apply and the issue will be gone.

The issue was also reported by other users here
http://discussions.citrix.com/topic/350441-users-unable-to-login-to-xa-45-w2k3r2-server-after-kb2922229-and-kb2936068-installed/