vCenter UCS Alarm: IPMI SEL, SEL_FULLNESS

Summary:
This alarm means that the server's CIMC system event log (SEL) has filled up.  Below you will find the steps to clear this type of alarm.

PreReq:
This assumes you are utilizing B-series UCS servers.  C-series may be slightly different in practice.

Resolution:

  1. If your vCenter is configured w/ default alarms, you'll likely see this alarm in vCenter under the Hardware Status tab.
  2. To clear this alert, you'll need to empty the SEL Logs in UCS for the blade associated with your service profile.  You are not likely to find SEL Logs as part of the service profile itself; they live on the physical blade.

  3. Once you've opened the related blade, select the SEL Logs tab.  Review and/or export the logs so you do not simply clear something that may need to be investigated.  Once done, you can safely clear the logs.
  4. Once the SEL Logs have cleared, the alert in vCenter should reset to green in a few minutes.

How to: tcpdump UCS Management traffic.

Rather than regurgitate all of the information, here is the skinny:

  1. SSH into UCS Manager (aka the primary fabric interconnect)
  2. connect nxos
  3. ethanalyzer local interface mgmt limit-captured-frames 2000 write volatile:/mycapture.cap
    1. ethanalyzer is the command
    2. local is default
    3. interface so we can tell it where we want to capture packets from.
    4. mgmt is the one I'm interested in
    5. limit-captured-frames is there because the default limit of 10 frames is used up far too quickly when troubleshooting.
    6. write to output a capture file located in volatile memory (deletes when FI is rebooted.)
  4. Exit
  5. connect local-mgmt
  6. cp volatile:/mycapture.cap scp://username@linuxservername/somepath
    1. The 'scp' destination can be ftp, sftp, tftp, volatile, or workspace as well.
    2. The capture file can be read in applications like wireshark.
This helped me figure out my LDAP Authentication issues.

Full article and explanation of how to do what I've outlined above was found here:

Thanks to Jeff for his write-up, otherwise I would've never gotten anywhere with TAC.

Command to capture only LDAP-related traffic:
ethanalyzer local interface mgmt capture-filter "tcp port 389" limit-captured-frames 2000 write volatile:/mycapture.cap

Or, if you're using LDAPS, you need to capture port 636 instead, although I'm not sure how useful the data will be since that traffic is encrypted.

ethanalyzer local interface mgmt capture-filter "tcp port 636" limit-captured-frames 2000 write volatile:/mycapture.cap

UCS bug around Active Directory

Update:
UCS 2.1 addresses this particular issue.  Bug ID: CSCth96721

Summary:
Found an interesting UCS bug on 2.0(3b).  May be resolved in 2.0(4d), but have not tested yet.  This particular problem only manifests itself if your Active Directory tree structure is elaborate and causes a user account's distinguishedName to be longer than 128 characters.

Detailed:
Essentially, UCS queries Active Directory w/ samAccountFilter and receives the results of the query.  It then makes a bind call against the DN using the results it received.  The problem is that the DN variable on the UCS side seems to be limited to 128 characters, so the DN is truncated when UCS makes the bind call.

Workaround:
The only real workaround is to move the affected account to a higher-level OU to shorten its distinguished name.

Powershell:
You can use powershell to determine the length of your distinguished name by utilizing the Quest ActiveRoles PS snapin.
(Get-QADUser UserName).DN.Length
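
To check more than one account at a time, the same length test can be wrapped in a small function. This is only a sketch: Get-LongDistinguishedName is a hypothetical helper name, and the 128-character default is taken from the bug description above rather than from any Cisco documentation.

```powershell
# Hypothetical helper: returns every DN that is longer than the limit.
# The 128-character default is an assumption based on the behavior described above.
function Get-LongDistinguishedName {
    param(
        [Parameter(Mandatory = $true)]
        [string[]]$DistinguishedName,
        [int]$MaxLength = 128
    )
    foreach ($dn in $DistinguishedName) {
        if ($dn.Length -gt $MaxLength) {
            # Emit the length alongside the DN so the output can be sorted or exported.
            New-Object PSObject -Property @{ Length = $dn.Length; DN = $dn }
        }
    }
}

# Example usage (requires the Quest snapin, so it is left commented out):
# Get-LongDistinguishedName -DistinguishedName (Get-QADUser | ForEach-Object { $_.DN })
```

Any account this returns would be a candidate for the OU move described in the workaround.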

Jing, IIS, SWF, and Powershell fun

I've been using Jing to record short tutorial videos and uploading them to a directory on my IIS server.  To view or share them I would have to create a simple HTML file.  I decided to automate this process w/ powershell by having the formatted HTML file generated whenever I throw a swf file into the directory.


# Here is the local directory on the IIS server where I'm throwing my swf files.
# This script is meant to run as a scheduled task every 5 minutes, or at whatever interval you like.
$VidPath = "D:\inetpub\wwwroot\videos"

# Here I'm querying for all the swf files in the directory.
# -eq is used rather than -match, because -match treats "." as a regex wildcard.
$SWFFiles = Get-ChildItem $VidPath | Where-Object {$_.Extension -eq ".swf"}

# This is where I begin to look @ each swf file and check whether it has an associated html file.
foreach ($SWFFile in $SWFFiles)
{
    $HTMLPath = Join-Path $VidPath "$($SWFFile.BaseName).html"
    # If I did not find an associated html file, this is where I create one.
    if (-not (Test-Path $HTMLPath))
    {
        # A here-string avoids having to escape every double quote.
        $HTML = @"
<object width="100%" height="100%">
<param name="movie" value="./$($SWFFile.Name)">
<embed src="./$($SWFFile.Name)" width="100%" height="100%">
</embed>
</object>
"@
        $HTML | Out-File $HTMLPath -Encoding ASCII
    }
}

I use this script in conjunction w/ my iPad directory script for fun.


Powershell, WMI, Local Computer Description, and value out of range error...

Summary:
Needed to update local computer description on servers that I own.  Easy peasy w/ powershell, or so I thought.

PreRequisites:

  1. Powershell 2.0+
  2. Quest.ActiveRoles.AdManagement Snapin
  3. Sysinternals PsExec
Details:
Windows Server 2008 and 'Vista' based kernel systems seem to have some kind of WMI bug.  Searching the web has turned up only a mention of something regarding the use of "ItemIndex".  I'm @ a loss.  The WMI call in this script will work for 2008 R2 systems; for 2008, the only workaround appears to be using the sysinternals psexec cmd to call the net config command on the target system.

Add-PSSnapin Quest.ActiveRoles.ADManagement
$servers = Get-QADComputer -Name "someprefix*"
$Description = "Something I want to insert"

foreach ($computer in $servers)
{
    # Simply a check to see whether the system is active or not.
    $Ping = Get-WmiObject Win32_PingStatus -Filter "Address = '$($computer.Name)'" | Select StatusCode
    If ($Ping.StatusCode -eq 0)
    {
        $computer.Name
        # This will work for all Windows versions.
        # I'm calling the live version, but for speed you may want to download it to your local system.
        & "\\live.sysinternals.com\tools\psexec.exe" "\\$($computer.Name)" net config server "/srvcomment:$Description"
        # This will work for 2008 R2 systems and above.
        # It will return a "Value out of range" error on 2008 systems.
        Set-WmiInstance -ComputerName $computer.Name -Path "Win32_OperatingSystem=@" -Arguments @{Description=$Description}
    }
    else
    {
        Write-Host "$($computer.Name) unreachable"
    }
}

Value out of Range issue:
This error really bothers me even if it is for a relatively small problem.  Here is what occurs:
  1. powershell returns the expected data when reading the description field.
  2. Attempting to modify the value, whether locally or remotely, returns a "value out of range" error.
  3. The type is string and I don't see any reason why it shouldn't update.
  4. Only appears to affect 'Vista' based Operating Systems, such as Windows 2008 Server.
It's most definitely a WMI problem, but I'm rather stumped.
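
One way to avoid triggering the error is to check the OS version first and only use WMI where it is expected to work. The following is a rough sketch, not something I've run across a full server estate: Test-WmiDescriptionSupported is a hypothetical helper name, and the 6.1 cutoff assumes the problem is limited to pre-2008 R2 kernels as described above (Windows Server 2008 reports version 6.0, 2008 R2 reports 6.1).

```powershell
# Hypothetical helper (not part of the original script): decides whether the
# Set-WmiInstance route is expected to work for a given OS version string.
# Assumption: anything older than 6.1 (2008 R2) should go through psexec instead.
function Test-WmiDescriptionSupported {
    param(
        [Parameter(Mandatory = $true)]
        [string]$OSVersion   # e.g. the Version property of Win32_OperatingSystem
    )
    # Casting to [version] compares numerically rather than as plain strings.
    return ([version]$OSVersion -ge [version]"6.1")
}

# Example usage inside the foreach loop (left commented out, needs a live target):
# $os = Get-WmiObject Win32_OperatingSystem -ComputerName $computer.Name
# if (Test-WmiDescriptionSupported $os.Version) { <# Set-WmiInstance #> } else { <# psexec #> }
```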

Configure ESXi Scratch Config w/ Powershell/PowerCLI and other advanced settings...

Summary:
Needed to script the configuration of a scratch location on all my 100+ ESXi hosts.  Having a permanent scratch location configured is helpful when an error such as a purple screen of death (PSOD) occurs on ESXi.  It is not a requirement, but definitely a best practice.

PreRequisites:
  1. Powershell 2.0 +
  2. PowerCLI 5.1 +
  3. vCenter 4.1 +
  4. Local or Shared Datastore
    • Local is easy if you standardize on naming of a local datastore.  
      • I'll focus on this in my script example.
    • Shared Datastore essentially accomplishes the same goal as a remote syslog server; you'll want to be sure to separate each host's logs into their own individual directory.
      • Scaling may become an issue unless you scope these shared datastores to clusters rather than to all hosts.
Details: