Posts

NSX-T: Release associated invalid node ID from certificate

Summary: I had an expiring certificate registered in NSX-T that was associated with a node_id that was no longer valid. Long story short, there was nothing obvious in the API to delete or disassociate a certificate from a node_id in 3.2.2. I'm not sure how things got into this state, but I'm noting it here for future reference. This may change in future releases, so always check the API for the latest.

Details: Effectively, a stale node was associated with a certificate that was expiring. I could not delete the certificate until that node was disassociated from it.

To get the certificate details and associated node_ids, you can use the following curl call (the UI works too):

curl -k -X GET -H "Content-Type: application/json" -u admin https://<manager ip>/api/v1/trust-management/certificates/<cert UUID>

Above will return something like this:

Below must be run from one of the manager nodes via elevation to root:

ONLY RUN THIS IF YOU ARE ABSOLUTELY SURE OF WHAT YOU ARE DOI
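For scripting the certificate lookup (the GET call above) instead of using curl, here is a minimal Python sketch. It only builds the request and reads a parsed response; nothing is sent. The helper names `build_cert_request` and `associated_node_ids` are hypothetical, and the `used_by` / `node_id` field names are my assumption about the response shape, so check them against the API reference for your NSX-T version.

```python
import base64
import urllib.request


def build_cert_request(manager, cert_id, user, password):
    """Build (but do not send) the GET request for a single certificate."""
    url = f"https://{manager}/api/v1/trust-management/certificates/{cert_id}"
    # Same idea as curl's -u flag: HTTP basic auth.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        method="GET",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def associated_node_ids(cert):
    """Return node IDs referenced by a parsed certificate response.

    ASSUMPTION: associations are listed under "used_by", each entry
    carrying a "node_id". Verify against your NSX-T API docs.
    """
    return [
        entry["node_id"]
        for entry in cert.get("used_by", [])
        if "node_id" in entry
    ]
```

Sending the request is left out on purpose. The equivalent of curl's `-k` would mean disabling TLS verification, which is better decided per environment.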

iOS: Sleep Focus activating on wrong time zone

Time is Relative

Summary: For some strange reason, my Sleep Focus was activating based on my home time zone of EST while I was traveling to Japan and Australia. My phone's time zone was correct, as was my Apple Watch, which is set to mirror my iPhone.

Resolution:
1. Settings --> Privacy & Security --> Location Services
2. Turn off Location Services
3. Turn on Location Services

For a quick test, you can edit your sleep schedule in Apple Health to a time window that matches your home time zone and see if it activates again immediately. For example:

9:30 pm EST = 12:30 pm Brisbane

So if you set your sleep schedule to 9:30 pm while it is currently 12:30 pm in Brisbane, Sleep Focus should activate immediately if the Location Services reset above was not done.

For some strange reason, I was not able to fix this by restarting the iPhone. Anyway, I thought I'd post this since I'm likely to forget to try it.
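To work out the test time for the quick check above from any home bedtime, the conversion can be sketched in a few lines of Python. This assumes fixed offsets (EST at UTC-5, Brisbane at UTC+10 with no daylight saving); the calendar date is arbitrary and the helper name `to_brisbane` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets (assumption): EST = UTC-5, Brisbane = UTC+10, no DST.
EST = timezone(timedelta(hours=-5), "EST")
BRISBANE = timezone(timedelta(hours=10), "AEST")


def to_brisbane(hour, minute):
    """Convert a home (EST) wall-clock time to Brisbane local time."""
    home = datetime(2024, 1, 15, hour, minute, tzinfo=EST)  # arbitrary date
    return home.astimezone(BRISBANE)


local = to_brisbane(21, 30)  # 9:30 pm EST bedtime
print(local.strftime("%H:%M"))  # 12:30, on the following calendar day
```

If your home zone observes daylight saving (EDT instead of EST), the offset shifts by an hour, so adjust accordingly.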

Azure VMware Solution: NSX-T Active/Active T0 Edges...but

Summary: By default, Azure VMware Solution (AVS) is delivered with a pair of redundant Large NSX-T Edge VMs, each running a T0 in active/active mode. So why is my traffic only going out one Edge VM?

Short answer: The default T1 delivered with AVS, which is where you connect your workloads, is an active/passive T1. So while traffic could technically take either T0, it is always going to go out the T0 closest to the active "SR" T1. Where do the SRs live? You guessed it, on the Edge VMs. As you can imagine, this can lead to a bottleneck if you try to shove all your traffic through a single Edge VM.

Simple Diagram:

Longer answer with Options:

vCenter: Cluster Skip Quickstart Workflow via API

Summary: Whenever you reset vCenter, you might end up with a warning on a cluster running vSAN that's just annoying. To stop it from alerting, you need to disable Quickstart. That is easy enough via the UI, but the API is a little weird here.

Details: For one, code capture doesn't seem to understand this, so no help there unfortunately. Secondly, nothing named "quickstart" is in the API, which made this somewhat annoying to find. It seems someone asked this question on the VMware communities forum 2 years ago with no answer. Someone asked me internally, so I had to dig into it.

Basically, two things:
1. You can create a cluster with Quickstart disabled from the get-go by passing a false boolean to a parameter named "InHciWorkflow" via an API/PowerCLI call.
2. To "skip QuickStart" on an already created cluster, you can call a method called "AbandonHciWorkflow".

So yeah, you can see how "quickstart" and "HCIWorkfl
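As a rough illustration of the two options above ("InHciWorkflow" at creation time, "AbandonHciWorkflow" afterwards), here is a minimal Python sketch. It assumes pyVmomi, where I believe the property is spelled `inHciWorkflow` on `vim.cluster.ConfigSpecEx` and the method is exposed on the cluster object; treat both spellings as assumptions and confirm them against the vSphere API reference. The function names `skip_quickstart` and `new_cluster_spec` are hypothetical.

```python
def skip_quickstart(cluster):
    """Skip Quickstart on an existing cluster.

    `cluster` is expected to be a cluster managed object that exposes
    AbandonHciWorkflow() (method name taken from the post above).
    """
    cluster.AbandonHciWorkflow()


def new_cluster_spec(in_hci_workflow=False):
    """Build a cluster spec with the HCI (Quickstart) workflow flag set.

    ASSUMPTION: requires the third-party pyVmomi package, and the
    property is spelled `inHciWorkflow` there.
    """
    from pyVmomi import vim  # imported lazily; not part of the stdlib

    spec = vim.cluster.ConfigSpecEx()
    spec.inHciWorkflow = in_hci_workflow
    return spec
```

The spec would then be passed to whatever cluster-creation call you already use; connecting to vCenter and locating the cluster are left out.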

NSX-T: Find and Delete Orphaned Ports

Summary: I had a bunch of orphaned ports (65,000+). I don't know why or how it happened (hypothetically NTP related), but I needed to clean them up. Doing it via the UI was obviously not an option, as it would only return 50 ports per page. Oh, and it wouldn't refresh after every delete.

Details: I'm saying 'orphaned', but in reality I'm only keying off the fact that the port is reporting "Operationally Down". This could simply be a powered-off VM, but there is little harm in deleting these types of ports, as they will simply be recreated if that VM is powered on. This may not apply in all situations, so use this with caution.

PowerShell Example(s):

References: https://www.virten.net/2021/03/error-when-connecting-virtual-machine-to-nsx-t-segments/
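The find step described in this post (page through every port, keep the ones reporting down) can be sketched in Python as below. This is not the PowerShell from the post. The HTTP calls are injected as plain callables so the logic stays self-contained; `fetch_page` and `get_status` are hypothetical helpers you would write around your own API client. The `results` / `cursor` paging fields and the "DOWN" status value are my assumptions about the NSX-T responses, so verify them for your version.

```python
def iter_ports(fetch_page):
    """Yield every port, following the cursor until pages run out.

    fetch_page(cursor) must return a dict with a "results" list and,
    when more pages exist, a "cursor" string (assumed response shape).
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("results", [])
        cursor = page.get("cursor")
        if not cursor:
            break


def down_port_ids(ports, get_status):
    """Return IDs of ports whose operational status is DOWN.

    get_status(port_id) must return the status string for that port.
    """
    return [p["id"] for p in ports if get_status(p["id"]) == "DOWN"]
```

Keeping the delete as a separate step makes it easy to review the candidate list first, which matters because a powered-off VM's port also reports down.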

vSAN: The cascade scenario that vSAN stretch cluster has issues with...

Summary: While testing stretch cluster, we ran into strange failover behavior. Namely, failover was simply not occurring. During this testing, we found a dirty little secret about stretch cluster failovers, one that makes me rethink whether stretch clusters are really worth doing.

Documented Failure Scenarios

Details: All documented scenarios effectively deal with a 'single' type of failure. The problem is that disasters and failures can be multi-faceted and, in some instances, cascading. Take the Secondary Site Failure or Partitioned scenario, add a 'cascading failure' to it, and you end up in a whole world of trouble depending on the next 'failure'.

Below effectively depicts the failure of the interconnect between the two sites. What this fails to take into account is that there are typically three things involved:
1. The networking between the two sites
2. The preferred site routers
3. The secondary site routers

So here is a slightly more involved d

NSX-T: Deleting route advertisement filters via API

Summary: When you create a DHCP server in NSX-T, a route advertisement filter is automatically created for you. This prevents the DHCP server from advertising DHCP addresses outside of your fabric. This is fine for the most part, but there are occasions where the DHCP subnet you allocated may overlap a DNS forwarder IP that you set up earlier. Honestly, this feels like a logic bug; it shouldn't allow this, but oh well.

Detailed Steps

Anyway, all you have to do is delete the DHCP server in question, but in some cases the route filter may not be deleted along with it. In that case, you can delete the route filter itself via the Manager UI:

Select Manager > Networking > Tier-1 Logical Routers > T1 in question > Routing > Route Advertisement > Select DHCPServerRouteFilter > Delete.

If the delete option is greyed out, you can use the below curl code to clear it out. This is the last ditch effort, so only do it if yo