I received a question recently from Anthony Padula that asked if SAN Port Channels were supported on a Nexus 5000 running in NPV mode. After speaking with a few colleagues, locating the documentation, and setting it up in the lab, I am happy to say that the answer is yes!
However, as usual, I found the documentation somewhat confusing. So in this post I’m going to share the topology, caveats, and configuration steps that I used, as well as the output from some helpful show commands. I’ll also include some information on the link failover testing that I performed.
Topology
The topology that was configured is shown in the following diagram.
The 9148 was running 5.0(1a) and the Nexus switches were all running 5.0(2)N1(1).
Caveats
- On the 9148, the “san-port-channel” interface is referred to as a “port-channel”. This can be confusing when configuring a SAN Port Channel between a Nexus 5k and an MDS.
- The interface defaults for trunk mode are different on a Nexus 5k running in NPV mode than they are for a Nexus 5k running in FC-SW mode. As a result, when you do a side-by-side comparison of the interfaces using “show run interface fc2/x”, the configurations will appear to be different. Instead, use “show interface fc2/x” which allows you to determine how the port is actually configured and is not dependent on the default interface settings.
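To make the two caveats concrete, here is a small sketch of the commands involved. The port channel number 100 and interface fc2/1 are only placeholders, and the lines starting with "!" are comments. The first two commands are entered in configuration mode; the show commands are entered at the exec prompt.

```
! Nexus 5k: the logical interface is called a san-port-channel
interface san-port-channel 100

! MDS 9148: the equivalent logical interface is called a port-channel
interface port-channel 100

! When comparing ports on two switches, look at the operational state ...
show interface fc2/1

! ... rather than the running config, which omits settings left at their
! defaults (and the defaults differ between NPV mode and FC-SW mode)
show running-config interface fc2/1
```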
Setup steps
The tables that follow provide a list of commands to use on each switch. If you follow the steps in order, the SAN Port Channel should come up and not require additional configuration.
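For the NPV case specifically, the general shape of the configuration looks something like the sketch below. This is an illustration only and not a copy of the tables in this post: the port channel number, the fc2/1 member interface, and the choice to trunk are assumptions, and it assumes NPV mode has already been enabled on the edge switch. Lines starting with "!" are comments.

```
! Core switch (FC-SW mode), for example the 5020
feature npiv
feature fport-channel-trunk
interface san-port-channel 200
  channel mode active
  switchport mode F
  switchport trunk mode on
! Repeat for each member interface
interface fc2/1
  switchport mode F
  channel-group 200 force
  no shutdown

! Edge switch (already running in NPV mode), for example a 5548
interface san-port-channel 200
  channel mode active
  switchport mode NP
  switchport trunk mode on
! Repeat for each member interface
interface fc2/1
  switchport mode NP
  channel-group 200 force
  no shutdown
```

The force keyword makes the member interface inherit the compatibility parameters of the port channel instead of being rejected when its own settings differ.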
SAN Port Channel 100 (Spo100)
To configure Spo100, both the 9148 and the 5020 need to be configured as follows:
SAN Port Channel 200 (Spo200)
To configure Spo200, both the 5020 and 5548-2 need to be configured as follows:
SAN Port Channel 256 (Spo256)
To configure Spo256, both the 5020 and 5548-2 need to be configured as follows:
Output from the show interface brief command
When the commands above have all been entered, the output from the show interface brief command should appear as follows.
Note: The output has been truncated.
MDS-9148
Nexus-5020
Nexus-5548-1
Nexus-5548-2
Observations from testing
Once the SAN Port Channels were up, I checked connectivity to storage using INQ and started I/O using COPA. In case you don’t know, INQ and COPA are two of EMC’s proprietary tools that are regularly used in the E-Lab. Once I verified there weren’t any basic errors on the hosts or storage ports, I decided to run a couple of tests to see what kind of an impact the SAN Port Channel feature has on connectivity during a few failure scenarios.
During each test, I used an xgig analyzer to monitor a couple of the CNAs that were using the SAN Port Channel to access storage. Since I was physically pulling cables and expected many frames to be dropped during each test, I was more concerned with measuring the amount of time a host lost connectivity to storage than with the number of frames dropped while pulling a cable. As a result, the analyzer was primarily being used to monitor the CNA’s throughput rather than to capture, analyze, and characterize the disruption (e.g., number of frames dropped). Also, in all tests, the amount of bandwidth being used was less than the bandwidth available on any single member of the SAN Port Channel. This kept the surviving link from becoming oversubscribed during the testing, which would have allowed congestion and backpressure to skew the test results. For more information about congestion and backpressure, refer to the Networked Storage Concepts and Protocols TechBook.
Single Link failover / failback
With I/O running, I physically pulled the cable out of one of the switch interfaces that was a member of the SAN Port Channel. I then reinserted the cable and after waiting for a few seconds, I pulled the other member of the SAN Port Channel and then pushed it back in. I repeated this test a few times on all three SAN Port Channel configurations shown in the diagram above. In all cases, the disruption was negligible and I saw virtually no drop in throughput via the xgig interface. In addition, I was very happy with the improvement in behavior between the Nexus running in NPV mode and the 5020. Previously, when removing one of the links between a switch running in NPV mode and its “core” switch, a number of the hosts would be logged out and then allowed to re-login via a different uplink to the core switch. With the SAN Port Channel, as long as one of the member ports was active at all times, none of the hosts on the NPV switch were impacted.
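If you want to confirm the same behavior from the switch CLI while pulling cables, the following show commands are one way to do it. This is a sketch of commands I would expect to be useful rather than output captured during these tests. Lines starting with "!" are comments.

```
! On the Nexus running in NPV mode: check the port channel members,
! the state of the NP uplinks, and which uplink each host login is using
show san-port-channel summary
show npv status
show npv flogi-table

! On the core 5020: confirm the host logins are still present
show flogi database
```

If the host entries stay in place while one member link is down, the hosts were not logged out and did not have to log back in.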
All links in SAN Port Channel pull / push
In this test, I physically pulled the cables out of all of the switch interfaces that were members of the SAN Port Channel. As expected, while the cables were disconnected, all of the hosts using the SAN Port Channel lost connectivity to storage. The interesting thing is what happened when I reconnected the cables. The switch-to-switch (TE_Port to TE_Port) connections took about 20 seconds to come back up. This is not unexpected since an approximate 20-second delay is built into the FC-SW standard when bringing up an ISL. If you are interested in more information about this delay, you can check out the Networked Storage Concepts and Protocols TechBook. Surprisingly, the connection from the 5548 running in NPV mode to the 5020 took a similar amount of time to come all the way up. I haven’t had a chance to dig into the reason for this yet, but I expected it to be much shorter.
Conclusion
Overall, I was impressed with the SAN Port Channel functionality and would use it in my environment, especially if I had switches in NPV mode. I didn’t perform enough testing to be able to comment too much on the load balancing aspect of the feature, but I did notice that the load appeared to be equally distributed across both links. As I continue to test with the feature, I’ll dedicate some time to giving you all updates.
As always, thank you for reading!
Thanks for information Erik. Sounds like everything worked as it should, but nice to see verification of it in writing from a source I trust.
Posted by: Chris Carter | 12/20/2010 at 12:32 PM
Very nice article Erik! Do you think NP SAN PortChannel is also possible with a Brocade upstream switch?
Posted by: Sam | 03/07/2011 at 05:39 AM
Thanks! No, I don't believe an NP Port-Channel would work with a Brocade upstream switch. That having been said I haven't actually tried it...
Posted by: Erik Smith | 03/07/2011 at 06:39 AM
Errata: I'm not sure why (yet) but while I was configuring this recently on a new 5596, I needed to add force when adding the channel-group to a set of interfaces.
The exact issue is in the "SAN Port Channel 200 (Spo200)" section. Under the steps for configuring 5548-2, the line "channel-group 200" should be "channel-group 200 force".
Posted by: Erik Smith | 05/17/2011 at 02:03 PM
Excellent post.
On the N5k side, do you have to configure the switch for either NPV mode or FC-SW mode, thereby restricting the switch to act as either a SAN switch or a traditional Ethernet switch?
I am wondering if one switch can be used for both your storage traffic and Ethernet traffic.
Posted by: kumar | 10/01/2011 at 08:53 AM
Hi Kumar, Thanks!
You can use either NPV mode or FC-SW mode to support FCoE. Non-FCoE traffic will work the same in either of these modes.
On UCS, I believe there is also a feature that allows you to put the non-FCoE side of the Ethernet switch into either an NPV mode or a switch mode...
Posted by: Erik Smith | 10/01/2011 at 02:56 PM