It seems that every month brings news of yet another overlay network: STT, OTV, VXLAN, NVGRE and maybe many more I've not heard of. People often use the words "ships in the night" in the same sentence, referring to the fact that these overlay networks know nothing about the underlying physical network infrastructure. However, nowadays even the smallest seagoing or oceangoing vessel not only knows about any ship nearby, but is also acutely aware of a few satellites, thanks to radar and GPS systems.
So, to keep up with the times and advances in technology (after all, we are in the tech business), I propose a new analogy for overlay and physical networks: the polar bear and the penguin. Those two definitely do not meet in real life. I am not considering corner cases like the zoo.
Musings about various system administration and network projects I am working on. Lab use only.
Val:~$ whoami
I am Val Glinskiy, network engineer specializing in data center networks. TIME magazine selected me as Person of the Year in 2006.
Tuesday, September 10, 2013
Wednesday, September 04, 2013
Rant
Any vendor requiring registration in order to access online documentation should be condemned to untangle network cables for the rest of their product's life.
Wednesday, August 07, 2013
Friday, July 19, 2013
LACP timer and what it means
LACP (IEEE 802.3ad) is a protocol used to bundle several physical interfaces into a single logical channel. It has a timer which defines how often the devices interconnected via this bundle exchange LACP PDUs, or control messages. Currently, this timer can be set to either "rate fast" (1 second) or "rate normal" (30 seconds). What is not always clear is that when you configure "lacp rate fast" on Cisco or "set interfaces ae1 aggregated-ether-options lacp periodic fast" on Juniper, you do not configure how often this switch will send LACP PDUs. The command means that the switch where it is applied will expect to receive LACP PDUs at this frequency from the partner on the other side of the logical channel.
Here is a quick test. I have a Nexus5500 connected to a Cat6500. Let's configure a port-channel between them with one physical member interface.
Cat6500#show run interface TenGigabitEthernet 1/5
!
interface TenGigabitEthernet1/5
switchport
switchport trunk encapsulation dot1q
switchport mode trunk
lacp rate fast
channel-group 5 mode active
Nexus5500# show running-config interface Ethernet 1/1
!
interface Ethernet1/1
switchport mode trunk
channel-group 1 mode active
"lacp rate normal" is the default setting on Nexus, so this command does not show up in the output, but we can confirm it:
Nexus5500# show running-config interface Ethernet 1/1 all | include lacp
lacp port-priority 32768
lacp rate normal
The Cat6500 is configured with rate fast and the Nexus5500 with rate normal. Let's see what's going on behind the scenes.
On Catalyst:
Cat6500#show lacp internal
Flags: S - Device is requesting Slow LACPDUs
F - Device is requesting Fast LACPDUs
A - Device is in Active mode P - Device is in Passive mode
Channel group 3
LACP port Admin Oper Port Port
Port Flags State Priority Key Key Number State
Te1/5 FA bndl 32768 0x3 0x3 0x106 0x3F
The F flag says that the Cat6500 is requesting fast LACP PDUs from its partner.
On the Nexus it's a little bit backwards: the "show" command tells you the partner's status, not its own.
Nexus5500# show lacp neighbor interface port-channel 1
Flags: S - Device is sending Slow LACPDUs F - Device is sending Fast LACPDUs
A - Device is in Active mode P - Device is in Passive mode
port-channel1 neighbors
Partner's information
Partner Partner Partner
Port System ID Port Number Age Flags
Eth1/1 20,0-13-5f-20-63-80 0x106 910 SA
LACP Partner Partner Partner
Port Priority Oper Key Port State
32768 0x3 0x3f
The Nexus5500 says that its partner, the Cat6500, is sending LACP PDUs every 30 seconds.
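The relationship between the configured rate and the rate actually used on the wire can be sketched in a few lines of Python. This is only a toy model of the behavior described above, not anything that runs on a switch: the `LacpPort` class and `transmit_intervals` function are hypothetical names made up for illustration, and the 1 second and 30 second values are the fast and normal periods quoted earlier.

```python
from dataclasses import dataclass

# Periods quoted in the post: "rate fast" is 1 second, "rate normal" is 30 seconds.
FAST_PERIODIC_S = 1
SLOW_PERIODIC_S = 30


@dataclass
class LacpPort:
    """Hypothetical model of one end of the bundle and its configured rate."""
    name: str
    rate: str  # "fast" or "normal"

    def requested_interval(self) -> int:
        """Interval this port asks its partner to use when sending LACP PDUs."""
        if self.rate not in ("fast", "normal"):
            raise ValueError(f"unknown LACP rate: {self.rate}")
        return FAST_PERIODIC_S if self.rate == "fast" else SLOW_PERIODIC_S


def transmit_intervals(a: LacpPort, b: LacpPort) -> dict:
    """Each side transmits at the interval requested by the other side."""
    return {a.name: b.requested_interval(), b.name: a.requested_interval()}


# Lab setup from the test: Cat6500 is "rate fast", Nexus5500 is "rate normal".
lab = transmit_intervals(LacpPort("Cat6500", "fast"), LacpPort("Nexus5500", "normal"))
print(lab)
```

In this model the Cat6500 ends up sending every 30 seconds, because that is what the Nexus5500 asked for, while the Nexus5500 sends every second.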
Monday, July 15, 2013
Find MAC address for IPv4 Multicast group
When troubleshooting a multicast problem, I often find myself checking whether IGMP snooping works as intended. "show mac address-table multicast" on Cisco switches shows the MAC addresses of multicast groups. Tired of converting multicast IPs into MACs with pencil and paper, I wrote my first ever script in Python, which does just that. It takes the IP address of a multicast group as a parameter. Although I did some testing, there might be bugs, so beware.
See Cisco's white paper for an explanation of how the conversion is done.
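For readers who just want the gist of the conversion, here is a minimal sketch. It is not the script mentioned above, only an illustration of the standard mapping: the low-order 23 bits of the IPv4 group address are copied into the fixed 01:00:5e prefix. The function name `multicast_ip_to_mac` is made up for this example, and the output uses the dotted format Cisco switches display.

```python
import ipaddress


def multicast_ip_to_mac(ip: str) -> str:
    """Map an IPv4 multicast group address to its Ethernet MAC address."""
    addr = ipaddress.IPv4Address(ip)
    if not addr.is_multicast:
        raise ValueError(f"{ip} is not an IPv4 multicast address")
    # Keep only the low-order 23 bits of the group address.
    low23 = int(addr) & 0x7FFFFF
    # Place them under the fixed 01:00:5e prefix (the 24th bit stays 0).
    mac = 0x01005E000000 | low23
    raw = f"{mac:012x}"
    # Cisco-style dotted notation: xxxx.xxxx.xxxx
    return ".".join(raw[i:i + 4] for i in range(0, 12, 4))


print(multicast_ip_to_mac("239.255.255.250"))
```

Because only 23 of the 28 variable bits survive, 32 different group addresses share each MAC; for example, 224.0.0.1 and 224.128.0.1 both map to 0100.5e00.0001.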
Saturday, July 13, 2013
The wait is over
Finally, in NX-OS 6.0(2) for the Nexus 5000 platform, Cisco implemented the "default interface" command, which lets you return an interface to its factory default configuration. It is a very, very, very useful feature in a lab environment, where one has to do a lot of reconfiguration and something does not work as expected simply because of leftover configuration from the previous test.
This command has been available in IOS since 11.1 and in NX-OS for the Nexus 7K since 5.1(1).
Wednesday, May 29, 2013
Juniper haiku
No wonder Juniper software is so bloated :).
admin@switch> show version and haiku
Hostname: switch
Model: ex4200-48t
JUNOS Base OS boot [10.4R12.1]
JUNOS Base OS Software Suite [10.4R12.1]
JUNOS Kernel Software Suite [10.4R12.1]
JUNOS Crypto Software Suite [10.4R12.1]
JUNOS Online Documentation [10.4R12.1]
JUNOS Enterprise Software Suite [10.4R12.1]
JUNOS Packet Forwarding Engine Enterprise Software Suite [10.4R12.1]
JUNOS Routing Software Suite [10.4R12.1]
JUNOS Web Management [10.4R12.1]
Now that Zion's safe,
can they find a better place
to get some sweaters?
admin@switch> show version and haiku
Juniper babies
The next generation starts
Gotta get more sleep
admin@switch> show version and haiku
3am; darkness;
Maintenance window closing.
Safety net: rollback.
admin@switch> show version and haiku
Shiny leather pants
Why don't they squeak when they kick?
Is _that_ the secret?
Saturday, April 20, 2013
PACL and MAC address learning
I have the unenviable task of dragging a legacy application into the 21st century. I am talking about 1980s legacy, and some of its functions do not even use IP protocols. One of the proposed solutions included 2 servers with the same IP and MAC addresses (I know, but splitting the network into 2 separate VLANs was not an option) connected to different switches, but in the same VLAN. ClientA and ClientB should be able to talk to each other and to the server connected to the same switch as the client. ServerA and ServerB should not even know about each other's existence, so they won't complain about a duplicate IP address. The switches are Cisco 6500s. One of the obvious solutions is to put a Port Access Control List (PACL) on either side of the inter-switch link. The PACL successfully blocked the traffic between the servers, but the switches still learned the MAC address of the blocked traffic's source and placed it in the MAC address table. This would cause MAC address flapping on the switch every time clients send an ARP query for the server's MAC or when both servers need to send traffic to their clients. Why would a switch need to keep the MAC address of discarded traffic? Oh, well. Another network mystery.
Friday, April 12, 2013
"private-vlan synchronize" pitfall
Let's say you need to configure a Private VLAN on a Cisco Nexus switch running MSTP. You create a new VLAN, make it isolated and map it to the primary VLAN. Your newly created VLAN is automatically mapped to MST0, and if your primary happens to be in any other MST instance, there is a possibility that your primary and secondary VLANs end up with different L2 paths. As a bonus, you get the annoying message "These secondary vlans are not mapped to the same instance as their primary" every time you run "show spanning-tree configuration". Not a big deal, but things like these irritate me. Not to worry, "private-vlan synchronize" under "spanning-tree mst configuration" will automatically map all secondary VLANs to the same MST instance as the primary VLAN. The moment it's done, your MST digest changes and boom, you have a brand new MST region and STP convergence on top of it. So, either pick an existing VLAN already mapped to the same instance and convert it to a secondary community or isolated VLAN, or have all your VLANs mapped to MST0 and use other methods to load-share traffic between transit links.
I know what you are thinking. No, I did it in the lab, so you won't have to.
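Why does remapping a single VLAN create a new region? The MST configuration digest is computed over the whole VLAN-to-instance table, so any change to the table yields a different digest, and switches whose digests differ treat each other as separate regions. The sketch below illustrates this under stated assumptions: the digest is taken as HMAC-MD5 over 4096 two-byte instance IDs, the key is the signature key as I recall it from IEEE 802.1Q (verify it against the standard before relying on exact values), and the VLAN numbers 100 and 200 and the function name `mst_config_digest` are made up for illustration.

```python
import hashlib
import hmac

# Assumption: signature key as recalled from IEEE 802.1Q; check the standard.
MST_DIGEST_KEY = bytes.fromhex("13AC06A62E47FD51F95D2BA243CD0346")


def mst_config_digest(vlan_to_instance: dict) -> str:
    """Digest of a VLAN-to-MST-instance mapping; unmapped VLANs fall into instance 0."""
    table = bytearray()
    for vid in range(4096):
        # Entries 0 and 4095 are not usable VLANs and are left at 0.
        mstid = vlan_to_instance.get(vid, 0) if 1 <= vid <= 4094 else 0
        table += mstid.to_bytes(2, "big")
    return hmac.new(MST_DIGEST_KEY, bytes(table), hashlib.md5).hexdigest()


# Hypothetical example: primary VLAN 100 in instance 1, new secondary VLAN 200 in MST0.
before = mst_config_digest({100: 1})
# After synchronization the secondary follows its primary into instance 1.
after = mst_config_digest({100: 1, 200: 1})
print(before != after)
```

Whatever the exact key, the point stands: the switch that ran the synchronization now advertises a different digest than its neighbors until they all carry the same mapping.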
Tuesday, March 05, 2013
How complex systems fail
A must-read paper for every system architect. There is also a recording of the talk given at the Velocity conference by Richard Cook. The main takeaway for me: don't build a reliable system, build a resilient system.
Monday, February 11, 2013
Port-security side effect
I discovered an interesting side effect of configuring port-security, which can prevent its deployment in certain circumstances. Let's have a look. Here is the relevant part of the interface configuration:
switchport port-security maximum 20
switchport port-security
switchport port-security aging time 1440
switchport port-security violation restrict
switchport port-security aging type inactivity
You can find explanations of what each command does here.
switch# show port-security interface gi 0/45
Port Security : Enabled
Port Status : Secure-up
Violation Mode : Restrict
Aging Time : 1440 mins
Aging Type : Inactivity
SecureStatic Address Aging : Disabled
Maximum MAC Addresses : 20
Total MAC Addresses : 1
Configured MAC Addresses : 0
Sticky MAC Addresses : 0
Last Source Address:Vlan : abcd.ef12.3456:1234
Security Violation Count : 0
In case you were wondering, the MAC address above is completely made up, but it is associated with the "floating" IP assigned to the active device in a cluster. Let's see what happens when the active IP has to be moved to another device in the cluster due to fail-over:
port_security-2-psecure_violation: security violation occurred, caused by mac address abcd.ef12.3456 on port gigabitethernet0/46.
Oops. Even though the port-security configuration allows up to 20 MAC addresses to be learned and there are no MAC addresses on Gi0/46, we got a port-security violation. Why? Let's look at the output of "debug port-security":
PSECURE: psecure_add_addr_check: Found duplicate mac-address abcd.ef12.3456, It is already secured on Gi0/45
%PORT_SECURITY-2-PSECURE_VIOLATION: Security violation occurred, caused by MAC address abcd.ef12.3456 on port GigabitEthernet0/46.
PSECURE: Security violation, TrapCount:346
PSECURE: Read:2830, Write:2831
PSECURE: swidb = GigabitEthernet0/46 mac_addr = abcd.ef12.3456 vlanid = 1234
PSECURE: Adding abcd.ef12.3456 as dynamic on port Gi0/46 for vlan 1234
PSECURE: Violation/duplicate detected upon receiving abcd.ef12.3456 on vlan 1234: port_num_addrs 1 port_max_addrs 20 vlan_addr_ct 1: vlan_addr_max 20 total_addrs 4: max_total_addrs 6144
The port-security violation happened because the MAC address had not yet been deleted from the original port, hence the "duplicate mac-address" message. To mitigate, but not completely eliminate, the problem, we can reduce the aging timer to its 1-minute minimum. It still means that in case of fail-over, the floating IP address will not be accessible for another minute, which could be 1 minute too long.
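The behavior seen in the debug output can be reproduced with a toy model. This is only a sketch of the logic as it appears from the logs, not Cisco's implementation: the `PortSecurityTable` class is hypothetical, time is counted in whole minutes, and inactivity aging is modeled as "the entry expires once the address has not been seen on its secured port for the aging time".

```python
class PortSecurityTable:
    """Toy switch-wide table of secured MAC addresses with inactivity aging."""

    def __init__(self, aging_minutes: int, max_per_port: int):
        self.aging = aging_minutes
        self.max_per_port = max_per_port
        self.secured = {}  # (vlan, mac) -> (port, minute last seen on that port)

    def _expire(self, now: int) -> None:
        # Drop entries that have been inactive for the full aging time.
        self.secured = {k: v for k, v in self.secured.items()
                        if now - v[1] < self.aging}

    def frame_seen(self, port: str, vlan: int, mac: str, now: int) -> str:
        self._expire(now)
        key = (vlan, mac)
        entry = self.secured.get(key)
        # Same MAC already secured on a different port: duplicate, so violation.
        if entry is not None and entry[0] != port:
            return "violation"
        if entry is None:
            count = sum(1 for p, _ in self.secured.values() if p == port)
            if count >= self.max_per_port:
                return "violation"
        self.secured[key] = (port, now)
        return "secured"


table = PortSecurityTable(aging_minutes=1440, max_per_port=20)
print(table.frame_seen("Gi0/45", 1234, "abcd.ef12.3456", now=0))  # active node
print(table.frame_seen("Gi0/46", 1234, "abcd.ef12.3456", now=5))  # after fail-over
```

With the 1440-minute timer the second call is a violation even though Gi0/46 holds no addresses; with `aging_minutes=1` the same frame is accepted once a minute has passed since the address was last seen on Gi0/45.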