vEdge Tools

I am new to the vEdge, and having spent most of my working life on the command line, I find it hard not to use it when troubleshooting. SD-WAN and other technologies seem to be pushing for the self-healing network: the network that can fix everything itself, with little or no networking knowledge required of administrators.

Although this is great for routing protocols to re-converge or traffic engineering that automatically re-routes traffic, sometimes you still need to send a telnet packet on port 5568 for that Dev guy in Finance.

On the vEdge, you do have access to some installed tools. In one recent troubleshooting event I had to test whether TACACS packets were getting to the TACACS server. You can’t use the telnet command, but if you type the following on the vEdge, some nice little tools are presented.

# tools ?

Possible completions:

ike-debug

ip-route

iperf

minicom

netstat

nping

ss

stun-client

vtysh

So, for today’s blog we will look into nping. These tools have been around for a long time in the Linux world, so they are not new, but usually you have to install them on laptops or other machines to use them. Here on the vEdge you have direct access to the tools.

To craft a TACACS packet, using nping, I did the following –

(Don’t forget the quotation marks!)

tools nping vpn <xxx> options "--tcp -p 49" 10.x.x.x

xxx = VPN number of source interface being used

--tcp = use TCP

-p 49 = use destination port 49

With this command I was able to send a TCP probe to the application port and see the TCP packets sent and received. The output includes the TTL, IP length, sequence numbers, window size and MSS.

Check out the list of all options for nping here –

https://www.cisco.com/c/en/us/td/docs/routers/sdwan/command/sdwan-cr-book/operational-cmd.html#wp5736741660

~Brad

The NeverEnding Default Route….Part 2.

So, after the IOS upgrade the route is now stable! I have taken a new packet capture and the default route is still being sent to us, but now our router no longer recalculates it and re-enters it into the routing table.

I have informed Cisco and am waiting to see if they have an explanation. My theory now is that the ISP repeatedly sending the default is some type of soft reconfiguration or route refresh issue. Our side must have had a similar problem, where a bug caused the router to constantly recheck the route and re-enter it.

Until Cisco get back to me, I can’t do any more testing as I don’t have such a lab.

~Brad.

The NeverEnding Default Route….

The NeverEnding Story….whoa…whoa…whoa!

Loved that movie as a kid.

Moving on, and speaking of moving on, I have been made redundant at my current workplace. I have one week remaining, and ever since we installed a new Internet link I have been looking at a very peculiar default route issue.

The ISP, as proven by Wireshark, is sending me an update packet for the default route every two seconds. This causes our router to re-install the route into our routing table, repeatedly. The initial troubleshooting began with changing the advertisement interval of BGP to 60 seconds. So now, the default route update packet comes in every 60 seconds.

Pretty obvious it’s a carrier issue.

I spoke with colleagues, Cisco TAC, posted on message boards and everyone is of the opinion that nothing on our router could ask our BGP neighbor to send us the default every 60 seconds.

Last night, the carrier showed me a lab they had built with the exact hardware in use. It doesn’t occur in the lab. I found out the IOS they are using for our device is newer, and the config is a little light. They are only advertising a default, where the real link is advertising a full BGP table and we are filtering.

Still, it has me very puzzled. I did ask for a packet capture on their side, to see if they send an update packet to other customers, but they decided the lab was a better avenue to explore.

So, tonight a colleague of mine is updating the memory and upgrading the IOS.

Will the IOS fix the issue?

Will the IOS upgrade stop the BGP peer from sending default route updates?

Will Batman and Robin escape from the Joker trap?

Stay tuned.

~Brad.

Cisco Community Post –

https://community.cisco.com/t5/routing/bgp-no-keepalives-and-two-second-updates/td-p/4048584

PowerShell, the new Telnet!

As a network engineer, you usually need to prove that ports are open across the network. One troubleshooting step when working with TCP is using Telnet and accessing the remote host with the destination port. This is a quick way to ensure end to end connectivity is allowed, and you can safely update the ticket and send it back to the user for further application layer testing.

The issue is that a lot of servers and hosts don’t run Telnet anymore, as it is an insecure, clear-text protocol and not best practice. Usually the application owner will then request that Telnet be installed so they can do testing, and then probably leave it installed. Eventually a security scan will detect it and it will be removed.

I have found that you don’t need to install Telnet on your servers. If you have access to the PowerShell terminal, you can do the exact same test and it also gives you a better response. The Telnet test usually gives you a blank command line window, which isn’t very user friendly. The PowerShell test gives you better feedback and can be customized to ensure you can test your TCP protocols.

Once you have access to the PowerShell terminal, you can use the following command –

Test-NetConnection x.x.x.x -Port y

x.x.x.x = IP address or domain name

y = Port Number

The response is as follows –

PS C:\Users\ciscoworkerbee> Test-NetConnection google.com -Port 443

ComputerName : google.com
RemoteAddress : 216.58.196.142
RemotePort : 443
InterfaceAlias : Ethernet 2
SourceAddress : 10.10.10.10
PingSucceeded : True
PingReplyDetails (RTT) : 1 ms
TcpTestSucceeded : True

Here is the Microsoft page for more information.

https://docs.microsoft.com/en-us/powershell/module/nettcpip/test-netconnection?view=win10-ps
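
If the host has Python rather than PowerShell, the same idea can be sketched with the standard library. This is only a rough equivalent of the TCP part of the test, and the function name `tcp_port_open` and the three-second timeout are my own choices for illustration, not anything taken from Test-NetConnection.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Calling tcp_port_open("google.com", 443) mirrors the example above: True means the handshake completed, False means it was refused, filtered or timed out.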

Happy troubleshooting.

~Brad.

Real World vs Exams – 4500 Chassis

First, you don’t know what device you are on in the Cisco exam. It’s just a simulator testing your knowledge of the command line, not the specific hardware. I recently had a module failure in a Cisco 4500 chassis, and part of the troubleshooting reminded me of doing a Cisco exam.

Let me explain.

The year was 2020, the month was January and work had just started back for the year. I had logged in and was checking emails and the incident queue when a message appeared in my Google Chat window. In the real world, people don’t log tickets; they just remember you helped them long ago and reach out directly. It doesn’t matter how many times you say Service Desk, they just come straight to you. I won’t say no or ‘log a ticket’, because someone is in need. This person was in the IT department, so if you do him or her a favour then you will get one back. That is the way it should work.

The messages said wireless connectivity was impacted and some desk ports were not working. We also got some messages from our monitoring that some ports had gone offline.

I still had connectivity to the site and after logging in I found the following in the logs –

Feb 3 09:31:06 AEST: %C4K_HWPORTMAN-3-SUPERPORTMACLINKDOWN: Superport Mac link down on Superport 20 on slot 7.

Feb 3 09:31:06 AEST: %C4K_HWPORTMAN-3-SUPERPORTMACLINKDOWN: Superport Mac link down on Superport 21 on slot 7.

Feb 3 09:31:06 AEST: %C4K_HWPORTMAN-3-SUPERPORTMACLINKDOWN: Superport Mac link down on Superport 32 on slot 7.

Feb 3 09:31:06 AEST: %C4K_HWPORTMAN-3-SUPERPORTMACLINKDOWN: Superport Mac link down on Superport 33 on slot 7.

I took a ‘show tech’ and then I power cycled the module, which you can do on a 4500 chassis.

The module came back up and connectivity was restored, for about 12 hours. I was on call that night and got a message about midnight that ports had failed again. I reset the module and then went back to sleep.

By this time, I had logged a case with Cisco to check for bugs. That morning it happened again, although this time when I reset the module it never came back.

So far, this is all real world. TAC case logged, new module now shipped and then we do a swap out (under change control of course).

The new module did not work.

Not the best message to see –

Feb 5 14:55:03 AEST: %C4K_CHASSIS-3-LINECARDSEEPROMREADFAILED: Failed to read module 7’s serial eeprom, try reinserting module

It is now day two and the site has a workaround for people connected to this module. The bugs and cases online point to a possible chassis fault. A new module and a new chassis were sent to site.

Replacing the chassis is not an easy task, but we had remote hands to do this for us as this site was about 1,000 km away from me.

To rule out a fault with the replacement module, we first moved a known working module from slot 1 to slot 7. We got the same message: it failed to read module 7.

It was chassis replacement time.

It took them about an hour to replace the chassis and plug all cables in (the cabling was very neat) and this is where it started to feel more like a Cisco exam.

The site had the 4500 connected to two 3750 switches, with EIGRP neighbors for connectivity. Two /30 Layer 3 interconnects were provisioned between the devices for load balancing and I had connectivity to the 3750 via BGP.

I was on the 3750 switch, and it had been enough time for the chassis to power up and EIGRP neighbors to form, but I saw nothing.

So, check Layer 1 and it’s all good, ports are up. Check layer 2, CDP neighbors and I can see the device, its hostname and IP addressing. Configuration was loaded but I had no EIGRP?

I had connectivity from local site to the 4500, but I could not access the 4500 from the WAN. The default route was not in the 4500, as it was coming from BGP, redistributed to EIGRP and no floating static was used on the local site.

Time to SSH directly to the Layer 3 interconnect IP and see what is going on. I was able to log in with local credentials and found that the EIGRP config was not even present in the configuration. Before continuing to troubleshoot, I needed to get the site operational, so I deployed a floating static (a static route with a high administrative distance). The AD had to be higher than the external EIGRP AD of 170, so that the static would give way to the EIGRP default once EIGRP came back.

On the 3750 I deployed a static route to the site summary address; this was then redistributed into BGP so the WAN could get there. So, for now connectivity restored and all modules were online.
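
For anyone who hasn’t used a floating static before, the selection logic can be sketched in a few lines of Python. This is a toy model, not how IOS is implemented: the `Route` class and `best_route` function are made up for illustration, and the AD of 250 is just an example value above the external EIGRP AD of 170.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Route:
    prefix: str
    source: str
    admin_distance: int


def best_route(candidates: Iterable[Route]) -> Optional[Route]:
    """Pick the candidate with the lowest administrative distance, or None."""
    return min(candidates, key=lambda r: r.admin_distance, default=None)


# External EIGRP has an AD of 170; 250 is an illustrative floating static AD.
eigrp_default = Route("0.0.0.0/0", "EIGRP external", 170)
floating_static = Route("0.0.0.0/0", "floating static", 250)

# With EIGRP up, the EIGRP route wins and the static stays out of the table.
print(best_route([eigrp_default, floating_static]).source)

# With EIGRP missing, the floating static is the only candidate left.
print(best_route([floating_static]).source)
```

The point of the high AD is the first case: once EIGRP is back, the static quietly steps aside without anyone having to remove it.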

So, why didn’t EIGRP show up in the configuration?

I found that, because of the chassis replacement, the serial number had changed and the licence had fallen back to the default licence. The default licence supports no Layer 3 routing protocols, so the switch didn’t load the EIGRP config. The licence file is linked to the chassis serial and I needed a new one!

Once I had the new licence, I applied it and rebooted during an outage window. I then applied the EIGRP config and the external EIGRP default route was placed into the routing table.

So, in summary, the real world is very physical; you must be aware of the hardware and how it operates. The exam world will teach you the commands and theory, but it is real world experience that really makes you a network engineer and can set you apart from others.

In this example, not only did I apply some commands, I also had to deal with onsite engineers, troubleshoot hardware, perform diagnostic testing, liaise with third parties for equipment, organise change windows, log changes, seek approvals, copy current config and status, check and test after replacing equipment, apply licences and document.

There is not enough time in an exam to do all that 🙂

~Brad.

New Study Begins – Interested?

The first new study of the year will be Python. I have found this course below via Reddit which is free and directed at network engineers, which is handy for me.

This should really start the ball rolling for the CCIE Lab exam, as you are expected to automate during the lab.

If you are looking for something free and in IT why not give it a go?

~Brad.

Cisco Certification Updates

So, if you are on my LinkedIn you may have seen a lot of new certifications verified for me via the Acclaim website.

Cisco have now retired two certifications and modified my existing ones, so no more CCDA or CCDP.

I have now been given the following –

CCNA
CCNP Enterprise (CCNP-Enterprise)
Cisco Certified Specialist – Enterprise Advanced Infrastructure Implementation (CCS-EAII)
Cisco Certified Specialist – Enterprise Core (CCS-ECore)
Cisco Certified Specialist – Enterprise Design (CCS-ED)

The best news is that the CCIE Written is no more. I failed that exam three times and now I don’t have to do it anymore. I can go direct to the Cisco CCIE Lab!

This is exciting and also scary at the same time. The CCIE Lab will be updated in April and I think it’s going to take me a year to prepare for this exam. I want to master every subject, but some of the new exam modules might be difficult, like Viptela and SDN.

Very difficult to lab at home.

Stay tuned as I ponder how to attack this challenge.

~Brad.

Disable weak ciphers

Recently I was asked to disable weak ciphers for SSH. I actually assumed I would need some type of code upgrade.

I decided to do a ‘show run | i ssh’ to see if anything was configurable in my switch.

The standard config appeared, enabling the server and so on, but nothing else. I decided to run the ‘show run all | i ssh’ command. This command shows your configuration plus all the default configuration of the switch.

I discovered a command –

ip ssh server algorithm

&

ip ssh client algorithm

You can reapply this command with the cipher you want to disable left out of the list, so to disable 3des-cbc I applied the following:

ip ssh server algorithm encryption aes128-ctr aes192-ctr aes256-ctr aes128-cbc aes192-cbc aes256-cbc

ip ssh client algorithm encryption aes128-ctr aes192-ctr aes256-ctr aes128-cbc aes192-cbc aes256-cbc

I am now waiting for a new scan to be completed to confirm, but pretty confident it has been disabled.
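
If you have a lot of switches to do, building the two commands can be scripted. Here is a small sketch: the function name, the `WEAK_CIPHERS` set and the input list are my own illustration, and the input list stands in for whatever ‘show run all | i ssh’ reports on your platform, so check it against your own output first.

```python
# Ciphers to drop; extend this set to suit your own security policy.
WEAK_CIPHERS = {"3des-cbc"}


def build_encryption_commands(supported, weak=WEAK_CIPHERS):
    """Return the server and client commands with weak ciphers left out."""
    allowed = [c for c in supported if c.lower() not in weak]
    if not allowed:
        # Refuse to build a command that would leave no ciphers at all.
        raise ValueError("no ciphers left after filtering")
    cipher_list = " ".join(allowed)
    return [
        f"ip ssh server algorithm encryption {cipher_list}",
        f"ip ssh client algorithm encryption {cipher_list}",
    ]


# Assumed cipher list for illustration; take the real one from your switch.
current = [
    "aes128-ctr", "aes192-ctr", "aes256-ctr",
    "aes128-cbc", "3des-cbc", "aes192-cbc", "aes256-cbc",
]

for command in build_encryption_commands(current):
    print(command)
```

The order of the remaining ciphers is kept as given, since that order is the preference order the switch offers.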

~Brad.

Embedded Packet Capture

I just love this tool that Cisco provides in new switches, the Embedded Packet Capture.

I was just working on a task with an AV colleague who had inherited some equipment. He didn’t know the IP address but did know the switchport it was patched into. I thought, too easy: I’ll jump on, get the MAC from the port, check the ARP table on the Layer 3 core switch and boom.

It seemed there was a MAC but no ARP entry. I suspected the device hadn’t spoken to its gateway for some time and the entry had timed out. I initiated a broadcast ping, that is, a ping to the broadcast IP of the subnet, to force all hosts to respond.

ping 10.1.1.255 (subnet 10.1.1.0/24)

I got some hits of IPs within the subnet and even some IPs that people have assigned that are not routable within the VLAN as well.
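
For a /24 the broadcast address is easy to do in your head, but for other mask lengths Python’s ipaddress module will work it out for you. A quick sketch, where the function name `directed_broadcast` is just my own label:

```python
import ipaddress


def directed_broadcast(subnet: str) -> str:
    """Return the broadcast address of a subnet given in CIDR notation."""
    network = ipaddress.ip_network(subnet, strict=True)
    return str(network.broadcast_address)


print(directed_broadcast("10.1.1.0/24"))   # 10.1.1.255
print(directed_broadcast("10.1.1.64/26"))  # 10.1.1.127
```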

I tried the ARP command again and saw nothing for the MAC I had. I then decided: let’s reload the device and force it to broadcast to its default gateway and talk with its fellow devices. Still nothing.

The final thought was, ok I guess it has never been given an IP or it has lost its IP.

My colleague suggested he install Wireshark and take a capture from the device. That would take too long, so why not do it on the switch itself!

This EPC feature allows me to take a very quick capture on the port and store it in memory or in a file on the switch. I can then open it on the switch or copy to my local machine.

I ran it for only a few seconds and intercepted 3 packets.

Starting the packet display …….. Press Ctrl + Shift + 6 to exit

1 0.000000 0.0.0.0 -> 255.255.255.255 DHCP 342 DHCP Discover –
2 0.000021 0.0.0.0 -> 255.255.255.255 DHCP 342 DHCP Discover –
3 0.000036 0.0.0.0 -> 255.255.255.255 DHCP 342 DHCP Discover –

Hello Mr DHCP. Looks like it either lost its configuration or has never had it! This subnet is also supposed to be static assignment only.

Before Embedded Packet Capture there was the Port Mirroring feature, and setting it up was one of the first tasks I was assigned in my first enterprise environment.

Copy everything from this port (source) and dump it to this port (destination), attach a laptop with Wireshark and capture. This is still handy for very big captures, for captures of traffic after it has been altered (QoS), or for auditing. EPC should only be used for small, filtered captures so it does not affect the switch in any way.

It was perfect for this scenario.
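
With three packets you can read the display by eye, but if you paste a longer capture display into a text file, a quick tally helps. This is a sketch that assumes the brief one-line-per-packet layout shown above (number, time, source, arrow, destination, protocol, length, info); the regex and the `summarize` name are my own, and other display modes will not match.

```python
import re
from collections import Counter

# number, time, source, "->", destination, protocol, length, info
LINE = re.compile(
    r"^\s*(\d+)\s+([\d.]+)\s+(\S+)\s+->\s+(\S+)\s+(\S+)\s+(\d+)\s+(.*)$"
)


def summarize(lines):
    """Count packets per (source, destination, protocol) tuple."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if match:
            _, _, src, dst, proto, _, _ = match.groups()
            counts[(src, dst, proto)] += 1
    return counts


sample = [
    "1 0.000000 0.0.0.0 -> 255.255.255.255 DHCP 342 DHCP Discover –",
    "2 0.000021 0.0.0.0 -> 255.255.255.255 DHCP 342 DHCP Discover –",
    "3 0.000036 0.0.0.0 -> 255.255.255.255 DHCP 342 DHCP Discover –",
]

for (src, dst, proto), count in summarize(sample).items():
    print(f"{count} x {proto} from {src} to {dst}")
```

A source of 0.0.0.0 talking DHCP to the broadcast address is the giveaway that the device has no IP yet.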

The syntax and documentation are here –

https://www.cisco.com/c/en/us/support/docs/ios-nx-os-software/ios-embedded-packet-capture/116045-productconfig-epc-00.html

~Brad.