Network Automation In The Field

Jeremy Littlejohn

June 15, 2010

NetworkComputing

Has virtual networking met its match? I mentioned last week that network automation holds huge potential for network architects. It gives you control over more than the normal behavior of your environment: it lets you check configurations before they are deployed, monitor changes in the network and take corrective action, and, as in our test, gather important statistics and detailed information and send them to you for further analysis.

I believe the trade press is currently discussing two distinct types of automation, and calling both of them network automation. The first is workflow automation. It is very important, and it is represented by the network's ability to detect and adapt to changes made in other applications, such as when an administrator provisions and starts a new VM on a specific VLAN. At Interop, Force10 showed how its switching platform can automatically detect a VLAN change on a VMware hypervisor through the VMware vSphere API, using Perl running on the Force10 switch. The switch can then change its own VLAN configuration to match. What's interesting in this example is that it removes a set of steps an administrator otherwise has to go through to set up a VM on a particular VLAN. You already defined the VLAN in the VM configuration; the network should figure out the rest. Most other network infrastructure vendors have integrated their switches with VMware's vCenter to perform similar functions.
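
The core of that workflow is a comparison between the VLAN the hypervisor expects on a port and the VLAN the switch actually has configured. A minimal sketch of that comparison follows. It is not Force10's script and it does not call the vSphere API; the function name, the port names, and the two dictionaries standing in for the hypervisor and switch views are all hypothetical.

```python
from typing import Dict, List, Tuple


def plan_vlan_changes(
    hypervisor_vlans: Dict[str, int],
    switch_vlans: Dict[str, int],
) -> List[Tuple[str, int, int]]:
    """Return (port, current_vlan, wanted_vlan) for each port that must change.

    hypervisor_vlans: switch port -> VLAN the attached VM expects (assumed input).
    switch_vlans: switch port -> VLAN currently configured on the switch.
    """
    changes = []
    for port, wanted in sorted(hypervisor_vlans.items()):
        current = switch_vlans.get(port)
        if current is None:
            # The switch does not know this port: skip it rather than guess.
            continue
        if current != wanted:
            changes.append((port, current, wanted))
    return changes


if __name__ == "__main__":
    # Hypothetical data: a VM behind Gi0/1 was moved from VLAN 10 to VLAN 20.
    hypervisor = {"Gi0/1": 20, "Gi0/2": 30}
    switch = {"Gi0/1": 10, "Gi0/2": 30, "Gi0/3": 10}
    for port, old, new in plan_vlan_changes(hypervisor, switch):
        print(f"{port}: VLAN {old} -> {new}")
```

Separating the plan from the act of applying it means the list of changes can be logged or reviewed before anything touches the switch.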

The second type of network automation is behavior that runs without any human interaction. You could call it behavior automation. In workflow automation, a human has to start the workflow, and then the network adapts. Behavior automation takes the human out of the picture. OSPF or BGP route convergence is an example of behavior automation. But those are standards-based behaviors, and while you can tweak them a good bit, there is a limit to their capabilities. Another example is when a switch uses the Cisco Discovery Protocol (CDP) or the IEEE Link Layer Discovery Protocol (LLDP) to discover a VoIP phone and then enables PoE on the switch port, assigns a voice VLAN, and applies QoS policies to the traffic. These are baked-in automation features that you turn on or off as needed, but they only scratch the surface of what you can automate.
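
The phone example boils down to a rule: if the discovered neighbor is a phone, apply the voice settings to the port. A rough sketch of that rule is below. It is an illustration only, not how any switch implements it; the capability string, the profile keys, and the policy names are assumptions made up for the example.

```python
from typing import Dict, Optional, Set, Union

PortProfile = Dict[str, Union[bool, Optional[int], str]]


def port_profile(neighbor_capabilities: Set[str], voice_vlan: int) -> PortProfile:
    """Choose port settings from the capabilities a discovered neighbor advertises.

    "telephone" is a hypothetical label for a neighbor that identifies itself
    as a phone through CDP or LLDP.
    """
    if "telephone" in neighbor_capabilities:
        # Phone found: power the port, assign the voice VLAN, apply voice QoS.
        return {"poe": True, "voice_vlan": voice_vlan, "qos_policy": "voice"}
    # Anything else keeps the default data-port settings.
    return {"poe": False, "voice_vlan": None, "qos_policy": "default"}


if __name__ == "__main__":
    print(port_profile({"telephone", "bridge"}, voice_vlan=110))
    print(port_profile({"router"}, voice_vlan=110))
```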

Scripting is different. With scripting, a switch can be told to watch for specific behaviors in the network or on hosts and then take some action. For example, if a script notices that the bits per second on a link drops below a defined threshold, which may indicate a broken connection, the switch could change to a backup link or take some other corrective action.
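
The decision logic in that example can be sketched in a few lines. The version below requires several consecutive low readings before it would trigger a failover, so that one quiet sample does not flap the link. That debounce is my own assumption rather than anything a vendor prescribes, and the function name and sample values are hypothetical.

```python
from typing import Iterable, Optional


def first_failover_sample(
    bps_samples: Iterable[float],
    threshold_bps: float,
    consecutive: int = 3,
) -> Optional[int]:
    """Return the index of the sample at which failover would trigger, or None.

    Failover triggers once `consecutive` readings in a row fall below
    `threshold_bps`. Any reading at or above the threshold resets the count.
    """
    below = 0
    for index, bps in enumerate(bps_samples):
        below = below + 1 if bps < threshold_bps else 0
        if below >= consecutive:
            return index
    return None


if __name__ == "__main__":
    # Hypothetical readings in bits per second, threshold of 100 bps.
    readings = [900, 50, 800, 40, 30, 20, 700]
    print(first_failover_sample(readings, threshold_bps=100))
```

In a real script the returned index would be replaced by the corrective action itself, such as moving traffic to the backup link.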

Does network automation deliver on all of this? Yes. Today, with modern switches, the ability to write scripts on the switch or on a management station opens up a wealth of opportunities to automate common actions based on well-known events. We dipped our toe in the water by using a script to capture packets when utilization on a specific port reached 75 percent. We used Cisco equipment in our example. The features and functions available to you depend on your switch vendor.
 
The first requirement is to get the scripts for Cisco's Embedded Event Manager (EEM), the feature we tested, configured and installed. There are two pieces to this setup. The first is the environment itself, similar to the operating system on a computer, which amounts to the IOS running on the router. You will need a recent IOS release, and there are some environment variables that need to be configured. The second piece is the script. The scripts are actually TCL (Tool Command Language) files that are loaded into the device's flash and can be referenced in the IOS configuration. You can use pre-canned scripts (which we did for this test) or write your own in TCL. For this blog, I chose an automation scenario that starts a packet capture when network utilization spikes on a specific interface. In my practice, getting packet captures based on events is a pretty common need. Other vendors, like Force10 and Juniper, also have scripting and automation environments that are being built into their equipment. Force10, for example, is taking a broader approach and looking to let engineers develop in many different languages, like PHP or Perl. The language itself will not ultimately decide who wins or loses in the network automation arena. The features and functions that vendors expose to automation will be a deciding factor.

Frankly, the installation is a little difficult. Cisco provides an easy-installer.tcl program that is supposed to set most things up for you. It did not. In fact, I tried two different IOS releases (15.0 and 12.4(24)T) with the most recent versions of all the scripts and had no success with easy-installer. I did succeed in setting everything up manually, and now that it is done, I can say in retrospect that the process could be repeated pretty easily.

For the test, we set up a watchdog that checked the load percentage on a particular interface every 30 seconds. If the load got above 75 percent, the watchdog fired off a TCL script, which we downloaded from Cisco's website, that automatically started a packet capture on that interface. The capture stopped after 30 seconds or when the router ran out of the buffer space we had allocated to it, whichever came first. This worked perfectly. We simulated traffic over the link to push up the load, and about 15 seconds later I had a .pcap file sitting in my TFTP directory to review.
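
The watchdog's rules are simple enough to write down as two small functions. This is a Python sketch of the logic only, not the TCL script we downloaded from Cisco; the function names are hypothetical, and the buffer limit is passed in because it depends on what you allocate.

```python
from typing import List, Optional

POLL_INTERVAL_S = 30     # the watchdog checks the interface every 30 seconds
LOAD_THRESHOLD_PCT = 75  # a capture starts when load rises above 75 percent
CAPTURE_WINDOW_S = 30    # a capture stops after 30 seconds at the latest


def polls_that_trigger(load_samples_pct: List[float]) -> List[int]:
    """Return the index of every poll whose load is strictly above the threshold."""
    return [i for i, load in enumerate(load_samples_pct) if load > LOAD_THRESHOLD_PCT]


def stop_reason(
    elapsed_s: float, buffered_bytes: int, buffer_limit_bytes: int
) -> Optional[str]:
    """Return why a running capture should stop, or None if it should continue."""
    if buffered_bytes >= buffer_limit_bytes:
        return "buffer full"
    if elapsed_s >= CAPTURE_WINDOW_S:
        return "time limit"
    return None


if __name__ == "__main__":
    # Hypothetical load readings, one per poll.
    samples = [20.0, 60.0, 75.0, 82.0, 40.0]
    for i in polls_that_trigger(samples):
        print(f"poll {i} at t={i * POLL_INTERVAL_S}s: start capture")
    print(stop_reason(elapsed_s=12, buffered_bytes=2_000_000, buffer_limit_bytes=1024 * 1024))
```

Note that a reading of exactly 75 percent does not trigger in this sketch, because the test is "above 75 percent".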

There were two main drawbacks. The first was the small maximum buffer size (1024 Kbytes). The second was the export: if you want to know what is eating up 75 percent of the bandwidth on your interface, you don't want the TFTP copy of the .pcap file to take up the remaining 25 percent when it is transferred to your central TFTP server. For the TFTP transfer, you could certainly write a delay into the export TCL script to avoid this. On the buffer size, you can apply filters and limit how much data is stored from each packet, but you may still find it to be a limiting factor.
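
Some quick arithmetic shows how tight that buffer is. The sketch below assumes 1024 Kbytes means 1024 x 1024 bytes, ignores any per-packet overhead the capture adds, and uses a 100 Mbps link purely as an illustration, since the link speed in our test is not part of this write-up. The function names are hypothetical.

```python
BUFFER_BYTES = 1024 * 1024  # assumed meaning of "1024 Kbytes"


def seconds_to_fill(link_bps: float, utilization: float,
                    buffer_bytes: int = BUFFER_BYTES) -> float:
    """Seconds of traffic the buffer holds if every byte on the link is stored."""
    return buffer_bytes * 8 / (link_bps * utilization)


def packets_that_fit(bytes_kept_per_packet: int,
                     buffer_bytes: int = BUFFER_BYTES) -> int:
    """Whole packets that fit when only the first N bytes of each are kept."""
    return buffer_bytes // bytes_kept_per_packet


if __name__ == "__main__":
    link_bps = 100_000_000  # illustrative 100 Mbps link, an assumption
    print(f"fill time at 75% load: {seconds_to_fill(link_bps, 0.75):.3f} s")
    print(f"export time over the other 25%: {seconds_to_fill(link_bps, 0.25):.3f} s")
    print(f"full 1500-byte packets: {packets_that_fit(1500)}")
    print(f"packets truncated to 128 bytes: {packets_that_fit(128)}")
```

Under those assumptions the buffer fills in a fraction of a second at 75 percent load, which is why filtering and truncating each packet matter so much here.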

Overall, the automation worked very well, and our team was impressed. It was a small example that solved a very real problem for us and is just the first step along a path to leveraging automation more in our production networks.
