Building a Culture of Automation in Network Operations

A cultural shift towards automation can improve operational efficiency, reduce mean time to recovery, and reduce the risk of service delivery issues.

Network Operations teams are caught between a rock and a hard place. They're tasked with preventing network outages, implementing security policies, and maintaining delivery of many important applications and services, all while staying on top of troubleshooting. To make the challenge worse, NetOps processes have changed very little in the last 30 years, while networks have changed a lot. The rise of software as a service, remote work, and video meetings, along with the widespread use of IoT devices, virtualization, and software-defined networking, has made networks more complex and the job of Network Operations more difficult. Meanwhile, NetOps budgets and teams have largely stayed the same size.

Old processes and new networks

Traditional NetOps processes are no longer sufficient for modern networks. These processes are still quite manual: according to the Gartner 2023 Market Guide for Network Automation Platforms, 65% of enterprise network activities were still done manually in 2023. IT spend is increasing at many companies, but it’s typically going to technology, not operations. It’s a scale problem. Networks have gotten exponentially more complex, but adding more NetOps engineers (which only some companies can afford to do) increases a team's capacity only linearly. Old, manual processes cannot scale to match the network.

In some cases, manual processes are simply no longer sufficient. Many of our customers report that network diagrams or site maps will go out of date within weeks or days. Creating accurate documentation is not possible without automated help. 

The cost of a mistake remains high. A report from 2023 calculated the cost of IT downtime to be $5,600 per minute and from $145,000 to $450,000 per hour, depending on company size. Another study found that the median cost of an IT outage with a high business impact was $7.75 million.

A culture shift towards automation

Updating old processes can help NetOps solve the challenges explained above, which translates to fewer outages, better network performance, and better security. Many of these processes can be automated, but this requires a culture shift for NetOps teams that are used to doing things manually.

Many network engineers are skeptical of automation because, in the past, it was mediocre at best. Developers needed both scripting knowledge and a great deal of networking experience. This required either a rare (and often expensive) network engineer who knew Python or a team of engineers and developers working together. Automation projects took a long time and could only solve problems where all the parameters stayed static. Because of this, the work often ended up being greater than the benefit.

But automation is valuable because it solves the core problem of scale. Every enterprise has an SME who knows how to fix almost any network issue - the knowledge is already there. What they need is a way to share that knowledge with whoever needs it whenever they need it across the entire network for similar problems. Recent developments have made low-code and no-code network automation possible. This helps solve the scale problem (the relevant SME can design a script and then any engineer can run it) and allows network experts without coding skills to build automations themselves, rather than working with developers. This avoids many of the historical problems with automation.

Automation helps NetOps in several ways, such as:

1) Speeding up troubleshooting by sharing knowledge via automation. This saves time for the more experienced engineers and frees them up to spend more time on significant issues or improving the business rather than putting out fires.

2) Preventing configuration drift with regular assessments. Checking router configurations, switch port access, failover readiness, ACLs, and other configurations helps catch issues across the network before they can cause outages. NetOps should do this regularly, but in reality, they don't have time. Automated checks for all these configurations, however, can be scheduled and run daily or even hourly.

3) Reducing human errors during network changes. Uptime Intelligence found that 45% of all outages have a root cause in configuration and change management, and that human error plays a role in up to 80% of all data center outages. Automated verifications can check the network before and after a change to make sure no mistakes that could impact critical applications slipped through.

Ultimately, this means fewer network outages and lower IT costs for the business.
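To make the second and third points above concrete, here is a minimal sketch in Python of the idea behind both drift checks and before-and-after change verification: compare two snapshots of network state and report what differs. The snapshot format (a flat dictionary of setting names to values), the function names, and the sample device settings are all assumptions made for illustration. A real deployment would collect this state from the devices themselves, and nothing here reflects any particular vendor's tool.

```python
from typing import Dict, List, Optional, Set, Tuple

Snapshot = Dict[str, str]


def diff_state(before: Snapshot, after: Snapshot) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {setting: (old, new)} for every setting that differs."""
    changes = {}
    for key in sorted(set(before) | set(after)):
        old, new = before.get(key), after.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


def unexpected_changes(before: Snapshot, after: Snapshot, expected: Set[str]) -> List[str]:
    """Settings that changed but were not part of the planned change."""
    return [key for key in diff_state(before, after) if key not in expected]


# Hypothetical snapshots; the keys and values are illustrative only.
baseline = {
    "Gi0/1.vlan": "10",
    "Gi0/2.vlan": "20",
    "acl.MGMT": "permit 10.0.0.0/24",
}

# Drift check: compare today's state against the approved baseline.
current = dict(baseline, **{"Gi0/2.vlan": "30"})
drift = diff_state(baseline, current)
print("drift:", drift)

# Change verification: only Gi0/1.vlan was supposed to change.
post_change = dict(baseline, **{"Gi0/1.vlan": "15", "acl.MGMT": "permit any"})
surprises = unexpected_changes(baseline, post_change, expected={"Gi0/1.vlan"})
print("unexpected:", surprises)
```

Run on a schedule, the first comparison flags drift from the baseline. Run immediately before and after a maintenance window, the second flags anything that changed outside the plan.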

How does an organization make this cultural shift towards automating? Here are a few ways to encourage it. The NetOps team should think about how to share their experience across the organization for scale and consistency. Whenever possible, shift the heavy lifting of tasks from people to machines to encourage people to think about automating first. To get more engineers and management on board, show off the early results of successful automation projects.  

A final word on network automation

Automated network assessments tend to produce helpful results with the least work, so lean into them whenever possible. Finally, look for partners on other teams like Cloud Operations, Security Operations, or Network Tools Advisors to work with. Tasks that they need the networking team to do for them, like looking up the location and IP address for devices involved in security investigations, are usually good candidates for automation.    
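As an illustration of the device lookup task mentioned above, the following Python sketch maps an IP address from a security investigation to a site by finding the most specific matching subnet. The subnet-to-site table, the site names, and the locate function are hypothetical. In practice this data would come from an IPAM system or a discovered network inventory rather than a hard-coded dictionary.

```python
import ipaddress
from typing import Optional

# Hypothetical inventory: subnet -> site. Illustrative values only.
SITE_SUBNETS = {
    "10.1.0.0/16": "New York HQ",
    "10.2.0.0/16": "London office",
    "10.2.40.0/24": "London data center",
}


def locate(ip: str) -> Optional[str]:
    """Return the site of the most specific subnet containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    best = None  # (network, site) of the longest matching prefix so far
    for cidr, site in SITE_SUBNETS.items():
        net = ipaddress.ip_network(cidr)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, site)
    return best[1] if best else None


for suspect in ("10.2.40.7", "10.2.3.4", "192.168.1.1"):
    print(suspect, "->", locate(suspect))
```

Exposing a lookup like this as a self-service tool lets a security analyst answer the question without opening a ticket with the networking team.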

Overall, a cultural shift towards automation can improve operational efficiency, reduce mean time to recovery, and reduce the risk of service delivery issues. It does this by scaling up NetOps processes without scaling up the staff. I’ve seen one large corporation save over 16,000 hours per year with network automation.

About the Author

Song Pang, SVP of Engineering, NetBrain

Song Pang is the SVP of Engineering at hybrid network automation and visibility company NetBrain, responsible for Pre-Sales, Professional Services, Technical Support, and Customer Success. He has been at NetBrain for almost ten years in a variety of customer support and engineering roles and formerly was an analyst at Stroud International. Pang has a B.S. in Electrical and Computer Engineering from Cornell University.
