Cisco Nexus Switches Automation using Ansible

This is the success story of automating configuration changes on 700+ Cisco Nexus and IOS switches in less than 4 hours. I never imagined this could be accomplished, but by using Rundeck, Python and Ansible we made it work.

A PCI vulnerability audit findings report was shared with the network team manager. It showed that the configuration of close to 700 switches needed to be tweaked to meet PCI compliance standards. The network engineers identified the affected switches and rolled up their sleeves to apply the required configuration changes manually. The team estimated that the work would take 3 days.

In the meantime, the network manager reached out to our team to check whether this process could be automated.

Ansible and Python’s Netmiko came to our rescue… we could automate all the required configuration changes in a matter of a few hours. We also used SSHUTTLE to connect to all the switches spread across the globe from a single control node. Using Rundeck, we could push the Python scripts to worker nodes, which executed the configuration commands using the netmiko module.

Here is the simple yet powerful Ansible YAML playbook that was used to implement the configuration changes on the production switches.

---
  - name: Playbook to remediate PCI Audit Findings on Cisco NXOS
    hosts: all
    gather_facts: no
    tasks:

      - name: Configure switch to disable services and console logging
        nxos_config:
          lines:
            - line console
            - exec-timeout 10
            - line vty
            - "{{ tm }}"   # tm is defined in group_vars/nxos.yml
            - no logging console
            - no logging monitor
            - logging logfile messages 6 size 16384
            - logging timestamp milliseconds
            - wr
            - end
          match: none        # push every line without comparing against the running config
          save_when: always  # copy the running config to the startup config
        register: config

      - name: Check output
        debug:
          var: config

nexus.yml – gather_facts is used to fetch device information. In this case we have disabled it to speed up the execution of the playbook. There are two tasks in the above playbook: 1. nxos_config and 2. debug.

nxos_config is a module developed by the core Ansible team that applies configuration changes on Nexus switches. lines – each line is a command which will be executed in configuration terminal mode. match – when match is set to none, the module does not compare the supplied lines with the running configuration on the remote device and pushes all of them. save_when is set to always so that the running config is copied to the startup config. register is the keyword that saves the output of nxos_config to a variable called ‘config’. debug is a module used to display output or messages; here it prints the variable ‘config’, which holds the output of nxos_config, after the task has run.

ansible.cfg – Ansible configuration file.

[defaults]
inventory=inventory
log_path = ansible.log
ansible_debug=true
[persistent_connection]
log_messages = True
command_timeout=60
connect_retry_timeout = 60
[paramiko_connection]
host_key_auto_add = True
#auth_timeout = 300
#timeout = 300

inventory is the file holding the list of switch IPs (or FQDNs if the switches are resolvable through DNS). log_path is the path of the log file that stores the logs of all tasks executed by the above playbook. ansible_debug is set to true, as it is a good practice to enable debug output for any network-related automation (note that the key Ansible documents for this in the [defaults] section is debug; ANSIBLE_DEBUG is the environment variable form). log_messages enables verbose logging of the messages exchanged with the switches. command_timeout and connect_retry_timeout are set to 60 seconds to give more time to reach remotely located devices. host_key_auto_add is set to true to add the switches' SSH host keys automatically, which avoids prompts or SSH connection failures. I’ve commented out auth_timeout and timeout, but if you encounter login delays or failures due to network lag, please uncomment them.
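The settings described above can be read back with Python's standard configparser module as a quick sanity check before a run. This is only a minimal sketch and not part of the original setup: the function name read_timeouts and the inline SAMPLE_CFG string are illustrative assumptions, and in practice you would load your own ansible.cfg from disk.

```python
import configparser

# Illustrative copy of the relevant ansible.cfg sections (assumption: in
# practice this text would come from the real file on the control node).
SAMPLE_CFG = """
[defaults]
inventory=inventory
log_path = ansible.log
[persistent_connection]
log_messages = True
command_timeout=60
connect_retry_timeout = 60
[paramiko_connection]
host_key_auto_add = True
"""


def read_timeouts(cfg_text):
    """Return the persistent-connection timeouts (in seconds) as integers."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    section = parser["persistent_connection"]
    return {
        "command_timeout": section.getint("command_timeout"),
        "connect_retry_timeout": section.getint("connect_retry_timeout"),
    }


timeouts = read_timeouts(SAMPLE_CFG)
print(timeouts)
```

A check like this makes it easy to confirm that the timeouts are really what you expect before pointing the playbook at hundreds of devices.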

group_vars/nxos.yml – The group variables file contains credentials and other critical information. Please use Ansible Vault to encrypt this information.

ansible_connection: local
ansible_network_os: nxos
ansible_user: <username>
ansible_password: <password>
tm: exec-timeout 10

It took about 3 hours to test the playbook. After successful test results, we ran the playbook on the prod switches, which took about an hour to complete! Later we logged on to a few randomly chosen switches to confirm that the configuration changes had been applied. A few switches failed to execute the commands in the playbook due to connectivity or credential errors.
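The spot check described above can also be scripted. The following is a minimal sketch, assuming the running configuration of a switch has already been fetched as plain text (for example with a show command) and that the expected lines appear in it verbatim. The names EXPECTED_LINES and missing_lines are illustrative, not part of the original work.

```python
# Lines taken from the playbook that should be present after remediation.
EXPECTED_LINES = [
    "no logging console",
    "no logging monitor",
    "logging logfile messages 6 size 16384",
    "logging timestamp milliseconds",
]


def missing_lines(running_config, expected=EXPECTED_LINES):
    """Return the expected lines that do not appear in the config text."""
    present = {line.strip() for line in running_config.splitlines()}
    return [line for line in expected if line not in present]


# Hypothetical config excerpt used only to demonstrate the check.
sample_config = """
logging logfile messages 6 size 16384
no logging console
logging timestamp milliseconds
"""

missing = missing_lines(sample_config)
print(missing)
```

An empty list means every expected line was found; anything else names the lines to investigate on that switch.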

SYNTAX:

ansible-playbook nexus.yml --syntax-check {Checks the YAML file syntax}

ansible-playbook nexus.yml -C {Dry run}

ansible-playbook nexus.yml {execute playbook}

Here are the screenshots – Output of ansible-playbook execution

Notice that changed is set to 1 and that unreachable and failed are 0, indicating successful execution.
Switches shown in red failed to apply the config changes due to credential or connectivity issues.
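With hundreds of hosts, picking out the failed switches by eye is tedious. Here is a small sketch that pulls the per-host counters out of the PLAY RECAP text. It assumes the ansible-playbook output was captured as plain text without color codes; the function names and the sample addresses are illustrative.

```python
import re

# Matches recap lines such as "10.0.0.1 : ok=2 changed=1 unreachable=0 failed=0"
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_recap(output):
    """Map each host in the recap to a dict of its integer counters."""
    results = {}
    for line in output.splitlines():
        match = RECAP_LINE.match(line.strip())
        if match:
            pairs = (pair.split("=") for pair in match.group("counters").split())
            results[match.group("host")] = {key: int(value) for key, value in pairs}
    return results


def problem_hosts(results):
    """Return the hosts that reported failed or unreachable tasks."""
    return sorted(
        host for host, counters in results.items()
        if counters.get("failed", 0) or counters.get("unreachable", 0)
    )


# Hypothetical recap text used only for demonstration.
sample_output = """
PLAY RECAP *********************************************************
10.0.0.1                   : ok=2    changed=1    unreachable=0    failed=0
10.0.0.2                   : ok=0    changed=0    unreachable=1    failed=0
"""

recap = parse_recap(sample_output)
print(problem_hosts(recap))
```

The resulting list can be fed back into a retry run once the credential or connectivity problem is fixed.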

SSHUTTLE – Connects the control node, where Rundeck, the Python scripts and Ansible are installed, through the jumphost to the switches in the various subnets.

SYNTAX:

sshuttle -r <username>@<hostname or IP> <Subnet 1 IP> <Subnet 2 IP> <Subnet 3 IP> <Subnet n IP> -x <hostname or IP>

Notice that iptables rules are added automatically to enable SSH connectivity to switches
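When the list of subnets is long, it can help to build the sshuttle command programmatically and validate each subnet first. The sketch below is an illustration under assumptions: the helper name build_sshuttle_command, the user name, the host name and the addresses are all made up, and only the -r and -x options shown in the syntax above are used.

```python
import ipaddress
import shlex


def build_sshuttle_command(jump_user, jump_host, subnets, exclude=None):
    """Return the sshuttle argument list, validating each subnet first."""
    for subnet in subnets:
        # Raises ValueError if the subnet is not valid CIDR notation.
        ipaddress.ip_network(subnet)
    command = ["sshuttle", "-r", "{}@{}".format(jump_user, jump_host)]
    command.extend(subnets)
    if exclude:
        command.extend(["-x", exclude])
    return command


# Hypothetical jumphost and subnets used only for demonstration.
cmd = build_sshuttle_command(
    "netops",
    "jump.example.com",
    ["10.10.0.0/16", "172.16.5.0/24"],
    exclude="192.0.2.10",
)
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run as is, which avoids shell quoting problems.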

Here is the Python script used, along with Rundeck, to automate configuration changes on Cisco IOS switches.

'''
This is a Python script to apply changes to IOS XR switches as per PCI audit remediation.

To run this script, please make sure the switch is reachable on the SSH port.

'''
__author__ = "Vinay Umesh"
__copyright__ = "Copyright 2019, Virtustream, Dell Technologies."
__version__ = "1.0.0"
__maintainer__ = "Core Services Engineering"
__email__ = "vinay.umesh@virtustream.com"
__status__ = "Development"

from netmiko import ConnectHandler  # connect to cisco switches and execute cmds
from datetime import datetime  # Date and time module
import os  # Native OS operations and management
import logging  # Default Python logging module
import argparse  # Pass arguments
import getpass  # get password

# create a log file with system date and time stamp
logfile_ = datetime.now().strftime('nexus_switches_remediation_%H_%M_%d_%m_%Y.log')
date_ = datetime.now().strftime('%H_%M_%d_%m_%Y')


def check_arg(args=None):
    parser = argparse.ArgumentParser(description='Script to amend changes to '
                                     'Nexus switches as per PCI audit remediation')
    parser.add_argument('-s', '--source',
                        help='Source filename in CSV format required',
                        required=True)
    parser.add_argument('-u', '--user',
                        help='Username required', required=True)
    results = parser.parse_args(args)
    return (results.source, results.user)


src, user = check_arg()
# Get password
try:
    pwd = getpass.getpass()
except Exception as error:
    # Without a password there is nothing to connect with, so stop here
    print('ERROR', error)
    raise SystemExit(1)

logger = logging.getLogger('IOSXR_PCI_Audit')
# set logging level
# logging.basicConfig(level=logging.INFO) # Python 2.x syntax
# toggle between DEBUG and INFO to see the difference
logger.setLevel(logging.DEBUG)
# logger.setLevel(logging.INFO)

# create file handler which logs even debug messages

# Create 'logs' folder if not exists. Change the path of logdir as your git folder

logdir = "logs/"
if not os.path.exists(logdir):
    os.makedirs(logdir)

fh = logging.FileHandler(logdir + logfile_)
fh.setLevel(logging.DEBUG)

# create formatter and add it to the handlers
formatter = logging.Formatter('%(asctime)s | %(name)s | %(levelname)s | %(message)s')
fh.setFormatter(formatter)

# add the handlers to the logger
logger.addHandler(fh)

logger.info('IOS XR switch PCI Audit remediation script started @ {}'.format(date_))

# Open the file having list of IPs of Cisco IOS switches
with open(src, 'r') as lines:
    logger.info('Read source file with device details')
    lines = list(lines)  # convert file object to list object
    del lines[0]  # skip the header row
    for line in lines:
        # strip whitespace and the trailing newline from each field
        value = [field.strip() for field in line.split(',')]
        dc = value[0]  # dc
        sw = value[1]  # switch name
        ip = value[2]  # ip
        print('Switch DC: {}'.format(dc))
        print('Switch Name: {}'.format(sw))
        print('Switch IP: {}'.format(ip))
        logger.info(' DC - {}  Switch Name - {} IP - {}'.format(dc, sw, ip))
        un = user
        pw = pwd

        #  Connecting to the switch
        try:
            # Default value is 2 else change to 4
            net_connect = ConnectHandler(device_type='cisco_ios', ip=ip, username=un, password=pw,
                                         global_delay_factor=2)
            #  show version of the switch
            ver = net_connect.send_command("show version")
            logger.info('Switch version details :\n {}'.format(ver))
            print('Show version command executed:\n{}'.format(ver))
            #  Change to config term mode
            net_connect.config_mode()
            #  Configuration config_commands
            config_commands = ['no service ipv4 tcp-small-servers',
                               'no service ipv4 udp-small-servers',
                               'no service ipv6 tcp-small-servers',
                               'no service ipv6 udp-small-servers',
                               'no http server',
                               'no tftp ipv4 server',
                               'no tftp ipv6 server',
                               'no dhcp ipv4',
                               'no dhcp ipv6',
                               'line console exec-timeout 10 0',
                               'logging console disable',
                               'logging monitor disable',
                               'no ipv4 source-route',
                               'end', 'copy running-config startup-config']
            #  Run config commands
            config = net_connect.send_config_set(config_commands)
            config += net_connect.send_command('\n', expect_string=r'#', delay_factor=2)
            logger.info('Switch Configuration Output :\n {}'.format(config))
            print('Config commands executed successfully:\n{}'.format(config))

            #  Show history of commands for IOS XR
            history2 = net_connect.send_command("show history")
            print('Show history for nexus command executed successfully: \n {}'.format(history2))
            logger.info('Switch-Nexus history output :\n {}'.format(history2))

            #  Exit from the switch
            net_connect.disconnect()

        except Exception as e:
            print('Error occurred while connecting to switch - {}  IP - {}: \n {}'.format(sw, ip, e))
            logger.error('Unable to connect to the switch - {} IP - {} :\n {}'.format(sw, ip, e))

SYNTAX:

python3 ios.py -s <inventory> -u <switch username>

Script prompts for password
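The inventory file passed with -s is a CSV with a header row followed by the DC, switch name and IP columns. As an alternative to splitting lines by hand, here is a minimal sketch of reading it with Python's csv module. The function name read_inventory and the sample rows are illustrative assumptions, not the original implementation.

```python
import csv
import io


def read_inventory(handle):
    """Return a list of device dicts from a CSV with a header row."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row
    devices = []
    for row in reader:
        if len(row) < 3:
            continue  # ignore blank or incomplete rows
        dc, name, ip = (field.strip() for field in row[:3])
        devices.append({"dc": dc, "name": name, "ip": ip})
    return devices


# Hypothetical inventory content used only for demonstration.
sample_csv = io.StringIO(
    "dc,switch,ip\n"
    "DC1,sw-core-01,10.0.0.1\n"
    "\n"
    "DC2,sw-edge-07, 10.0.1.7\n"
)

devices = read_inventory(sample_csv)
for device in devices:
    print(device["dc"], device["name"], device["ip"])
```

With a real file, the same function can be called as read_inventory(open(path, newline='')), ideally inside a with block.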

We can use either Ansible or Python with Netmiko to automate configuration changes on Cisco switches.

One of the SMEs on the Network Engineering team reacted to this automation as shown below. I’m glad to see that folks are embracing AUTOMATION.

I hope this use case helps you understand how to automate Cisco switch configuration and operational tasks. Please leave your feedback if you found this blog useful, and share your suggestions in the Comments section below.

Image Courtesy and References:

https://blogs.cisco.com/datacenter/ansible-support-for-ucs-and-nexus

https://docs.ansible.com/ansible/latest/modules/nxos_config_module.html

https://docs.ansible.com/ansible/2.3/ios_config_module.html
