Tag: Automation

Cisco Nexus Switches Automation using Ansible

This is the success story of automating configuration changes on 700+ Cisco Nexus and IOS switches in less than 4 hours. I never imagined this could be accomplished, but by using Rundeck, Python and Ansible we made it happen.

A PCI vulnerability audit findings report shared with the Network team manager showed that the configuration of 700+ switches needed to be tweaked to meet PCI compliance standards. The network engineers identified the affected switches and rolled up their sleeves to apply the required configuration changes manually. The team gave an estimate of 3 days.

In the meantime, the network manager reached out to our team to check whether this process could be automated.

Ansible and Python’s Netmiko came to our rescue… we could automate all the required configuration changes in a matter of a few hours. Not to forget, we used SSHUTTLE to connect to all the switches spread across the globe from a single control node. Using Rundeck, we could push the Python scripts to worker nodes to execute the configuration commands using the netmiko module.

Here is the simple yet powerful Ansible playbook (a YAML file) which was used to implement the configuration changes on the production switches.

---
  - name: Playbook to remediate PCI Audit Findings on Cisco NXOS
    hosts: all
    gather_facts: no
    tasks:

      - name: Configure switch to disable services and console logging
        nxos_config:
          lines:
            - line console 
            - exec-timeout 10
            - line vty
            - "{{ tm }}"
            - no logging console
            - no logging monitor
            - logging logfile messages 6 size 16384
            - logging timestamp milliseconds
            - wr
            - end
          match: none
          save_when: always
        register: config
        
      - name: Check output
        debug:
          var: config   

nexus.yml – gather_facts is used to fetch device information. In this case we have disabled it to speed up the execution of the playbook. There are two tasks in the above playbook: 1. nxos_config and 2. debug.

nxos_config is the module developed by the core Ansible team which applies configuration changes on Nexus switches. lines – each line is a command which will be executed in configuration terminal mode. match – if match is set to none, the module will not attempt to compare the source configuration with the running configuration on the remote device. save_when is set to always so that the running config is copied to the startup config on every run. register is the keyword that saves the output of nxos_config to a variable called ‘config’. debug is a module used to display output or messages. The variable ‘config’ holds the output of nxos_config, which is shown after playbook execution.

ansible.cfg – Ansible configuration file.

[defaults]
inventory = inventory
log_path = ansible.log
debug = True
[persistent_connection]
log_messages = True
command_timeout = 60
connect_retry_timeout = 60
[paramiko_connection]
host_key_auto_add = True
#auth_timeout = 300
#timeout = 300

inventory is the file holding the list of switch IPs (or FQDNs if the switches are resolvable via DNS). log_path is the path of the log file that stores all logs of the tasks executed by the above playbook. debug is set to true (the ini key in the [defaults] section is debug; ANSIBLE_DEBUG is the matching environment variable), and it's a best practice to enable this for any network-related automation. log_messages enables verbose logging of the messages exchanged with the switches. command_timeout and connect_retry_timeout are raised to 60 seconds to give more time to reach remotely located devices. host_key_auto_add is set to true to automatically accept the SSH host keys, which avoids prompts or SSH connection failures. I’ve commented out auth_timeout and timeout, but if you encounter login delays or failures due to network lag, please uncomment them.

group_vars/nxos.yml – Group variables file containing credentials and other critical information. Please use Ansible Vault to encrypt this information.

ansible_connection: local
ansible_network_os: nxos
ansible_user: <username>
ansible_password: <password>
tm: exec-timeout 10

It took about 3 hours to test the playbook. After successful test results, we ran the playbook on the prod switches, which took about an hour to complete!! Later we logged on to a few randomly chosen switches to confirm that the configuration changes had been applied. A few switches failed to execute the commands in the playbook due to connectivity or credential errors.
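The spot check described above could also be scripted. The following is only a minimal sketch, assuming the running configuration of a switch has already been captured as text (for example from "show running-config"). The function name and the sample excerpt are hypothetical, and a real device may render some settings differently from how they were typed.

```python
def missing_config_lines(running_config, expected_lines):
    """Return the expected lines that do not appear in the running config."""
    present = {line.strip() for line in running_config.splitlines()}
    return [line for line in expected_lines if line.strip() not in present]


# Hypothetical excerpt of a running configuration, for illustration only.
sample_running_config = """
no logging console
no logging monitor
logging timestamp milliseconds
"""

# Logging-related lines taken from the playbook above.
expected = [
    "no logging console",
    "no logging monitor",
    "logging logfile messages 6 size 16384",
    "logging timestamp milliseconds",
]

# Prints the lines that still need attention on this switch.
print(missing_config_lines(sample_running_config, expected))
```

An empty list means every expected line was found, so only switches with a non-empty result need a manual look.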

SYNTAX:

ansible-playbook nexus.yml --syntax-check {Checks the YAML file syntax}

ansible-playbook nexus.yml -C {Dry run}

ansible-playbook nexus.yml {execute playbook}

Here are the screenshots – Output of ansible-playbook execution

Notice that changed is set to 1 and unreachable and failed are 0, indicating successful execution.
Switches shown in red failed to apply the config changes due to credential or connectivity issues.
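With 700+ switches, picking the failed hosts out of the recap by eye gets tedious. Here is a rough sketch of how the PLAY RECAP section could be parsed to list the hosts that need a retry. It assumes the recap was saved as plain text without colour codes; the function name and sample IPs are made up for illustration.

```python
import re

# Matches recap lines such as "10.10.1.11 : ok=2 changed=1 unreachable=0 failed=0".
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def failed_hosts(recap_text):
    """Return hosts whose recap line shows unreachable or failed tasks."""
    hosts = []
    for line in recap_text.splitlines():
        match = RECAP_LINE.match(line.strip())
        if not match:
            continue  # not a per-host recap line
        counters = dict(pair.split("=") for pair in match.group("counters").split())
        if int(counters.get("unreachable", 0)) or int(counters.get("failed", 0)):
            hosts.append(match.group("host"))
    return hosts


# Hypothetical recap text, for illustration only.
sample_recap = """
PLAY RECAP *********************************************************************
10.10.1.11                 : ok=2    changed=1    unreachable=0    failed=0
10.10.1.12                 : ok=0    changed=0    unreachable=1    failed=0
10.10.1.13                 : ok=1    changed=0    unreachable=0    failed=1
"""

print(failed_hosts(sample_recap))
```

The resulting list can be written to a separate inventory file so the playbook is re-run only against those switches once the connectivity or credential problem is fixed.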

SSHUTTLE – Connects from the control node (where Rundeck, the Python scripts and Ansible are installed) through the jumphost to the various subnets in which the switches reside.

SYNTAX:

sshuttle -r <username>@<hostname or IP> <Subnet 1 IP> <Subnet 2 IP> <Subnet 3 IP> <Subnet n IP> -x <hostname or IP>

Notice that iptables rules are added automatically to enable SSH connectivity to switches

Here is the Python script used along with Rundeck to automate configuration changes on Cisco IOS switches.

'''
This is Python script to amend changes to IOS XR switches as per PCI audit remediation

To run this script please make sure nexus switch is reachable via SSH port

'''
__author__ = "Vinay Umesh"
__copyright__ = "Copyright 2019, Virtustream, Dell Technologies."
__version__ = "1.0.0"
__maintainer__ = "Core Services Engineering"
__email__ = "vinay.umesh@virtustream.com"
__status__ = "Development"

from netmiko import ConnectHandler  # connect to cisco switches and execute cmds
from datetime import datetime  # Date and time module
import os  # Native OS operations and management
import logging  # Default Python logging module
import argparse  # Pass arguments
import getpass  # get password

# create a log file with system date and time stamp
logfile_ = datetime.now().strftime('nexus_switches_remediation_%H_%M_%d_%m_%Y.log')
date_ = datetime.now().strftime('%H_%M_%d_%m_%Y')


def check_arg(args=None):
    parser = argparse.ArgumentParser(description='Script to amend changes to  \
                                     Nexus switches as per PCI audit remediation')
    parser.add_argument('-s', '--source',
                        help='Source filename in CSV format required',
                        required=True)  # boolean, not the string 'True'
    parser.add_argument('-u', '--user',
                        help='Username required', required=True)
    results = parser.parse_args(args)
    return (results.source, results.user)


src, user = check_arg()
# Get password
try:
    pwd = getpass.getpass()
except Exception as error:
    print('ERROR', error)
    raise SystemExit(1)  # cannot continue without a password

logger = logging.getLogger('IOSXR_PCI_Audit')
# set logging level
# logging.basicConfig(level=logging.INFO) # Python 2.x syntax
# toggle between DEBUG and INFO to see the difference
logger.setLevel(logging.DEBUG)
# logger.setLevel(logging.INFO)

# create file handler which logs even debug messages

# Create 'logs' folder if not exists. Change the path of logdir as your git folder

logdir = "logs/"
if not os.path.exists(logdir):
    os.makedirs(logdir)

fh = logging.FileHandler(logdir + logfile_)
fh.setLevel(logging.DEBUG)

# create formatter and add it to the handlers
formatter = logging.Formatter('%(asctime)s | %(name)s | %(levelname)s | %(message)s')
fh.setFormatter(formatter)

# add the handlers to the logger
logger.addHandler(fh)

logger.info('IOS XR switch PCI Audit remediation script started @ {}'.format(date_))

# Open the file having list of IPs of Cisco IOS switches
with open(src, 'r') as lines:
    logger.info('Read source file with device details')
    lines = list(lines)  # convert file object to list object
    del(lines[0])  # skip the header row
    for line in lines:
        value = line.strip().split(',')  # strip the trailing newline first
        dc = value[0]  # dc
        sw = value[1]  # switch name
        ip = value[2]  # ip
        print('Switch DC: {}'.format(dc))
        print('Switch Name: {}'.format(sw))
        print('Switch IP: {}'.format(ip))
        logger.info(' DC - {}  Switch Name - {} IP - {}'.format(dc, sw, ip))
        un = user
        pw = pwd

        #  Connecting to the switch
        try:
            # Default value is 2 else change to 4
            net_connect = ConnectHandler(device_type='cisco_ios', ip=ip, username=un, password=pw,
                                         global_delay_factor=2)
            #  show version of the switch
            ver = net_connect.send_command("show version")
            logger.info('Switch version details :\n {}'.format(ver))
            print('Show version command executed:\n{}'.format(ver))
            #  Change to config term mode
            net_connect.config_mode()
            #  Configuration config_commands
            config_commands = ['no service ipv4 tcp-small-servers',
                               'no service ipv4 udp-small-servers',
                               'no service ipv6 tcp-small-servers',
                               'no service ipv6 udp-small-servers',
                               'no http server',
                               'no tftp ipv4 server',
                               'no tftp ipv6 server',
                               'no dhcp ipv4',
                               'no dhcp ipv6',
                               'line console exec-timeout 10 0',
                               'logging console disable',
                               'logging monitor disable',
                               'no ipv4 source-route',
                               'end', 'copy running-config startup-config']
            #  Run config commands
            config = net_connect.send_config_set(config_commands)
            config += net_connect.send_command('\n', expect_string=r'#', delay_factor=2)
            logger.info('Switch Configuration Output :\n {}'.format(config))
            print('Config commands executed successfully:\n{}'.format(config))

            #  Show history of commands for IOS XR
            history2 = net_connect.send_command("show history")
            print('Show history for nexus command executed successfully: \n {}'.format(history2))
            logger.info('Switch-Nexus history output :\n {}'.format(history2))

            #  Exit from the switch
            net_connect.disconnect()

        except Exception as e:
            print('Error occurred while connecting to switch - {}  IP - {}: \n {}'.format(sw, ip, e))
            # log failures at ERROR level so they stand out in the log file
            logger.error('Unable to connect to the switch - {} IP - {} :\n {}'.format(sw, ip, e))

SYNTAX:

python3 ios.py -s <inventory> -u <switch username>

Script prompts for password
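The source file passed with -s is a CSV with a header row followed by DC, switch name and IP columns. As a rough sketch (not the code used in the script above), the same file could be read with Python's csv module, which also copes with blank lines and stray whitespace. The function name and sample rows are hypothetical.

```python
import csv
import io


def read_inventory(csv_text):
    """Parse 'dc,switch name,ip' rows, skipping the header and short rows."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    devices = []
    for row in reader:
        if len(row) < 3:
            continue  # blank or incomplete line
        dc, name, ip = (field.strip() for field in row[:3])
        devices.append({"dc": dc, "name": name, "ip": ip})
    return devices


# Hypothetical inventory content, for illustration only.
sample = "dc,switch,ip\nDC1,sw-core-01,10.10.1.11\nDC2,sw-edge-07, 10.20.4.7 \n\n"

for device in read_inventory(sample):
    print(device["dc"], device["name"], device["ip"])
```

Each dictionary can then be handed to the connection loop in place of the manual line.split(',') handling.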

We can use either Ansible or Python’s Netmiko to automate configuration changes on Cisco switches.

One of the SMEs of the Network Engineering team reacted to this automation as shown below. I’m glad to see that folks are embracing AUTOMATION.

Hope this use case helps you understand how to automate Cisco switch configuration and operational tasks. Please leave your feedback if you found this blog useful and share suggestions in the Comments section below.

Image Courtesy and References:

https://blogs.cisco.com/datacenter/ansible-support-for-ucs-and-nexus

https://docs.ansible.com/ansible/latest/modules/nxos_config_module.html

https://docs.ansible.com/ansible/2.3/ios_config_module.html

Automate Cisco switches config using NETMIKO

Netmiko is a simplified module for managing network devices via SSH, developed by Kirk Byers. I would recommend that all network engineers learn this module and use it to automate day-to-day configuration tasks.


I’ve written a demo script to disable the SNMP server’s “globalenforcepriv” setting on both Cisco IOS and Nexus switches. Using this script we could make the necessary changes on hundreds of switches in a few minutes, saving the network engineers’ time and energy.

Al Sweigart wrote a book called ‘Automate the Boring Stuff with Python’. This is one of the scripts which could help network engineers automate monotonous tasks using Python and the Netmiko module.


EMC VMAX Storage Automated Performance Report

This blog is being written as a companion to my previous blog on Automated EMC VMAX Capacity Reporting


Recently, we were asked to develop scripts to capture performance metrics from EMC VMAX storage. There is a ‘symstat’ command with many options to capture performance metrics from the array, but this command did not fulfil all our requirements. After exploring various options and consulting EMC support and the community, we decided to try the Unisphere REST API.

So far I had been using Perl as THE LANGUAGE to talk to my storage arrays, but I was forced to switch over to Python, which works best with REST API / JSON. Additionally, there is a lot of REST API code out there written in Python, so it is easy to ‘get inspired’ by it and write customized code for our requirements. So this makes me yet another ‘Pythonista’ 🙂

This is my first ever Python script (version 2.7 on Debian GNU/Linux) to capture EMC VMAX performance metrics retrieved from Unisphere for VMAX (version 8.2) via the REST API. I referred to this Python script to develop a custom script to suit our requirements. Many thanks to Matt Cowger (mcowger) for sharing the script on GitHub.

There are plenty of metrics that can be captured using this script, but I’ve written simple code for demo purposes that prints a few metrics in CSV format, which can either be imported into Excel for further reporting / charting or inserted into a MySQL DB for further analysis…

Here is the sample, cropped output for reference. In the table below, the timestamp (column B) is in epoch format, which is converted to MySQL datetime format via the INSERT query.

[Screenshot: sample CSV output opened in Excel]
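The epoch-to-datetime conversion mentioned above is done in the INSERT query, but it could just as well be done in Python before the CSV is written. This is a small sketch of that alternative. The function name is hypothetical, and whether the timestamps arrive in seconds or milliseconds is an assumption to check against the actual REST API output.

```python
from datetime import datetime, timezone


def epoch_to_mysql_datetime(epoch, in_milliseconds=False):
    """Format an epoch timestamp as a MySQL DATETIME string in UTC."""
    seconds = epoch / 1000 if in_milliseconds else epoch
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


print(epoch_to_mysql_datetime(0))                               # epoch in seconds
print(epoch_to_mysql_datetime(1478298358000, in_milliseconds=True))  # epoch in milliseconds
```

The result is in UTC; if the reports should show local time, the tz argument would need to be adjusted accordingly.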

 

P.S: I’ve changed the VMAX serial number for various reasons 🙂

If interested, please reach out to me to get these Python scripts.

Image Courtesy: https://www.emc.com

References: https://github.com/mcowger/randompython/blob/master/symmREST.py

Thanks for stopping by… Please leave your comments / suggestions.

EMC VMAX Storage Automated Capacity Report

EMC VMAX3

Storage capacity reporting becomes a tedious task these days unless we have tools in place. These tools are not as cheap as we might think; additional layers are added to them to hype them and increase their market value. But if a customer asks a storage admin to run the report manually, it’s a nightmare for the admin.

We always want to go the easy way, which means the GUI / EMC Unisphere, but for reporting this doesn’t help to customize the output and fulfil customer requirements. EMC SYMCLI / Solutions Enabler can be used effectively to address this issue: we can use SYMCLI to automate these manual reports. By means of automation, we can ensure that quality, timely and error-free reports are generated, and they can be scheduled via cron or Task Scheduler and sent directly to the stakeholders. SYMCLI has built-in support for XML-formatted command output, which makes it easier to parse the information using any programming language with XML support.

I’ve developed Perl scripts to generate Storage Pool capacity and Disk capacity reports for multiple VMAX storage arrays which are managed from the SYMCLI / Solutions Enabler server. Following are the benefits of using these scripts.
1. Generates Pool capacity report for all storage arrays which has columns – SYM ID,MODEL,TOTAL USABLE POOL CAPACITY TB,TOTAL USED POOL CAPACITY TB,TOTAL FREE POOL CAPACITY TB, TOTAL POOL UTILIZATION %, TOTAL POOL SUBSCRIBED %
2. Generates Disk capacity report for all storage arrays which has columns – SYM ID,MODEL,TOTAL EFD,TOTAL FC_SAS,TOTAL SATA_NLSAS,FORMATTED EFD CAPACITY GB,FORMATTED FC_SAS CAPACITY GB,FORMATTED SATA_NLSAS CAPACITY GB, TOTAL FORMATTED CAPACITY GB, UNFORMATTED EFD CAPACITY GB,UNFORMATTED FC_SAS CAPACITY GB,UNFORMATTED SATA_NLSAS CAPACITY GB,TOTAL UNCONFIGURED CAPACITY GB,TOTAL UNFORMATTED CAPACITY GB
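To give an idea of how the derived columns of the pool capacity report fit together, here is a small sketch in Python (the actual scripts are written in Perl and are not shown here). It assumes usable and used capacity in TB have already been extracted from the SYMCLI output. The function name, serial number and figures are made up, and the subscribed % column is left out because it needs the subscribed capacity as an extra input.

```python
import csv
import io

HEADER = ["SYM ID", "MODEL", "TOTAL USABLE POOL CAPACITY TB",
          "TOTAL USED POOL CAPACITY TB", "TOTAL FREE POOL CAPACITY TB",
          "TOTAL POOL UTILIZATION %"]


def pool_capacity_row(sym_id, model, usable_tb, used_tb):
    """Derive free capacity and utilization from usable and used capacity."""
    free_tb = usable_tb - used_tb
    utilization = (used_tb / usable_tb * 100) if usable_tb else 0.0
    return [sym_id, model, f"{usable_tb:.2f}", f"{used_tb:.2f}",
            f"{free_tb:.2f}", f"{utilization:.1f}"]


# Hypothetical array and figures, for illustration only.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(HEADER)
writer.writerow(pool_capacity_row("000000000001", "VMAX", 100.0, 25.0))
print(buffer.getvalue())
```

One such row per array, written to a single CSV file, gives the pool capacity report in the shape described above.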

Output generated from the scripts is in CSV format. This information can be plugged into the desired format to generate reports with pivots, charts etc…

Please find below sample output for reference.

Pool Capacity Report:

2016-02-17 21_21_57-Book1 - Excel

Disk Capacity Report:

2016-02-17 21_22_41-Book1 - Excel

P.S: I’ve removed the VMAX serial number for various reasons 🙂

If interested, please reach out to me to get these PERL scripts.

Image Courtesy: http://www.storagereview.com and http://www.emc.com