Category: Rundeck

Cisco Nexus Switches Automation using Ansible

This is the success story of automating configuration changes on 700+ Cisco Nexus and IOS switches in less than 4 hours. I never imagined this could be accomplished, but with Rundeck, Python and Ansible we pulled it off.

A PCI vulnerability audit findings report shared with the network team's manager showed that more than 700 switches needed configuration changes to meet PCI compliance standards. The network engineers identified the affected switches and rolled up their sleeves to apply the required changes manually. The team estimated that this would take 3 days.

In the meantime, the network manager reached out to our team to check whether this process could be automated.

Ansible and Python’s Netmiko came to our rescue: we automated all the required configuration changes in a matter of a few hours. We also used sshuttle to connect to all the switches spread across the globe from a single control node. Using Rundeck, we pushed the Python scripts to worker nodes, which executed the configuration commands using the netmiko module.

Here is the simple yet powerful Ansible playbook (a YAML file) that was used to implement the configuration changes on the production switches.

---
  - name: Playbook to remediate PCI Audit Findings on Cisco NXOS
    hosts: all
    gather_facts: no
    tasks:

      - name: Configure switch to disable services and console logging
        nxos_config:
          lines:
            - line console 
            - exec-timeout 10
            - line vty
            - "{{ tm }}"
            - no logging console
            - no logging monitor
            - logging logfile messages 6 size 16384
            - logging timestamp milliseconds
            - wr
            - end
          match: none
          save_when: always
        register: config
        
      - name: Check output
        debug:
          var: config   

nexus.yml – gather_facts is used to fetch device information. In this case we have disabled it to speed up playbook execution. The playbook above has two tasks: 1. nxos_config and 2. debug.

nxos_config is the module, developed by the core Ansible team, that applies configuration changes on Nexus switches. lines – each line is a command that is executed in configuration terminal mode. match – if match is set to none, the module does not attempt to compare the source configuration with the running configuration on the remote device. save_when is set to always so that the running config is copied to the startup config. register is the keyword that saves the output of nxos_config to a variable called ‘config’. debug is a module used to display output or messages; here it shows the contents of ‘config’ after the playbook runs.

ansible.cfg – the Ansible configuration file.

[defaults]
inventory=inventory
log_path = ansible.log
ansible_debug=true
[persistent_connection]
log_messages = True
command_timeout=60
connect_retry_timeout = 60
[paramiko_connection]
host_key_auto_add = True
#auth_timeout = 300
#timeout = 300

inventory is the file containing the list of switch IPs (or FQDNs if the switches are resolvable through DNS). log_path is the path of the log file that stores the logs of all tasks executed by the playbook. ansible_debug is set to true; we consider enabling it good practice for any network-related automation. log_messages enables verbose logging of the messages exchanged with the switches. command_timeout and connect_retry_timeout are needed to give Ansible more time to reach remotely located devices. host_key_auto_add is set to True so that host keys are added automatically, which avoids prompts or SSH connection failures. I’ve commented out auth_timeout and timeout, but if you encounter login delays or failures due to network lag, please uncomment them.
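Before a long run it can help to confirm that the timeouts Ansible will pick up are the ones you intended. The following is a minimal sketch using Python's standard configparser; the function name read_timeouts and the embedded sample text are illustrative assumptions, not part of the original setup.

```python
import configparser

# Trimmed copy of the ansible.cfg shown above, used here only as sample input.
SAMPLE_CFG = """
[defaults]
inventory=inventory
log_path = ansible.log
[persistent_connection]
log_messages = True
command_timeout=60
connect_retry_timeout = 60
"""


def read_timeouts(cfg_text):
    """Return the persistent-connection timeouts (in seconds) from ansible.cfg text."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    section = parser["persistent_connection"]
    return {
        "command_timeout": section.getint("command_timeout"),
        "connect_retry_timeout": section.getint("connect_retry_timeout"),
    }


if __name__ == "__main__":
    print(read_timeouts(SAMPLE_CFG))
```

To check a real file, read its contents with open("ansible.cfg").read() and pass the text to read_timeouts.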

group_vars/nxos.yml – the group variables file contains credentials and other sensitive information. Please use Ansible Vault to encrypt this information.

ansible_connection: local
ansible_network_os: nxos
ansible_user: <username>
ansible_password: <password>
tm: exec-timeout 10

It took about 3 hours to test the playbook. After successful test results, we ran the playbook on the production switches, which took about an hour to complete! Later we logged on to a few randomly chosen switches to confirm that the configuration changes had been applied. A few switches failed to execute the commands in the playbook due to connectivity or credential errors.
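The random spot check described above can also be scripted. This is a rough sketch, assuming you have already collected each sampled switch's running configuration as text (for example with a show command over Netmiko); the helper names pick_sample and missing_lines are hypothetical, and the expected lines are taken from the playbook. A device may render a command differently from how it was typed, so treat a reported miss as a prompt to look at the switch, not as proof of failure.

```python
import random

# Lines the playbook is expected to leave in the running configuration.
EXPECTED_LINES = [
    "no logging console",
    "no logging monitor",
    "logging logfile messages 6 size 16384",
    "logging timestamp milliseconds",
]


def pick_sample(hosts, count, seed=None):
    """Pick up to `count` hosts at random for a manual or scripted spot check."""
    rng = random.Random(seed)
    return rng.sample(hosts, min(count, len(hosts)))


def missing_lines(running_config, expected=EXPECTED_LINES):
    """Return the expected lines that do not appear in the running configuration."""
    present = {line.strip() for line in running_config.splitlines()}
    return [line for line in expected if line not in present]


if __name__ == "__main__":
    sample_config = "hostname sw1\nno logging console\n  logging timestamp milliseconds\n"
    print(pick_sample(["sw1", "sw2", "sw3", "sw4"], 2))
    print(missing_lines(sample_config))
```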

SYNTAX:

ansible-playbook nexus.yml --syntax-check {Checks the YAML file syntax}

ansible-playbook nexus.yml -C {Dry run}

ansible-playbook nexus.yml {execute playbook}

Here are the screenshots – Output of ansible-playbook execution

Notice that changed is set to 1 and that unreachable and failed are 0, indicating successful execution
Switches shown in red failed to apply the config changes due to credential or connectivity issues
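With 700+ hosts, reading the recap by eye gets tedious. Below is a small sketch that parses the PLAY RECAP section of saved ansible-playbook output and lists the hosts that need a retry. It assumes plain, uncoloured output; the sample text and the function names parse_recap and failed_hosts are made up for illustration.

```python
import re

# One recap line looks like: "sw1 : ok=2 changed=1 unreachable=0 failed=0"
RECAP_RE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<stats>(?:\w+=\d+\s*)+)$")

SAMPLE_OUTPUT = """\
PLAY RECAP *********************************************************
sw1                        : ok=2    changed=1    unreachable=0    failed=0
sw2                        : ok=0    changed=0    unreachable=1    failed=0
"""


def parse_recap(output):
    """Return {host: {counter: value}} for every line after PLAY RECAP."""
    results = {}
    in_recap = False
    for line in output.splitlines():
        if line.startswith("PLAY RECAP"):
            in_recap = True
            continue
        if not in_recap:
            continue
        match = RECAP_RE.match(line.strip())
        if match:
            stats = dict(pair.split("=") for pair in match.group("stats").split())
            results[match.group("host")] = {k: int(v) for k, v in stats.items()}
    return results


def failed_hosts(results):
    """Hosts with at least one failed task or that were unreachable."""
    return sorted(
        host for host, stats in results.items()
        if stats.get("failed", 0) or stats.get("unreachable", 0)
    )


if __name__ == "__main__":
    print(failed_hosts(parse_recap(SAMPLE_OUTPUT)))
```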

SSHUTTLE – connects the control node, where Rundeck, the Python scripts and Ansible are installed, through the jumphost to the various subnets in which the switches reside.

SYNTAX:

sshuttle -r <username>@<hostname or IP> <Subnet 1 IP> <Subnet 2 IP> <Subnet 3 IP> <Subnet n IP> -x <hostname or IP>

Notice that iptables rules are added automatically to enable SSH connectivity to switches
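If the list of subnets is long, the sshuttle command line can be assembled from a list. This is only a sketch: the function name build_sshuttle_cmd and the example addresses are placeholders, and it uses just the -r and -x options shown in the syntax above.

```python
import shlex


def build_sshuttle_cmd(user, jumphost, subnets, exclude=None):
    """Build the sshuttle argument list: -r user@jumphost, subnets, then -x excludes."""
    cmd = ["sshuttle", "-r", "{}@{}".format(user, jumphost)]
    cmd.extend(subnets)
    for host in exclude or []:
        cmd.extend(["-x", host])
    return cmd


if __name__ == "__main__":
    # Placeholder values; replace them with your jumphost and switch subnets.
    cmd = build_sshuttle_cmd(
        "admin", "jump.example.com",
        ["10.1.0.0/16", "10.2.0.0/16"],
        exclude=["jump.example.com"],
    )
    print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be passed to subprocess.Popen to start the tunnel from a script.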

Here is the Python script used, along with Rundeck, to automate configuration changes on Cisco IOS switches.

'''
This is a Python script to apply changes to IOS XR switches as per PCI audit remediation.

To run this script, please make sure the switch is reachable on the SSH port.

'''
__author__ = "Vinay Umesh"
__copyright__ = "Copyright 2019, Virtustream, Dell Technologies."
__version__ = "1.0.0"
__maintainer__ = "Core Services Engineering"
__email__ = "vinay.umesh@virtustream.com"
__status__ = "Development"

from netmiko import ConnectHandler  # connect to cisco switches and execute cmds
from datetime import datetime  # Date and time module
import os  # Native OS operations and management
import logging  # Default Python logging module
import argparse  # Pass arguments
import getpass  # get password

# create a log file with system date and time stamp
logfile_ = datetime.now().strftime('nexus_switches_remediation_%H_%M_%d_%m_%Y.log')
date_ = datetime.now().strftime('%H_%M_%d_%m_%Y')


def check_arg(args=None):
    """Parse the command-line arguments and return (source, user)."""
    parser = argparse.ArgumentParser(
        description='Script to amend changes to Nexus switches as per PCI audit remediation')
    parser.add_argument('-s', '--source',
                        help='Source filename in CSV format required',
                        required=True)  # boolean, not the string 'True'
    parser.add_argument('-u', '--user',
                        help='Username required', required=True)
    results = parser.parse_args(args)
    return (results.source, results.user)


src, user = check_arg()
# Get password; stop here if it cannot be read, because pwd is needed below
try:
    pwd = getpass.getpass()
except Exception as error:
    print('ERROR', error)
    raise SystemExit(1)

logger = logging.getLogger('IOSXR_PCI_Audit')
# set logging level
# logging.basicConfig(level=logging.INFO) # Python 2.x syntax
# toggle between DEBUG and INFO to see the difference
logger.setLevel(logging.DEBUG)
# logger.setLevel(logging.INFO)

# create file handler which logs even debug messages

# Create 'logs' folder if not exists. Change the path of logdir as your git folder

logdir = "logs/"
if not os.path.exists(logdir):
    os.makedirs(logdir)

fh = logging.FileHandler(logdir + logfile_)
fh.setLevel(logging.DEBUG)

# create formatter and add it to the handlers
formatter = logging.Formatter('%(asctime)s | %(name)s | %(levelname)s | %(message)s')
fh.setFormatter(formatter)

# add the handlers to the logger
logger.addHandler(fh)

logger.info('IOS XR switch PCI Audit remediation script started @ {}'.format(date_))

# Open the file having list of IPs of Cisco IOS switches
with open(src, 'r') as lines:
    logger.info('Read source file with device details')
    lines = list(lines)  # convert file object to list object
    del(lines[0])  # skip the header row
    for line in lines:
        value = line.strip().split(',')  # strip the trailing newline so the last field is clean
        dc = value[0]  # dc
        sw = value[1]  # switch name
        ip = value[2]  # ip
        print('Switch DC: {}'.format(dc))
        print('Switch Name: {}'.format(sw))
        print('Switch IP: {}'.format(ip))
        logger.info(' DC - {}  Switch Name - {} IP - {}'.format(dc, sw, ip))
        un = user
        pw = pwd

        #  Connecting to the switch
        try:
            # Default value is 2 else change to 4
            net_connect = ConnectHandler(device_type='cisco_ios', ip=ip, username=un, password=pw,
                                         global_delay_factor=2)
            #  show version of the switch
            ver = net_connect.send_command("show version")
            logger.info('Switch version details :\n {}'.format(ver))
            print('Show version command executed:\n{}'.format(ver))
            #  Change to config term mode
            net_connect.config_mode()
            #  Configuration config_commands
            config_commands = ['no service ipv4 tcp-small-servers',
                               'no service ipv4 udp-small-servers',
                               'no service ipv6 tcp-small-servers',
                               'no service ipv6 udp-small-servers',
                               'no http server',
                               'no tftp ipv4 server',
                               'no tftp ipv6 server',
                               'no dhcp ipv4',
                               'no dhcp ipv6',
                               'line console exec-timeout 10 0',
                               'logging console disable',
                               'logging monitor disable',
                               'no ipv4 source-route',
                               'end', 'copy running-config startup-config']
            #  Run config commands
            config = net_connect.send_config_set(config_commands)
            config += net_connect.send_command('\n', expect_string=r'#', delay_factor=2)
            logger.info('Switch Configuration Output :\n {}'.format(config))
            print('Config commands executed successfully:\n{}'.format(config))

            #  Show history of commands for IOS XR
            history2 = net_connect.send_command("show history")
            print('Show history for nexus command executed successfully: \n {}'.format(history2))
            logger.info('Switch-Nexus history output :\n {}'.format(history2))

            #  Exit from the switch
            net_connect.disconnect()

        except Exception as e:
            print('Error occurred while connecting to switch - {}  IP - {}: \n {}'.format(sw, ip, e))
            logger.error('Unable to connect to the switch - {} IP - {} :\n {}'.format(sw, ip, e))

SYNTAX:

python3 ios.py -s <inventory> -u <switch username>

Script prompts for password
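The script expects the source file to have a header row followed by rows of DC, switch name and IP. The sketch below reads such a file with Python's csv module instead of splitting lines by hand. The header names and sample rows are invented for illustration, and read_inventory is a hypothetical helper, not part of the original script.

```python
import csv
import io

# Invented sample in the layout the script reads: header, then dc, switch name, ip.
SAMPLE_CSV = """dc,switch,ip
DC1,sw-core-01,192.0.2.10
DC2,sw-edge-07, 192.0.2.20

"""


def read_inventory(handle):
    """Return a list of {'dc', 'switch', 'ip'} dicts, skipping the header and short rows."""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row
    devices = []
    for row in reader:
        if len(row) < 3:
            continue  # blank or incomplete line
        dc, name, ip = (field.strip() for field in row[:3])
        devices.append({"dc": dc, "switch": name, "ip": ip})
    return devices


if __name__ == "__main__":
    for device in read_inventory(io.StringIO(SAMPLE_CSV)):
        print(device)
```

With a real file, pass an open file handle, for example open(src, newline='').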

We can use either Ansible or Python with Netmiko to automate configuration changes on Cisco switches.

One of the SMEs from the Network Engineering team reacted to this automation as shown below. I’m glad to see that folks are embracing AUTOMATION.

I hope this use case helps you understand how to automate Cisco switch configuration and operational tasks. Please leave your feedback if you found this blog useful, and share suggestions in the Comments section below.

Image Courtesy and References:

https://blogs.cisco.com/datacenter/ansible-support-for-ucs-and-nexus

https://docs.ansible.com/ansible/latest/modules/nxos_config_module.html

https://docs.ansible.com/ansible/2.3/ios_config_module.html

Integrate Rundeck notifications with Slack

There are many plugins available to integrate Rundeck with Slack. In this blog, I explain in simple steps how to configure Rundeck job notifications with Slack. It is useful for use cases such as auditing, monitoring and maintaining logs of Rundeck job executions.

I spent many hours searching for and fixing unexpected errors and issues in different versions of Rundeck, all of which are covered and documented as simple steps in this blog.

I’m using Higanworks’ plugin, which can be downloaded from GitHub.

Advantages:

  • Single window to view all notifications from Rundeck {better than a pile of emails}
  • Multiple users/groups can be notified by adding them to the notifications channel {no need to mess with distribution lists etc}
  • Logging and auditing are made easier by the powerful search options available in Slack

Requirements:

  • Rundeck 2.10.x or above {Running on CentOS 7}
  • openjdk version “1.8.0_171”
    OpenJDK Runtime Environment (build 1.8.0_171-b10)
    OpenJDK 64-Bit Server VM (build 25.171-b10, mixed mode)
  • Rundeck-slack-incoming-webhook-plugin v.0.6.dev or above
  • Working Slack user account
  • Dedicated channel for Rundeck notifications with webhook app enabled

Pre-Installation Steps

  • Create a new private channel in Slack {e.g: rundeck_notifications}
  • Generate a webhook URL for the newly created channel. Refer to the Slack guide

Install and Configuration Steps

By now the server should be ready, with Rundeck installed and the plugin downloaded from GitHub. Make sure the Rundeck server has internet access so that it can connect and send messages to Slack.

  • Copy the rundeck-slack-incoming-webhook-plugin-x.y.z.jar file to Rundeck’s libext directory {/var/lib/rundeck/libext}

  • After the above file is placed in the libext directory, Rundeck automatically configures the plugin and no further user action is required. There is no need to restart the Rundeck service. Please refer to the Rundeck Plugins Installation Guide for further details.
  • A new option, ‘Slack Incoming Webhook’, becomes available when creating or editing jobs in Rundeck. Paste the webhook URL generated for the new rundeck_notifications channel there

  • Sample output for reference. Slack channel rundeck_notifications showing notifications generated by Rundeck Job executions
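If notifications do not show up, it is worth ruling out the webhook itself before digging into Rundeck. Here is a minimal sketch that posts a test message to a Slack incoming webhook using only the Python standard library, run from the Rundeck server. The function names are hypothetical and the URL in the example is a placeholder, not a working webhook.

```python
import json
import urllib.request


def build_slack_request(webhook_url, text):
    """Build a POST request carrying a simple {"text": ...} JSON payload."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_test_message(webhook_url, text="Test message from the Rundeck server"):
    """Send the message and return the HTTP status code (200 means Slack accepted it)."""
    request = build_slack_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


if __name__ == "__main__":
    # Placeholder URL: paste the webhook generated for your channel and call
    # send_test_message(url) to actually post.
    url = "https://hooks.slack.com/services/T000/B000/XXXX"
    print(build_slack_request(url, "hello").get_method())
```

If the test message arrives in the channel but Rundeck notifications do not, the problem is more likely in the job's notification settings than in network access.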

References:

https://github.com/higanworks/rundeck-slack-incoming-webhook-plugin

http://rundeck.org/docs/plugins-user-guide/installing.html

https://api.slack.com/incoming-webhooks

Image Courtesy:

https://github.com/higanworks/rundeck-slack-incoming-webhook-plugin

If you face any problems, let me know through the ‘Comments’ section below and I will try my best to help you.

Rundeck SSL Configuration

Rundeck is open-source software that automates routine operational procedures in data center or cloud environments. This blog explains how to configure SSL on Rundeck for secure communication over an intranet or the internet. It is a reference for configuring SSL for Rundeck running on Linux (CentOS/Debian).

Phase 1:

Steps to generate self-signed PKCS#12 SSL certificate and export its keys:

  • Create PKCS#12 keystore (.pfx file)
#keytool -genkeypair -keystore myKeystore.pfx -storetype PKCS12 -storepass password -alias KEYSTORE_ENTRY -keyalg RSA -keysize 2048 -validity 99999 -dname "CN=My SSL Certificate, OU=Sustaining, O=Virtustream, L=McLean, ST=VA, C=US" -ext san=dns:servername.com,dns:localhost,ip:127.0.0.1,ip:xx.xx.xx.xx

Replace servername.com with the FQDN of the Rundeck server and xx.xx.xx.xx with the Rundeck server’s IP address.
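The -ext san= value is easy to mistype when several names and addresses are involved. The small sketch below assembles it from lists; build_san is a hypothetical helper and the host name and address are placeholders.

```python
def build_san(dns_names, ip_addresses):
    """Return the value for keytool's -ext option, e.g. san=dns:host,ip:127.0.0.1."""
    entries = ["dns:{}".format(name) for name in dns_names]
    entries += ["ip:{}".format(address) for address in ip_addresses]
    return "san=" + ",".join(entries)


if __name__ == "__main__":
    # Placeholder values; use your Rundeck server's FQDN and IP address.
    print(build_san(["rundeck.example.com", "localhost"], ["127.0.0.1", "192.0.2.15"]))
```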

Setup Active Directory Authentication for Rundeck

Rundeck is a simple workflow and automation tool that is easy to set up. By default, it comes with local user accounts. Rundeck supports LDAP, AD, PAM and pre-auth methods. The downside is that Rundeck’s documentation on configuring LDAP/AD-based authentication is not that great.

After multiple attempts and a whole day spent searching the internet, I was able to configure AD authentication. Here are the simple steps for Rundeck AD auth configuration.

Add a remote node in Rundeck

To add a remote node in Rundeck, we need SSH connectivity (usually port 22) and SSH key-based authentication between the Rundeck server and the client. As a first step, set up key-based authentication.

After setting up key-based auth, test the SSH connectivity. Then copy the id_rsa file from your home directory to the path below.

# cp /home/<USERNAME>/.ssh/id_rsa /var/lib/rundeck/.ssh/id_rsa

The third step is to add the node details to the resources.xml file.

Path – /var/rundeck/projects/VEC-Storage/etc/resources.xml. Open it with your favorite editor {nano} and add entries as shown below. By default, the Rundeck server itself is already present as the first node definition.

<?xml version="1.0" encoding="UTF-8"?>

<project>
<node name="SAS" description="Rundeck server node" tags="RDS" hostname="<name/IP>" osArch="amd64" osFamily="unix" osName="Linux" osVersion="4.9.0-2-amd64" username="rundeck"/>
<node name="Name" description="Windows Jump2" tags="JMP" hostname="<name/IP>" osFamily="windows" username="user" ssh-keypath="/var/lib/rundeck/.ssh/id_rsa"/>
<node name="Name" description="SYMCLI SRV" tags="SYM" hostname="<name/IP>" osArch="amd64" osFamily="unix" osName="Linux" username="user"/>
<node name="Name" description="SYMCLI SRV" tags="SYM" hostname="<name/IP>" osArch="amd64" osFamily="unix" osName="Linux" username="user"/>
<node name="Name" description="SYMCLI SRV" tags="SYM" hostname="<name/IP>" osArch="amd64" osFamily="unix" osName="Linux" username="user"/>
<node name="Name" description="SYMCLI SRV" tags="SYM" hostname="<name/IP>" osArch="amd64" osFamily="unix" osName="Linux" username="user"/>
<node name="Name" description="SYMCLI SRV" tags="SYM" hostname="<name/IP>" osArch="amd64" osFamily="unix" osName="Linux" username="user"/>
<node name="Name" description="Windows Jump1" tags="JMP1" hostname="<name/IP>" osFamily="windows" username="user" ssh-keypath="/var/lib/rundeck/.ssh/id_rsa"/>
<node name="Name" description="SYMCLI SRV" tags="SYM" hostname="<name/IP>" osArch="amd64" osFamily="unix" osName="Linux" username="user"/>
</project>

Multiple node definitions of both OS types (*nix and Windows) have been added to Rundeck. Tags come in handy when defining node attributes, as they help group similar types of clients. In the example above, the SYM tag indicates that the client is an EMC VMAX management host running SYMCLI and Unisphere for VMAX services.
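To see which nodes a tag will select before running a command against them, the resources file can be parsed with Python's standard xml.etree.ElementTree. This is a sketch under assumptions: the sample nodes, their names and addresses are invented, nodes_with_tag is a hypothetical helper, and real hostname values must be actual names or IPs for the file to parse as XML.

```python
import xml.etree.ElementTree as ET

# Invented sample in the same shape as resources.xml (XML declaration omitted).
SAMPLE_RESOURCES = """
<project>
  <node name="sym01" description="SYMCLI SRV" tags="SYM" hostname="192.0.2.31" osFamily="unix" username="user"/>
  <node name="sym02" description="SYMCLI SRV" tags="SYM,prod" hostname="192.0.2.32" osFamily="unix" username="user"/>
  <node name="jump01" description="Windows Jump1" tags="JMP1" hostname="192.0.2.41" osFamily="windows" username="user"/>
</project>
"""


def nodes_with_tag(xml_text, tag):
    """Return the names of all nodes whose comma-separated tags include `tag`."""
    root = ET.fromstring(xml_text.strip())
    matches = []
    for node in root.findall("node"):
        tags = [t.strip() for t in node.get("tags", "").split(",") if t.strip()]
        if tag in tags:
            matches.append(node.get("name"))
    return matches


if __name__ == "__main__":
    print(nodes_with_tag(SAMPLE_RESOURCES, "SYM"))
```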

To run a command across all nodes with a given tag, simply type in the tag name and run the command from Rundeck > Menu Bar > Commands.

Rundeck then shows the output of the command from all the nodes.

Troubleshooting: In case of SSH connectivity issues, edit the sudoers file (visudo) and uncomment the line below. Before making changes, check with your administrator whether this is in line with your standard policies.

Defaults !requiretty