Mass network connection tracker

At times, you want to keep track of network connections between multiple hosts on different TCP ports.

For example, a medium- or large-scale infrastructure comprises many applications, databases, and devices, all of which connect to each other on TCP ports. When a routing or firewall rule changes unannounced, it is very difficult to find where connections break.

To track connections between applications, databases, and devices, I have written a simple connection tracker along with a small agent. The agent can be compiled with pyinstaller and deployed on each source node from which you want to check connections to target hosts. The main tracker code runs on one master host: it reads the “source nodes” from a JSON config, logs in to each one over SSH, and executes the agent there to check connections to that node's target host:port pairs.

Probably, this config.json explains it better:


{
   "maxhosts": 5,
   "credsets": {
                 "manish": { "username": "manish", "password": "xxxxx" },
                 "cred1": { "username": "ubuntu", "password": "123456" },
                 "cred2": {"username": "ubuntu", "password": "abcd1234"}
             },
   "nodes": {
               "10.91.142.21": {"type": "linux", "credset": "cred1", "targets": ["10.91.142.21:22", "10.91.118.10:80", "192.168.1.19:2770"]},
               "10.91.118.31": {"type": "linux", "credset": "cred2", "targets": ["10.91.142.21:8443", "10.91.142.21:22", "192.168.1.19:2770"]}
            },
   "agent": "/opt/agent/pyconnagent",
   "logging": {
                "logfile": "/tmp/trace.logs",
                "report": "/tmp/report.csv"
              }
}

To speed up execution, set “maxhosts” to the number of source nodes the tracker should process in parallel. The tracker stores the connection report and the trace logs (SSH messages) at the “report” and “logfile” paths, respectively, as set in config.json.
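
Since every node refers to a credential set by name and every target is a "host:port" string, a quick sanity check of the config before a run can save a round of failed logins. The following is only a sketch and not part of the tracker itself; the function name validate_config and the specific checks are my own assumptions about what is worth verifying.

```python
import json


def validate_config(cfg):
    """Return a list of problems found in a tracker config dict (hypothetical helper)."""
    problems = []
    credsets = cfg.get("credsets", {})
    for node, spec in cfg.get("nodes", {}).items():
        # Every node must point at a credential set that actually exists
        credset = spec.get("credset")
        if credset not in credsets:
            problems.append("{}: unknown credset {!r}".format(node, credset))
        # Every target must look like host:port with a valid TCP port
        for target in spec.get("targets", []):
            host, sep, port = target.rpartition(":")
            if not (sep and host and port.isdecimal() and 0 < int(port) < 65536):
                problems.append("{}: bad target {!r}".format(node, target))
    maxhosts = cfg.get("maxhosts")
    if not isinstance(maxhosts, int) or maxhosts < 1:
        problems.append("maxhosts must be a positive integer")
    return problems


if __name__ == "__main__":
    # Inline sample with two deliberate mistakes: a missing credset and a target without a port
    sample = json.loads("""
    {
      "maxhosts": 5,
      "credsets": {"cred1": {"username": "ubuntu", "password": "123456"}},
      "nodes": {
        "10.91.142.21": {"type": "linux", "credset": "cred9",
                         "targets": ["10.91.118.10:80", "192.168.1.19"]}
      }
    }
    """)
    for problem in validate_config(sample):
        print(problem)
```

An empty list means the checks passed; it does not prove that the credentials or the agent path are correct on the nodes.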

Here is the agent code:


#!/usr/local/bin/python3
import socket
import sys

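# Accept HOST PORT [TIMEOUT]; the timeout defaults to 10 seconds
args = sys.argv[1:]
if len(args) == 3:
    host, port, timeout = args[0], int(args[1]), int(args[2])
elif len(args) == 2:
    host, port = args[0], int(args[1])
    timeout = 10
else:
    print("usage error: expected HOST/IP PORT [TIMEOUT]")
    # sys.exit is the reliable choice in scripts; the bare exit() helper
    # comes from the site module and is meant for interactive use
    sys.exit(1)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(timeout)
try:
    s.connect((host, port))
    print("Connection works")
except Exception as e:
    print("Connection failed", e)
finally:
    # Close the socket whether or not the connection succeeded
    s.close()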

Not every “source node” may have Python installed, so you should compile the agent above into a single executable binary:

pyinstaller pyconnagent.py --onefile

The above command generates the binary, pyconnagent, in the “dist” directory. Copy it to every source node at the path given by “agent” in config.json (this is a one-time pain).

Here is the main code, which logs in to each “source node”, executes the agent there, and collects each connection status in a report.


#!/usr/local/bin/python3
from netmiko import ConnectHandler
from multiprocessing import Pool
import os
import json
from time import strftime, localtime
import sys

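# Expect exactly one argument of the form --play=<json config>
try:
    param, value = sys.argv[1].split('=', 1)
    if param != "--play":
        raise ValueError(param)
    playconf = value
except (IndexError, ValueError):
    print("Usage: {} --play=<json config>".format(sys.argv[0]))
    sys.exit(1)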

with open(playconf) as cfg:
  cfgdata = json.load(cfg)

def CommitLogs(LogMessage):
    # Append one line to the trace log named in the config
    logfile = cfgdata.get('logging').get('logfile')
    try:
        with open(logfile, "a") as fopen:
            fopen.write(LogMessage + "\n")
    except OSError:
        print("Failed to write", LogMessage, "to", logfile)

def ReportLog(CSVMsg):
    # Append one semicolon-separated line to the report named in the config
    logfile = cfgdata.get('logging').get('report')
    try:
        with open(logfile, "a") as fopen:
            fopen.write(CSVMsg + "\n")
    except OSError:
        print("Failed to write", CSVMsg, "to", logfile)

def checkconnFromNodes(device, targets):
    # Log in to one source node and run the agent against each of its targets
    net_connect = None
    try:
        net_connect = ConnectHandler(**device)
        try:
            LogMessage = strftime("%Y-%m-%d %H:%M:%S", localtime())+" "+device.get('host')+" Logged in..."
            print(LogMessage)
            CommitLogs(LogMessage)
            # A 'secret' in the device dict means elevated (enable) access is wanted
            if device.get('secret') is not None:
                net_connect.enable()
                LogMessage = strftime("%Y-%m-%d %H:%M:%S", localtime())+" "+device.get('host')+" Enabled Elevated access...."
                print(LogMessage)
                CommitLogs(LogMessage)
            sshstatus = 0
        except Exception:
            LogMessage = strftime("%Y-%m-%d %H:%M:%S", localtime())+" "+device.get('host')+";Cannot gain elevated access...."
            print(LogMessage)
            CommitLogs(LogMessage)
            sshstatus = -1
    except Exception:
        LogMessage = strftime("%Y-%m-%d %H:%M:%S", localtime())+" "+device.get('host')+";SSH failed as "+device.get('username')
        print(LogMessage)
        CommitLogs(LogMessage)
        sshstatus = -1

    if sshstatus == 0:
        ncCmd = cfgdata.get('agent')
        for target in targets:
            desthost, destport = target.split(":")
            cmd = "{} {} {}".format(ncCmd, desthost, destport)
            print(desthost, destport, cmd)
            try:
                # The agent's printed message becomes the last report column
                status = net_connect.send_command(cmd)
                CSVMsg = "{};{};{};{}".format(device.get('host'), desthost, destport, status.replace('\n',' '))
            except Exception:
                CSVMsg = "{};{};{};{}".format(device.get('host'), desthost, destport, "Command Failed")
            print(CSVMsg)
            ReportLog(CSVMsg)

    # Close the SSH session if one was opened
    if net_connect is not None:
        net_connect.disconnect()

def hostcmd(host):
    # Build the netmiko device dict for one source node and check its targets
    hostdata = cfgdata.get('nodes').get(host)
    targets = hostdata.get('targets')
    creds = cfgdata.get('credsets').get(hostdata.get('credset'))

    device = {
        'device_type': hostdata.get('type'),
        'host': host,
        'port': hostdata.get('port') or 22,  # SSH port defaults to 22
        'username': creds.get("username"),
        'password': creds.get("password"),
    }
    # "enpass" is optional; when present it is passed as the enable secret
    enpass = creds.get("enpass")
    if enpass is not None:
        device['secret'] = enpass
    checkconnFromNodes(device, targets)

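# The guard keeps worker processes from re-creating the pool when the
# module is re-imported (as happens with the "spawn" start method)
if __name__ == "__main__":
    hostlist = list(cfgdata.get('nodes').keys())
    maxhosts = cfgdata.get('maxhosts')
    # Process up to maxhosts source nodes in parallel
    with Pool(maxhosts) as p:
        p.map(hostcmd, hostlist)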

Let's give it a run:

./pyconn.py --play=config.json

The report shows the source host, destination IP/host, destination port, and connection message:


cat /tmp/report.csv
10.91.118.31;10.91.142.21;8443;Connection failed [Errno 111] Connection refused
10.91.142.21;10.91.142.21;22;Connection works
10.91.118.31;10.91.142.21;22;Connection works
10.91.142.21;10.91.118.10;80;Connection works
10.91.118.31;192.168.1.19;2770;Connection failed [Errno 111] Connection refused
10.91.142.21;192.168.1.19;2770;Connection failed [Errno 111] Connection refused
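
Because each report line is just four semicolon-separated fields, it is easy to post-process. As a rough sketch (the function name failed_connections is hypothetical, and treating any message other than "Connection works" as a failure is my assumption), the following groups the failed checks by source host:

```python
from collections import defaultdict


def failed_connections(lines):
    """Group failed checks by source host from report lines (hypothetical helper)."""
    failures = defaultdict(list)
    for line in lines:
        # Split into at most four fields so a ';' inside the message is preserved
        fields = line.strip().split(";", 3)
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        source, dest, port, message = fields
        # Assumption: anything other than the agent's success message is a failure
        if not message.startswith("Connection works"):
            failures[source].append((dest, port, message))
    return dict(failures)


if __name__ == "__main__":
    # Sample lines copied from the report above; a real run would read the report file
    report = [
        "10.91.118.31;10.91.142.21;8443;Connection failed [Errno 111] Connection refused",
        "10.91.142.21;10.91.142.21;22;Connection works",
        "10.91.142.21;192.168.1.19;2770;Connection failed [Errno 111] Connection refused",
    ]
    for source, items in failed_connections(report).items():
        for dest, port, message in items:
            print("{} -> {}:{} {}".format(source, dest, port, message))
```

To run it against a real report, pass an open file object instead of the list, since the function only needs an iterable of lines.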

Trace log:


cat /tmp/trace.logs
2020-06-08 18:54:29 192.168.0.18;SSH failed as manish
2020-06-08 18:54:33 10.91.118.31 Logged in...
2020-06-08 18:54:34 10.91.142.21 Logged in...