dmx

Reputation: 1990

Why is my Python program slowing down my computer?

I have a small Python program that collects probe requests and sends the SSIDs and MACs to a server, but it slows down my computer after a few minutes. I tried adding a dict so that I make a POST only when necessary, but the problem remains: after a few minutes, my computer slows down. I also tried it on a Raspberry Pi and the result is the same.

Please tell me what is wrong here.

This is the code:

#!/usr/bin/env python
from scapy.all import *
import json
import sys
import traceback
import requests


macSsid = {}
def handler(pkt):
   url = 'http://10.10.10.10:3000/ssids'
   headers = {'content-type': 'application/json'}
   if pkt.haslayer(Dot11):
      if pkt.type == 0 and pkt.subtype == 4:
         if pkt.info :
            print "Client MAC = %s probe request =%s" % (pkt.addr2, pkt.info)
            if pkt.addr2 not in  macSsid:
               macSsid[pkt.addr2] = []
               macSsid[pkt.addr2].append(pkt.info)
               r = requests.post(url, data = json.dumps({"mac": pkt.addr2, "ssid": pkt.info }), headers = headers)
            else:
               if pkt.info not in  macSsid[pkt.addr2]:
                  macSsid[pkt.addr2].append(pkt.info)
                  r = requests.post(url, data = json.dumps({"mac": pkt.addr2, "ssid": pkt.info }), headers = headers)


while 1:
    try:
       sniff(iface="mon0", prn = handler)
    except Exception, err:
       # Print the traceback of the exception that was just caught
       traceback.print_exc()


Upvotes: 0

Views: 4158

Answers (2)

Matt Jordan

Reputation: 2181

I believe it is caused by this line:

   if pkt.info not in  macSsid[pkt.addr2]:

As the list referenced by macSsid[pkt.addr2] grows, searching it becomes expensive. It is a list, not a dictionary, so there is no index: Python has to check each element in order until it either finds a match or reaches the end of the list.

I recommend changing it to a dictionary for faster lookups:

   macSsid[pkt.addr2] = {}
   macSsid[pkt.addr2][pkt.info] = 1   # the 1 is just a placeholder,
     # so if you want to store e.g. a repeat count, this would allow it

Make the corresponding change where pkt.info is added in the else clause.
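
Putting both branches together, the dictionary-based bookkeeping might look like the sketch below. The helper name record_probe and the use of the stored value as a repeat count are illustrative assumptions, not part of the original code; the handler would make the POST only when the helper returns True.

```python
# Hypothetical helper: record_probe is an illustrative name, not from the original code.
macSsid = {}  # maps MAC address -> {ssid: repeat count}

def record_probe(mac, ssid):
    """Return True the first time a (mac, ssid) pair is seen, else False."""
    ssids = macSsid.setdefault(mac, {})  # create the inner dict on first sight of this MAC
    if ssid in ssids:                    # hash lookup, no scan over earlier SSIDs
        ssids[ssid] += 1                 # keep a repeat count, as suggested above
        return False
    ssids[ssid] = 1
    return True
```

Using setdefault removes the need for separate "new MAC" and "known MAC" branches, so the POST call only has to appear once in the handler.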

Changing to an indexed type will allow index-based lookups, rather than iterating the list. I believe there are indexed-list types in Python, but I don't use them much, since the extra storage for metrics or similar data has always been a benefit in addition to the indexed lookups.

Your use-case is probably exaggerating the effect as well, since I suspect you will have a high probability of not finding a match for the pkt.info, and therefore each time it searches the list, it will have to check every element before concluding that it doesn't exist.

Consider the cost of adding the first 1000 elements to a list: each new element requires searching all previous elements before it is added, for roughly half a million comparisons (1000 × 999 / 2 = 499,500) on the new elements alone, not counting any repeated elements that you don't add. Elements that already exist could require checking more than the statistical mean if the network packets use sequential identifiers rather than completely random identifiers, which is the case with most network protocols.
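
The half-million figure can be checked with a short count. This is a worst-case sketch that assumes every one of the 1000 elements is new, so each membership test scans the whole list built so far; the function name is made up for illustration.

```python
def comparisons_for_unique_inserts(n):
    """Count element comparisons when n distinct items are added to a list,
    each preceded by a full 'not in' scan of the items already stored."""
    total = 0
    for already_stored in range(n):
        total += already_stored  # a miss compares against every stored item
    return total

print(comparisons_for_unique_inserts(1000))  # 499500
```

With a dictionary, each of those membership tests is a single hash lookup on average, so the count grows linearly with the number of elements instead of quadratically.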

Note also that this will seem to slow down suddenly at the point when the list will no longer fit entirely in CPU cache. At that point, your performance will usually drop suddenly by anywhere from 10x to 100x, depending on the specs of your machine.

This appears to affect the rest of your computer because the CPU is blocked while waiting for each memory fetch to complete, and the constant memory fetches slow down your memory controller. The core that is running your Python program will be blocked entirely, and your memory bus will be nearly saturated, causing other applications to seem slow as well.

Upvotes: 1

pta2002

Reputation: 185

Your program is using too much memory. This is because you aren't clearing your macSsid dictionary. If you delete the entries after you use them (using del), your program will use less memory and thus stop slowing down your computer.
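
One way to keep the dictionary from growing without bound is to cap its size and evict the oldest entries. The sketch below is an assumption about how that could be done, not code from the question: the class name BoundedSeen, the max_entries limit, and the choice of collections.OrderedDict are all illustrative. Note that an evicted pair will be reported again if it is seen later.

```python
from collections import OrderedDict

class BoundedSeen(object):
    """Remember at most max_entries (mac, ssid) pairs, oldest evicted first."""

    def __init__(self, max_entries=10000):
        self.max_entries = max_entries  # hypothetical limit, tune to available memory
        self._seen = OrderedDict()

    def add(self, mac, ssid):
        """Return True if the pair is new (caller should POST), else False."""
        key = (mac, ssid)
        if key in self._seen:
            return False
        self._seen[key] = True
        if len(self._seen) > self.max_entries:
            self._seen.popitem(last=False)  # drop the oldest inserted pair
        return True
```

The handler would then call add(pkt.addr2, pkt.info) and make the POST only when it returns True.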

Upvotes: 0
