Reputation: 1061
I wrote a script that logs MAC addresses from pcapy into MySQL through SQLAlchemy. I initially used straight sqlite3, but soon realized that something better was required, so this past weekend I rewrote all the database talk to use SQLAlchemy. All works fine; data goes in and comes out again. I thought sessionmaker() would be very useful to manage all the sessions to the DB for me.
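For reference, the SQLAlchemy plumbing looks roughly like this (a sketch; the table and column names here are illustrative, not my exact schema):

from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class Client(Base):
    # Illustrative model, not my exact schema.
    __tablename__ = 'clients'
    id = Column(Integer, primary_key=True)
    mac_addr = Column(String(17))   # e.g. 'aa:bb:cc:dd:ee:ff'
    timestamp = Column(String(32))
    ssid = Column(String(64))

engine = create_engine('mysql://user:password@localhost/sniffdb')
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)  # one factory; a session per unit of work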
I see a strange occurrence with regard to memory consumption. I start the script... it collects and writes everything to the DB... but every 2-4 seconds memory consumption increases by a megabyte. At the moment I'm talking about very few records, sub-100 rows.
Script Sequence:
If true: only write the timestamp to the timestamp column where mac = newmac. Back to step 2.
If false: write the new mac to the DB, clear maclist[] and call step 2 again. (See the sketch below.)
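In rough Python, that flow looks something like this (function names are placeholders for my actual helpers):

def handle_packet(mac, timestamp):
    # Placeholder helper names, not my actual functions.
    maclist = populate_list_from_db()      # repopulate maclist[] from the DB
    if mac in maclist:
        update_timestamp(mac, timestamp)   # true: only update the timestamp column
    else:
        insert_new_mac(mac, timestamp)     # false: write the new mac to the DB
        maclist = []                       # clear maclist[]; back to step 2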
After 1h30m I have a memory footprint of 1027MB (RES) and 1198MB (VIRT), with 124 rows in the single-table MySQL database.
Q: Could this be attributed to maclist[] being cleared and repopulated from the DB every time?
Q: What's going to happen when it reaches the system's max memory?
Any ideas or advice would be great, thanks.
memory_profiler output for the segment in question, where the list gets populated from the database's mac_addr column:
Line # Mem usage Increment Line Contents
================================================
123 1025.434 MiB 0.000 MiB @profile
124 def sniffmgmt(p):
125 global __mac_reel
126 global _blacklist
127 1025.434 MiB 0.000 MiB stamgmtstypes = (0, 2, 4)
128 1025.434 MiB 0.000 MiB tmplist = []
129 1025.434 MiB 0.000 MiB matching = []
130 1025.434 MiB 0.000 MiB observedclients = []
131 1025.434 MiB 0.000 MiB tmplist = populate_observed_list()
132 1025.477 MiB 0.043 MiB for i in tmplist:
133 1025.477 MiB 0.000 MiB observedclients.append(i[0])
134 1025.477 MiB 0.000 MiB _mac_address = str(p.addr2)
135 1025.477 MiB 0.000 MiB if p.haslayer(Dot11):
136 1025.477 MiB 0.000 MiB if p.type == 0 and p.subtype in stamgmtstypes:
137 1024.309 MiB -1.168 MiB _timestamp = atimer()
138 1024.309 MiB 0.000 MiB if p.info == "":
139 1021.520 MiB -2.789 MiB _SSID = "hidden"
140 else:
141 1024.309 MiB 2.789 MiB _SSID = p.info
142
143 1024.309 MiB 0.000 MiB if p.addr2 not in observedclients:
144 1018.184 MiB -6.125 MiB db_add(_mac_address, _timestamp, _SSID)
145 1018.184 MiB 0.000 MiB greetings()
146 else:
147 1024.309 MiB 6.125 MiB add_time(_mac_address, _timestamp)
148 1024.309 MiB 0.000 MiB observedclients = [] #clear the list
149 1024.309 MiB 0.000 MiB observedclients = populate_observed_list() #repopulate the list
150 1024.309 MiB 0.000 MiB greetings()
You will see that observedclients is the list in question.
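For reference, populate_observed_list() does roughly this (a sketch, not my exact code; Session and Client are from the illustrative setup above):

def populate_observed_list():
    # Query every stored mac_addr; each row comes back as a 1-tuple,
    # which is why the caller indexes i[0] above.
    session = Session()
    try:
        return session.query(Client.mac_addr).all()
    finally:
        session.close()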
Upvotes: 9
Views: 2764
Reputation: 1061
I managed to find the actual cause of the memory consumption: it was Scapy itself. By default, Scapy stores every packet it captures, but you can disable that.
Disable:
sniff(iface=interface, prn=sniffmgmt, store=0)
Enable:
sniff(iface=interface, prn=sniffmgmt, store=1)
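Putting it together, a minimal self-contained version looks like this (the interface name and the trivial callback are just examples):

from scapy.all import Dot11, sniff

def sniffmgmt(p):
    # Example callback: just print the transmitter address.
    if p.haslayer(Dot11):
        print(p.addr2)

# store=0 stops Scapy from keeping every captured packet in memory;
# each packet is passed to the callback and then discarded.
sniff(iface="mon0", prn=sniffmgmt, store=0)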
Thanks to BitBucket Ticket
Upvotes: 18
Reputation: 1061
Thanks for the guidance, everyone. I think I managed to resolve the increasing memory consumption.
A: Code logic plays a very big role in memory consumption, as I have learned. If you look at the memory_profiler output in my initial question, I moved lines 131-133 into the IF statement at line 136. This seems to stop the memory from increasing so frequently. I now need to refine populate_observed_list() a bit more so it doesn't waste so much memory. (See the sketch below.)
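Roughly, the restructured handler looks like this (a sketch of the relevant part only; the helpers are the ones from my question):

def sniffmgmt(p):
    stamgmtstypes = (0, 2, 4)
    if p.haslayer(Dot11):
        if p.type == 0 and p.subtype in stamgmtstypes:
            _mac_address = str(p.addr2)
            _timestamp = atimer()
            _SSID = p.info if p.info != "" else "hidden"
            # Lines 131-133 moved here: the DB is now only queried for
            # management frames we actually care about.
            observedclients = [i[0] for i in populate_observed_list()]
            if p.addr2 not in observedclients:
                db_add(_mac_address, _timestamp, _SSID)
            else:
                add_time(_mac_address, _timestamp)
            greetings()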
Upvotes: 0
Reputation: 913
As you can see, the profiler output suggests you use less memory by the end, so it is not representative of your situation.
Some directions to dig deeper:
1) add_time (why is it increasing memory usage?)
2) db_add (why is it decreasing memory usage? caching? closing/opening the DB connection? what happens in case of failure?)
3) populate_observed_list (is the return value safe for garbage collection? maybe there are some packets for which a certain exception occurs?)
Also, what happens if you sniff more packets than your code is able to process due to performance?
I would profile these 3 functions and analyze possible exceptions/failures.
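For example, with memory_profiler you can decorate each of the three and compare the per-line numbers (a sketch; the body is a placeholder for your real one):

from memory_profiler import profile

@profile
def add_time(mac_address, timestamp):
    # Existing body unchanged; @profile prints per-line memory usage
    # every time the function runs. Decorate db_add and
    # populate_observed_list the same way.
    pass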
Upvotes: 1
Reputation: 15537
Very hard to say anything without the code; I'm assuming the leak is in your code rather than in SQLAlchemy or scapy (a library leak seems unlikely).
You seem to have an idea of where the leak might happen; do some memory profiling to see if you were right.
Once your Python process eats enough memory, you will probably get a MemoryError exception.
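If you want to fail gracefully instead of crashing mid-write, you can guard the top-level capture call (a sketch using the sniff call from the accepted answer; note the OS may also kill the process outright before Python can raise):

from scapy.all import sniff

interface = "mon0"  # example interface name; sniffmgmt is the callback from the question

try:
    sniff(iface=interface, prn=sniffmgmt, store=0)
except MemoryError:
    # Clean up what you can (e.g. flush/close the DB session) and exit,
    # rather than dying mid-transaction.
    print("Out of memory, shutting down")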
Upvotes: 0