Reputation: 1
I'm developing a logger daemon for Squid that stores the logs in a MongoDB database, but I'm seeing very high CPU utilization. How can I optimize this code?
from sys import stdin
from pymongo import Connection

connection = Connection()
db = connection.squid
logs = db.logs

buffer = []

# Short aliases for the document key names
a = 'timestamp'
b = 'resp_time'
c = 'src_ip'
d = 'cache_status'
e = 'reply_size'
f = 'req_method'
g = 'req_url'
h = 'username'
i = 'dst_ip'
j = 'mime_type'
L = 'L'

while True:
    l = stdin.readline()
    if not l:  # EOF: readline() returns '', so check before indexing l[0]
        break
    if l[0] == L:
        l = l[1:].split()
        buffer.append({
            a: float(l[0]),
            b: int(l[1]),
            c: l[2],
            d: l[3],
            e: int(l[4]),
            f: l[5],
            g: l[6],
            h: l[7],
            i: l[8],
            j: l[9]
        })
    if len(buffer) == 1000:  # batch the inserts, 1000 documents at a time
        logs.insert(buffer)
        buffer = []

if buffer:  # flush whatever is left in the buffer before shutting down
    logs.insert(buffer)
connection.disconnect()
Upvotes: 0
Views: 257
Reputation: 48330
The CPU usage comes from that active while True loop. How many lines per minute are you getting? Put the
if len(buffer) == 1000:
    logs.insert(buffer)
    buffer = []
check right after the buffer.append call (see the sketch below), so it only runs when an entry has actually been added.
I can tell you more once you report how many insertions you are getting so far.
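A minimal sketch of that rearrangement, reusing the variable names from the question (this is the suggested change, not tested against your data):

while True:
    l = stdin.readline()
    if not l:
        break
    if l[0] == L:
        l = l[1:].split()
        buffer.append({a: float(l[0]), b: int(l[1]), c: l[2], d: l[3],
                       e: int(l[4]), f: l[5], g: l[6], h: l[7],
                       i: l[8], j: l[9]})
        # The length check now runs only on iterations that actually
        # appended an entry, not on every line read from stdin.
        if len(buffer) == 1000:
            logs.insert(buffer)
            buffer = []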
Upvotes: 0
Reputation: 168
This might be a better question for a Python profiler. There are a few built-in Python profiling modules, such as cProfile; you can read more about it in the Python standard library documentation.
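For instance, assuming you wrap the daemon's read/parse/insert loop in a function (the name main here is just a placeholder), cProfile will show where the time goes:

import cProfile

def main():
    pass  # put the read/parse/insert loop from the question here

# 'cumulative' sorting shows which call tree (reading, parsing, or
# inserting) accounts for most of the run time
cProfile.run('main()', sort='cumulative')

Alternatively, run the whole script under the profiler without changing it: python -m cProfile -s cumulative yourscript.py < access.log (the script and log file names are placeholders).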
Upvotes: 1
Reputation: 34587
I'd suspect it might actually be readline() causing the CPU utilization. Try running the same code with readline() replaced by a constant buffer that you provide, and try running it with the database inserts commented out. Establish which one of these is the culprit.
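A rough sketch of both experiments in one harness (the sample log line and iteration count are made up; substitute a real line from your log):

# Variant 1: a constant line stands in for stdin.readline(), taking
# input I/O out of the picture.
SAMPLE = 'L1299274768.123 87 10.0.0.1 TCP_MISS 2048 GET http://example.com/ bob 192.0.2.10 text/html\n'

buffer = []
for _ in range(100000):
    l = SAMPLE
    if l[0] == 'L':
        fields = l[1:].split()
        buffer.append({'timestamp': float(fields[0]),
                       'reply_size': int(fields[4])})
        if len(buffer) == 1000:
            # logs.insert(buffer)  # Variant 2: leave this commented out
            buffer = []            # to remove MongoDB from the picture

If CPU stays high with the constant line, parsing or inserting is the cost; if it drops, the readline() loop is the culprit.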
Upvotes: 0