Reputation: 307
I'm developing a client for a scientific measurement device that is connected to a PC by 1 Gb Ethernet.
The test PC has an i5-460M CPU (2 cores at 2.53 GHz) and 8 GB of RAM. The OS is Windows 7 x64 (it can't be changed to Linux), with Python 2.7.6 x86.
The device sends data in UDP packets with the following format:
uint meas_id;
uint part_id;
ubyte data[1428];
The data rate is 1 Gb/s (around 70,000 packets per second).
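For reference, the packet layout above can be unpacked with the standard `struct` module. This is only a minimal sketch: the little-endian byte order is an assumption (the question does not state it), and the names `HEADER`, `DATA_SIZE`, `PACKET_SIZE` and `parse_packet` are made up for illustration. It also works out the payload rate: 70,000 packets of 1436 bytes is about 100 MB/s, or roughly 0.8 Gb/s.

```python
import struct

# Assumed layout: two little-endian 32-bit unsigned ints, then 1428 data bytes.
HEADER = struct.Struct("<II")
DATA_SIZE = 1428
PACKET_SIZE = HEADER.size + DATA_SIZE  # 4 + 4 + 1428 = 1436 bytes


def parse_packet(packet):
    """Split one datagram into (meas_id, part_id, data)."""
    if len(packet) != PACKET_SIZE:
        raise ValueError("unexpected packet size: %d" % len(packet))
    meas_id, part_id = HEADER.unpack_from(packet, 0)
    return meas_id, part_id, packet[HEADER.size:]


# Payload bytes per second that the receiver and the disk have to sustain
# at 70,000 packets per second.
bytes_per_second = 70000 * PACKET_SIZE
```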
I need to receive the data and dump it to disk (for around 10 minutes) for later processing, but I've run into two problems: packet drops (while transferring data between threads) and HDD usage.
The current structure is two worker processes:
Using a raw Python socket I can receive around 110k pps on my machine without packet drops, just with:
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Request a 256 MB receive buffer; the real buffer is smaller
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1024*1024*256)
s.bind(("0.0.0.0", 8201))
while is_active:
    ...
    data = s.recv(1536)
But some packets get dropped when I try to send the data to another process using code like this:
data_buf = []
while 1:
    d = s.recv(1536)
    data_buf.append(d)
    if len(data_buf) == CHUNK_SIZE:
        xchg_queue.put(data_buf)
        data_buf = []
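One thing that might reduce the per-packet overhead of the loop above is to receive straight into a preallocated buffer with `socket.recv_into` and hand over each chunk as a single bytes object, so the queue serializes one large string instead of a list of `CHUNK_SIZE` small ones. This is an untested sketch, not a measured fix: `receive_chunk` is a hypothetical helper, and it assumes every datagram is exactly 1436 bytes (4 + 4 + 1428).

```python
import socket

PACKET_SIZE = 1436  # 4 + 4 + 1428 bytes, from the packet format above


def receive_chunk(sock, n_packets, packet_size=PACKET_SIZE):
    """Receive n_packets datagrams into one contiguous buffer.

    Assumes every datagram is exactly packet_size bytes long.
    """
    buf = bytearray(n_packets * packet_size)
    view = memoryview(buf)
    for i in range(n_packets):
        offset = i * packet_size
        # recv_into writes into the slice directly, without creating
        # a new string object for every packet.
        received = sock.recv_into(view[offset:offset + packet_size])
        if received != packet_size:
            raise ValueError("short datagram: %d bytes" % received)
    return bytes(buf)
```

With this helper the loop body would shrink to something like `xchg_queue.put(receive_chunk(s, CHUNK_SIZE))`, and the consumer can slice the chunk back into 1436-byte records.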
A Pipe is faster, but as far as I can see, pipe.send() may block if there are already objects in the pipe.
Are there faster ways to send data between processes?
I've tried MySQL as storage, with indexes disabled and delayed writes enabled, but only got a saving rate of around 30-35k packets per second.
With cPickle I got 40-50k pps when saving 1,000 to 100,000 packets per file.
Is there a much faster way to save the data? Maybe PyTables (HDF5) or some fast NoSQL DB (Redis-like).
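Because every packet has the same fixed size, one option to compare against cPickle is to skip serialization entirely and append the raw bytes to a single file opened with a large write buffer. The sketch below is an illustration only: `append_chunks` and `iter_packets` are hypothetical names, the 4 MB buffer size is arbitrary, and whether the disk keeps up with roughly 100 MB/s still has to be measured on the real machine.

```python
import io

PACKET_SIZE = 1436  # fixed record size: 4 + 4 + 1428 bytes


def append_chunks(path, chunks, buffer_size=4 * 1024 * 1024):
    """Append already-received byte strings to one file, unmodified."""
    with io.open(path, "ab", buffering=buffer_size) as f:
        for chunk in chunks:
            f.write(chunk)


def iter_packets(path, packet_size=PACKET_SIZE):
    """Yield fixed-size records back from the dump for later processing."""
    with io.open(path, "rb") as f:
        while True:
            record = f.read(packet_size)
            if len(record) < packet_size:
                break  # end of file (or a truncated last record)
            yield record
```

The writer only ever appends, so the access pattern on the HDD stays sequential, and the reader recovers packet boundaries from the fixed record size alone.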
Also, I'm not sure this client is feasible in Python; maybe it's necessary to rewrite the module in pure C.
Or maybe there is a fast wrapper around Python sockets (like gevent)?
I hope you can help.
Upvotes: 3
Views: 2586
Reputation: 123521
If you just need to save the data for future processing, I would avoid the overhead of Python and a database and instead use tshark or windump to save the data as fast as possible, with the least overhead, into a single file. This is also the cheapest option for the HDD because you only append to the file. Later you could use Python with WinPcap or other tools to process the data, without the pressure of losing any of it, and write it out in the format you need.
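The later processing step could even be done in pure Python without a capture library. The sketch below is a rough illustration under several assumptions: the capture is in the classic pcap format with microsecond timestamps (not pcapng; for example a file written by WinDump with `-w`, or by tshark with `-F pcap`), the link type is Ethernet, and the packets are unfragmented IPv4. `iter_udp_payloads` is a hypothetical helper name.

```python
import struct


def iter_udp_payloads(f):
    """Yield UDP payloads from a classic pcap stream opened in binary mode.

    Assumes Ethernet framing and unfragmented IPv4; other frames are skipped.
    """
    global_header = f.read(24)
    magic = global_header[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # capture written on a little-endian machine
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    record_header = struct.Struct(endian + "IIII")
    while True:
        raw = f.read(record_header.size)
        if len(raw) < record_header.size:
            break
        _ts_sec, _ts_usec, incl_len, _orig_len = record_header.unpack(raw)
        frame = f.read(incl_len)
        if len(frame) < incl_len:
            break  # truncated capture
        # Ethernet header is 14 bytes; EtherType 0x0800 means IPv4.
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue
        # Low nibble of the first IP byte is the header length in 32-bit words.
        ip_header_len = (bytearray(frame[14:15])[0] & 0x0F) * 4
        if bytearray(frame[23:24])[0] != 17:  # IP protocol 17 is UDP
            continue
        udp_start = 14 + ip_header_len
        if len(frame) < udp_start + 8:
            continue
        # UDP length field covers the 8-byte UDP header plus the payload.
        udp_len = struct.unpack_from(">H", frame, udp_start + 4)[0]
        yield frame[udp_start + 8:udp_start + udp_len]
```

Each yielded payload would then be one 1436-byte device packet, ready to be split into `meas_id`, `part_id` and `data`.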
Upvotes: 0