Hubert Hu

Why does pyshark's sniff_continuously() miss so many packets compared with the Wireshark GUI?

I want to write a function, running under async, that counts the total length of data sent to specific IP addresses. It turns out that the function only counts 50-60% of the total data. In case the problem came from async, I wrote a simplified test program to check whether sniff_continuously() itself works properly. The total it counts is also only 50-60% of the figure from Wireshark. I count the number of packets per IP, and also the sum of packet lengths per IP. Both differ from the Wireshark results. Since I know the exact size of the file I'm uploading for the test, the Wireshark results seem more correct because they are closer to the file size.

Here is the test file I wrote:

import pyshark


def live():
    # Capture on every interface, with no filter applied.
    capture = pyshark.LiveCapture(interface='any')
    total_data_sent_by_ip = {}  # destination IP -> sum of packet lengths
    ip_frequency_count = {}     # destination IP -> number of packets
    print("--------------------------------------------started!")
    for packet in capture.sniff_continuously():
        try:
            # Only count packets that have an IP layer.
            if hasattr(packet, 'ip'):
                dst = packet.ip.dst
                if dst in total_data_sent_by_ip:
                    total_data_sent_by_ip[dst] += int(packet.length)
                    ip_frequency_count[dst] += 1
                else:
                    total_data_sent_by_ip[dst] = int(packet.length)
                    ip_frequency_count[dst] = 1
        except AttributeError as e:
            print(f"Attribute error:{e}")
        except Exception as e:
            print(f"other error:{e}")

        # The result file is rewritten after every packet.
        with open("test_result.txt", "w") as file:
            file.write(str(total_data_sent_by_ip) + '\n' + str(ip_frequency_count))


live()
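
The bookkeeping in the loop can be checked on its own, without any live traffic. The following is a minimal sketch, assuming each packet has already been reduced to a hypothetical `(dst_ip, length)` pair (in the test file these would come from `packet.ip.dst` and `int(packet.length)`). The function name and sample values are made up for illustration. It does not explain the missing packets; it only helps rule out the counting logic as the source of the difference.

```python
from collections import defaultdict


def tally_by_destination(packets):
    """Accumulate total length and packet count per destination IP.

    `packets` is any iterable of (dst_ip, length) pairs.
    """
    total_bytes = defaultdict(int)
    packet_count = defaultdict(int)
    for dst, length in packets:
        total_bytes[dst] += length
        packet_count[dst] += 1
    return dict(total_bytes), dict(packet_count)


# Made-up sample input, not real capture data.
sample = [("10.0.0.9", 1514), ("10.0.0.9", 66), ("10.0.0.7", 60)]
totals, counts = tally_by_destination(sample)
print(totals)
print(counts)
```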

While my upload client was uploading a file, I ran this test file alongside Wireshark, with no filter applied. When the upload completed, I stopped both the Python script and Wireshark. I then used pandas to process the CSV exported from Wireshark, again summing the packet lengths for every IP.
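
The Wireshark-side aggregation amounts to a group-by on the destination column. The sketch below shows the same computation using only the standard-library `csv` module instead of pandas, so that it runs without extra dependencies. It assumes the column names of Wireshark's default CSV export (`Destination`, `Length`); the rows and the function name are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def summarize_by_destination(csv_file):
    """Return {destination: (packet_count, total_length)} from a Wireshark-style CSV."""
    totals = defaultdict(lambda: [0, 0])
    for row in csv.DictReader(csv_file):
        entry = totals[row["Destination"]]
        entry[0] += 1                   # one more packet for this destination
        entry[1] += int(row["Length"])  # add the frame length in bytes
    return {dst: tuple(values) for dst, values in totals.items()}


# Made-up rows in the shape of a default Wireshark CSV export.
sample_csv = io.StringIO(
    '"No.","Time","Source","Destination","Protocol","Length","Info"\n'
    '"1","0.000000","10.0.0.2","10.0.0.9","TCP","1514","data"\n'
    '"2","0.000120","10.0.0.2","10.0.0.9","TCP","1514","data"\n'
    '"3","0.000300","10.0.0.9","10.0.0.2","TCP","66","ack"\n'
)
summary = summarize_by_destination(sample_csv)
print(summary)
```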

I expected the results from the two tools to be nearly the same, allowing for the small gap between starting one program and the other. Instead there is a huge difference, not only for the upload destination IP but for other IPs as well.

For example, the Python test reported a total of 24,532 packets sent to destination A with a total length of 113Mb, whereas Wireshark reported 52,387 packets with 230Mb. The proportion of missing packets is not the same for each destination, and when I rerun both programs the proportions change again, so I cannot find any consistent pattern.

I want to know why so many packets go missing and how I can fix it. Any help would be appreciated. Thanks!
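
To put numbers on the destination A example, the ratios can be worked out as in the short sketch below. Only the four figures come from the runs described above; the variable names are illustrative.

```python
# Figures for destination A from the two tools.
pyshark_packets, wireshark_packets = 24_532, 52_387
pyshark_size, wireshark_size = 113, 230  # same unit as reported above

packet_ratio = pyshark_packets / wireshark_packets
size_ratio = pyshark_size / wireshark_size

print(f"packets seen by pyshark: {packet_ratio:.1%}")  # 46.8%
print(f"data seen by pyshark:    {size_ratio:.1%}")    # 49.1%
```

So in this run pyshark recorded a little under half of what Wireshark recorded for destination A, by both packet count and total length.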
