Reputation: 2270
The DPKT library says it supports Python 3 now, but it behaves differently under Python 2.x and 3.x, and both results appear to be incorrect.
For example, in Python 2.x, the example given here
with open('test.pcap') as f:
    pcap = dpkt.pcap.Reader(f)
    for ts, buf in pcap:
        eth = dpkt.ethernet.Ethernet(buf)
        print eth
prints output in a format that I don't expect, similar to:
^����6#���l�m�
Q!6�(�����k����~�pO���o���N�l �k4�'���8�9�j��@mf���5��pB�6bٌ�~p��Jf.Jܼ3H�:�ݭ�k-O7+�O��
4�(�9��^F�fb��V��t˜������\�X1��#�.�ج<�Q�!����>�^ɹDĀ�orC=bC���S�6;��SR�`�� �
ZD����j2Q���m����h��)1@��1���aw}�d�ڧn� ��
0Z:�`8ຄE(�@4���}������Mu��63fP�/�
������h'7�h'7�;������
However, in Python 3, I'm forced to open the pcap file in 'rb' mode, which is fine, except for the output issues (I'm not sure 'rb' has anything to do with the issues now):
with open('test.pcap', 'rb') as f:
    pcap = dpkt.pcap.Reader(f)
    for ts, buf in pcap:
        eth = dpkt.ethernet.Ethernet(buf)
        print eth
This now returns what I believe is a bytestring, and I haven't found a way to get the data I need out of it. For example, if I needed the number of flags, I could easily get 17 from the example on their site, but I can't seem to get their example to work at all:
b'\x00\x0f\x1f\x16\xd1\xcd\x00\xc0\xf0y\x9a\xfd\x08\x00E\x00\x00\x1c\xb1\xce\x00\x006\x01N\xf7\xc0\xa8\x01d\xc0\xa8\x01g\x08\x00\xd9\xd7\xb7\xc4fc'
I haven't had any luck converting this string into a human-readable object. No combination of decode, binascii, or anything else I've tried has worked. Am I using this library incorrectly?
Upvotes: 0
Views: 1455
Reputation: 495
Try opening the pcap file in binary mode: with open('test.pcap', 'rb') as f:
Upvotes: 0
Reputation: 12347
One of the major differences between python2 and python3 is that in python3, str and bytes are no longer the same. Compare:
$ python2 -c 'print(b"foo" == "foo")'
True
$ python3 -c 'print(b"foo" == "foo")'
False
This explains why you must open the file with "rb" in python3. (Although it's quite likely that you would get bogus results on some platforms with python2 if you didn't, because without the b, byte sequences in the file that happen to look like line endings may get translated inappropriately.)
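As a rough illustration of why text mode does not work for capture files in python3, here is a minimal sketch using only the standard library. It writes the four magic bytes of a little-endian pcap file to a temporary file and reads them back in both modes. The explicit utf-8 encoding is an assumption made so the result does not depend on the platform's default encoding.

```python
import os
import tempfile

# First four bytes of a little-endian pcap file (magic number 0xa1b2c3d4).
magic = b"\xd4\xc3\xb2\xa1"

# Write the bytes to a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(magic)
    path = tmp.name

try:
    # Binary mode: the data comes back unchanged, as a bytes object.
    with open(path, "rb") as f:
        raw = f.read()

    # Text mode: python3 tries to decode the bytes into str, which fails
    # here because these bytes are not valid UTF-8.
    try:
        with open(path, "r", encoding="utf-8") as f:
            f.read()
        text_mode_ok = True
    except UnicodeDecodeError:
        text_mode_ok = False
finally:
    os.remove(path)

print(type(raw), raw == magic)
print("text mode succeeded:", text_mode_ok)
```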
Another difference: in python3, print is a function, not a statement, so the code you've shown above for python3 is actually a syntax error. Instead you need print(eth).
To answer your actual question: When you simply print eth, you are implicitly asking the eth object to make itself printable. That is the same as calling print(str(eth)), and so it's giving you a printable string version of the binary data buffer that contains the ethernet frame.
You need to use the facilities of dpkt to discover, then dissect the parts of the frame that you care about.
Here's a short example that decodes a pcap containing DNS packets:
import dpkt

with open("/tmp/dns.pcap", "rb") as f:
    pcap = dpkt.pcap.Reader(f)
    for ts, buf in pcap:
        l2 = dpkt.ethernet.Ethernet(buf)
        print("Ethernet (L2) frame:", repr(l2))
        if l2.type not in (dpkt.ethernet.ETH_TYPE_IP, dpkt.ethernet.ETH_TYPE_IP6):
            print("Not an IP packet")
            continue
        l3 = l2.data
        print("IP packet:", repr(l3))
        if l3.p not in (dpkt.ip.IP_PROTO_TCP, dpkt.ip.IP_PROTO_UDP):
            print("Not TCP or UDP")
            continue
        l4 = l3.data
        print("Layer 4:", repr(l4))
        if l4.dport in (53, 5353) or l4.sport in (53, 5353):
            dns = l4.data
            if not isinstance(dns, dpkt.dns.DNS):
                dns = dpkt.dns.DNS(dns)
            print("DNS packet:", repr(dns))
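To see what that kind of dissection amounts to, here is a minimal sketch that unpacks the bytestring from the question by hand, using only the standard library struct and socket modules rather than dpkt. It assumes Python 3 and that the frame is Ethernet carrying IPv4 and ICMP, which is what the bytes suggest. The helper mac and the variable names are made up for illustration.

```python
import socket
import struct

# The bytestring printed in the question: one raw Ethernet frame.
frame = b'\x00\x0f\x1f\x16\xd1\xcd\x00\xc0\xf0y\x9a\xfd\x08\x00E\x00\x00\x1c\xb1\xce\x00\x006\x01N\xf7\xc0\xa8\x01d\xc0\xa8\x01g\x08\x00\xd9\xd7\xb7\xc4fc'


def mac(raw):
    """Format 6 raw bytes as a colon-separated MAC address (illustrative helper)."""
    return ":".join("%02x" % b for b in raw)


# Ethernet header: 6-byte destination, 6-byte source, 2-byte EtherType.
eth_dst, eth_src, eth_type = struct.unpack("!6s6sH", frame[:14])

# Fixed 20-byte part of the IPv4 header, right after the Ethernet header.
(ver_ihl, tos, total_len, ident, frag, ttl,
 proto, ip_cksum, ip_src, ip_dst) = struct.unpack("!BBHHHBBH4s4s", frame[14:34])
ihl = (ver_ihl & 0x0F) * 4  # IPv4 header length in bytes

# ICMP echo header: type, code, checksum, identifier, sequence number.
icmp_start = 14 + ihl
icmp_type, icmp_code, icmp_cksum, icmp_id, icmp_seq = struct.unpack(
    "!BBHHH", frame[icmp_start:icmp_start + 8])

print("Ethernet:", mac(eth_src), "->", mac(eth_dst), "type", hex(eth_type))
print("IPv4:", socket.inet_ntoa(ip_src), "->", socket.inet_ntoa(ip_dst),
      "ttl", ttl, "proto", proto)
print("ICMP: type", icmp_type, "code", icmp_code, "seq", icmp_seq)
```

With dpkt, the same fields should be reachable as attributes of the parsed objects, in the way the DNS example reads l2.type and l3.p.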
As for why your output looks different from the tutorial: the tutorial is out of date. Apparently at some point, the implementation of the __str__ magic method on the dpkt objects changed (when you just print an object, you get the result of its __str__ method).
Originally, __str__ returned a formatted representation of the object. Later it was changed to return a string representation of the raw bytes of the object. So now you need to call repr(obj) in order to get the formatted representation.
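The difference between the two methods can be sketched with a toy class. ToyHeader below is hypothetical and is not dpkt code; it only mimics the behavior described above, where __str__ hands back the raw bytes rendered as a string and __repr__ hands back the formatted field view. It assumes Python 3.

```python
class ToyHeader:
    """Hypothetical stand-in for a parsed packet header (not part of dpkt)."""

    def __init__(self, type_, payload):
        self.type = type_
        self.payload = payload

    def __bytes__(self):
        # Serialize as a 2-byte big-endian type followed by the payload.
        return self.type.to_bytes(2, "big") + self.payload

    def __str__(self):
        # Mimics the newer behavior: a string rendering of the raw bytes.
        return str(bytes(self))

    def __repr__(self):
        # Mimics the formatted, field-by-field representation.
        return "ToyHeader(type=%d, payload=%r)" % (self.type, self.payload)


hdr = ToyHeader(2048, b"\x01\x02")
as_str = str(hdr)    # what print(hdr) shows
as_repr = repr(hdr)  # what print(repr(hdr)) shows

print(hdr)
print(repr(hdr))
```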
Upvotes: 3