coelhudo
coelhudo

Reputation: 5080

How can I filter a pcap file by specific protocol using python?

I have some pcap files and I want to filter by protocol, i.e., if I want to filter by HTTP protocol, anything but HTTP packets will remain in the pcap file.

There is a tool called openDPI, and it's perfect for what I need, but there is no wrapper for python language.

Does anyone knows any python modules that can do what I need?

Thanks

Edit 1:

HTTP filtering was just an example, there is a lot of protocols that I want to filter.

Edit 2:

I tried Scapy, but I don't figure how to filter correctly. The filter only accepts Berkeley Packet Filter expression, i.e., I can't apply a msn, or HTTP, or another specific filter from upper layer. Can anyone help me?

Upvotes: 12

Views: 56426

Answers (8)

InvisibleWolf
InvisibleWolf

Reputation: 1017

Here is my example of pcap parsing using scapy. It also has some relevant code for performance testing and some other stuff.

Upvotes: 0

Yasser Arafat
Yasser Arafat

Reputation: 61

sniff supports a offline option wherein you can provide the pcap file as input. This way you can use the filtering advantages of sniff command on pcap file.

>>> packets = sniff(offline='mypackets.pcap')
>>>
>>> packets
<Sniffed: TCP:17 UDP:0 ICMP:0 Other:0>

Hope that helps !

Upvotes: 6

Abhinav
Abhinav

Reputation: 1042

I have tried the same using @nmichaels method, but it becomes cumbersome when I want to iterate it over multiple protocols. I tried finding ways to read the .pcap file and then filter it but found no help. Basically, when one reads a .pcap file there is no function in Scapy which allows to filter these packets, on the other hand using a command like,

a=sniff(filter="tcp and ( port 25 or port 110 )",prn=lambda x: x.sprintf("%IP.src%:%TCP.sport% -> %IP.dst%:%TCP.dport%  %2s,TCP.flags% : %TCP.payload%"))

helps to filter but only while sniffing.

If anyone knows of any other method where we can use a BPF syntax instead of the for statement?

Upvotes: 2

J.J.
J.J.

Reputation: 5069

I know this is a super-old question, but I just ran across it thought I'd provide my answer. This is a problem I've encountered several times over the years, and I keep finding myself falling back to dpkt. Originally from the very capable dugsong, dpkt is primarily a packet creation/parsing library. I get the sense the pcap parsing was an afterthought, but it turns out to be a very useful one, because parsing pcaps, IP, TCP and and TCP headers is straightforward. It's parsing all the higher-level protocols that becomes the time sink! (I wrote my own python pcap parsing library before finding dpkt)

The documentation on using the pcap parsing functionality is a little thin. Here's an example from my files:

import socket
import dpkt
import sys
pcapReader = dpkt.pcap.Reader(file(sys.argv[1], "rb"))
for ts, data in pcapReader:
    ether = dpkt.ethernet.Ethernet(data)
    if ether.type != dpkt.ethernet.ETH_TYPE_IP: raise
    ip = ether.data
    src = socket.inet_ntoa(ip.src)
    dst = socket.inet_ntoa(ip.dst)
    print "%s -> %s" % (src, dst)

Hope this helps the next guy to run across this post!

Upvotes: 14

vinit
vinit

Reputation: 21

to filter in/out a specific protocol you have to do a per packet analysis otherwise you could miss some http traffic on a non-conventional port that is flowing in your network. of course if you want a loose system, you could check just for source and destination port number but that wont give you exact results. you have to look for specific feature of a protocol like GET, POST, HEAD etc keywords for HTTP and others for other protocol and check each TCP packets.

Upvotes: 2

nmichaels
nmichaels

Reputation: 50943

A quick example using Scapy, since I just wrote one:

pkts = rdpcap('packets.pcap')
ports = [80, 25]
filtered = (pkt for pkt in pkts if
    TCP in pkt and
    (pkt[TCP].sport in ports or pkt[TCP].dport in ports))
wrpcap('filtered.pcap', filtered)

That will filter out packets that are neither HTTP nor SMTP. If you want all the packets but HTTP and SMTP, the third line should be:

filtered = (pkt for pkt in pkts if
    not (TCP in pkt and
    (pkt[TCP].sport in ports or pkt[TCP].dport in ports)))
wrpcap('filtered.pcap', filtered)

Upvotes: 20

fraca7
fraca7

Reputation: 1188

Something along the lines of

from pcapy import open_offline
from impacket.ImpactDecoder import EthDecoder
from impacket.ImpactPacket import IP, TCP, UDP, ICMP

decoder = EthDecoder()

def callback(jdr, data):
    packet = decoder.decode(data)
    child = packet.child()
    if isinstance(child, IP):
        child = packet.child()
        if isinstance(child, TCP):
            if child.get_th_dport() == 80:
                print 'HTTP'

pcap = open_offline('net.cap')
pcap.loop(0, callback)

using

http://oss.coresecurity.com/projects/impacket.html

Upvotes: 4

Dave Bacher
Dave Bacher

Reputation: 15972

Try pylibpcap.

Upvotes: 3

Related Questions