Reputation: 313
I'm trying to understand UDP hole punching and I just don't quite get it. In concept it seems simple, but when I put it into practice I can't pull it off. From what I understand, there's a public server we call the hole-punch server. A client makes a request to the hole-punch server. The hole-punch server spits out the public IP and port of the client that just made the request. So long as that port is open, can essentially any random client make a request to that client using that specific port and IP?
The issue I guess I'm having is this: the client is able to make a request to the server, and the server is able to send data back to the client on that public port and IP. However, when another client tries to send a request to that client using that same port and IP, it just doesn't go through, and that's what's confusing me. If the server can make the request, why can't another random client make that request?
Upvotes: 8
Views: 10929
Reputation: 19431
TL;DR: from an application development perspective, you first send a UDP packet to a server. This server then publishes the IP:port pairs it saw to the other peers. Then the application can start sending UDP packets to the learned tuples to contact them. The application must be ready to handle incoming packets from peers that weren't on the list reported by the server, which can happen because of symmetric NAT. Because of stateful firewalls, packets may be dropped until the other peer initiates a connection too. If both users are behind symmetric NATs, no direct communication between them is possible.
So in your particular case the other peer cannot send you packets because you are behind either a symmetric NAT or a stateful firewall.
Longer answer:
Let me explain it a bit differently than usual.
To avoid juggling many example IP address and port pairs and the visual clutter that comes with them, I'll refer to an IP:port tuple as an endpoint. The endpoints of the same host (same IP, different port) are simply referred to as Host1, Host2, etc.
When the host (let's call it Local) sends a packet to a computer on the internet (let's call it Remote), the packet will contain the endpoints Local1 → Remote1.
In a typical home networking scenario, there is a LAN that uses private IP addresses. So Local1 has a private IP that is not routable/reachable from the internet. The only internet-reachable IP address is given to the home router (let's call it Router). This router has to rewrite the source endpoint of the packet into one of its own in order to make it a valid packet, so the endpoints in the packet that goes out to the internet will say: Router1 → Remote1. The router also keeps track of the mapping (Local1, Router1). So when the remote computer answers and the router sees the response packet with the endpoints Remote1 → Router1 in the header, it can translate it back to Remote1 → Local1 and send it on to Local, which receives the answer. When the application sends another packet from the Local host using the Local1 endpoint, its source will again be translated to Router1 and the cycle starts again. This is called network address and port translation, or NAPT, but it is usually just referred to as NAT.
If an unsolicited packet arrives at the router at a different endpoint, let's call it Router2, and the router does not have a mapping for it, the packet is dropped. The user can also configure persistent mappings, colloquially called "opening ports" or "port forwarding". The user can also designate a computer to which all unsolicited packets are forwarded, which is called a "DMZ".
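The translation and drop behaviour described above can be sketched as a toy model. This is not how any real router is implemented; the class name `FullConeNat`, the starting port 40000, and the example addresses are all made up for illustration.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP, port)


class FullConeNat:
    """Toy NAPT router: one public endpoint per LAN endpoint, whatever the destination."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self.next_port = first_port
        self.out: Dict[Endpoint, Endpoint] = {}   # LAN endpoint -> router endpoint
        self.back: Dict[Endpoint, Endpoint] = {}  # router endpoint -> LAN endpoint

    def outbound(self, src: Endpoint, dst: Endpoint) -> Tuple[Endpoint, Endpoint]:
        """Rewrite the source of a packet leaving the LAN, creating a mapping if needed."""
        if src not in self.out:
            public = (self.public_ip, self.next_port)
            self.next_port += 1
            self.out[src] = public
            self.back[public] = src
        return self.out[src], dst

    def inbound(self, src: Endpoint, dst: Endpoint) -> Optional[Tuple[Endpoint, Endpoint]]:
        """Rewrite the destination of a packet from the internet, or drop it (None)."""
        lan = self.back.get(dst)
        if lan is None:
            return None  # unsolicited: no mapping for this router endpoint
        return src, lan


router = FullConeNat("203.0.113.1")
local1 = ("192.168.1.10", 5000)
remote1 = ("198.51.100.7", 9000)

router1, _ = router.outbound(local1, remote1)           # Local1 -> Remote1 becomes Router1 -> Remote1
answer = router.inbound(remote1, router1)               # Remote1 -> Router1 becomes Remote1 -> Local1
dropped = router.inbound(remote1, ("203.0.113.1", 1))   # no mapping for this endpoint, so dropped
```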
Since the Router1 endpoint is created when the first packet is sent, the application has no way to know in advance what endpoint the router will assign to it. So the application on the LAN needs to send a packet to a server that can echo the endpoint back to the application and to the other participants that want to connect. Metaphorically, the first packet "punches a hole" so the response can come in.
Let's assume we have two users Alice and Bob, both of them have their own local networks and routers. Let's assume they use Server to learn each other's endpoints.
Alice first sends a packet with the endpoints Alice1 → Server1. Alice's router translates this to AliceNat1 → Server1 and keeps the mapping (Alice1, AliceNat1). Bob does the same: he sends Bob1 → Server1, his router translates this to BobNat1 → Server1 and keeps track of the mapping (Bob1, BobNat1). The server sees the endpoints AliceNat1 and BobNat1 and sends each of them to the other peer, so their applications can learn them. Now Alice can send Alice1 → BobNat1, and her router will use the mapping to translate this to AliceNat1 → BobNat1. Bob's router receives this and translates it to AliceNat1 → Bob1, and then Bob's application receives the packet. Bob responds with a packet that has the endpoints Bob1 → AliceNat1. His router translates it to BobNat1 → AliceNat1, Alice's router translates it back to BobNat1 → Alice1, and Alice receives the response.
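From the application's side, the exchange above could look roughly like the following sketch. It runs everything on the loopback interface, so no NAT is actually involved; the wire format (a plain `ip:port` string), the `register` payload, and the function names are assumptions made up for this example.

```python
import socket
import threading


def rendezvous_server(sock: socket.socket) -> None:
    """Wait for two peers, then tell each one the endpoint the other was seen at."""
    peers = []
    while len(peers) < 2:
        _, addr = sock.recvfrom(1024)  # addr is the (possibly NAT-translated) source
        if addr not in peers:
            peers.append(addr)
    a, b = peers
    sock.sendto(f"{b[0]}:{b[1]}".encode(), a)
    sock.sendto(f"{a[0]}:{a[1]}".encode(), b)


def parse_endpoint(data: bytes):
    host, port = data.decode().rsplit(":", 1)
    return host, int(port)


def run_demo():
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server_addr = server.getsockname()
    worker = threading.Thread(target=rendezvous_server, args=(server,), daemon=True)
    worker.start()

    alice = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    bob = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for peer in (alice, bob):
        peer.settimeout(2.0)
        # Behind a NAT, this first packet is what creates the mapping.
        peer.sendto(b"register", server_addr)

    bob_endpoint = parse_endpoint(alice.recvfrom(1024)[0])  # Alice learns "BobNat1"
    parse_endpoint(bob.recvfrom(1024)[0])                   # Bob learns "AliceNat1"

    alice.sendto(b"hello from alice", bob_endpoint)  # same socket that talked to the server
    data, src = bob.recvfrom(1024)
    bob.sendto(b"hello from bob", src)               # reply to wherever the packet came from
    reply, _ = alice.recvfrom(1024)

    worker.join(timeout=2.0)
    for s in (alice, bob, server):
        s.close()
    return data, reply
```

Note that each peer keeps using the same socket for the server and for the other peer. Opening a new socket would give a new local endpoint, and the router mapping learned by the server would no longer apply.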
Now keep in mind that Alice's and Bob's applications don't need to know or care that such translation takes place. If Bob's computer were directly on the internet, the packet header would keep Bob1 as its source, so that is what the server would see and report to Alice to connect to. So no extra code needs to be written to implement the hole punching; it happens implicitly by simply connecting to a server that reports back the peer endpoints it sees.
All of the above works only if the router's tracking is based on the LAN host's endpoint alone and not on the destination. This kind of NAT is called "full cone" NAT.
In large LANs where the router implements NAT, the router can easily run out of ports when many users use the internet. To alleviate this, the mapping also contains the destination, which allows the NAT to reuse the same endpoint for multiple outgoing connections, possibly for multiple different LAN endpoints. The carrier-grade NATs of ISPs tend to be of this type, because they allocate a small port range to each of their subscribers so they can identify which subscriber is communicating. If a NAT does this, then the method described above does not work. The Alice1 → Server1 packet would create the mapping (Alice1 → Server1, AliceNat1 → Server1) on the NAT, and only a packet with the endpoints Server1 → AliceNat1 would get translated back to Server1 → Alice1 and delivered to Alice. Bob's packet with BobNat1 → AliceNat1 would get discarded, because the router does not have an AliceNat1 → BobNat1 mapping. Even if Alice tries to send a packet with Alice1 → BobNat1, the router might create a different mapping, let's say (Alice1 → BobNat1, AliceNat2 → BobNat1), so packets with BobNat1 → AliceNat1 still cannot come in. A NAT that does this is called "symmetric NAT".
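To make the difference concrete, here is a toy model of the symmetric behaviour, a sketch only. For simplicity it hands out a fresh port for every new (source, destination) pair rather than reusing ports across destinations as a real carrier-grade NAT might; the class name and addresses are invented for the example.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP, port)


class SymmetricNat:
    """Toy symmetric NAT: the mapping is keyed on the destination as well as the source."""

    def __init__(self, public_ip: str, first_port: int = 50000) -> None:
        self.public_ip = public_ip
        self.next_port = first_port
        self.out: Dict[Tuple[Endpoint, Endpoint], Endpoint] = {}   # (LAN, remote) -> router endpoint
        self.back: Dict[Tuple[Endpoint, Endpoint], Endpoint] = {}  # (router endpoint, remote) -> LAN

    def outbound(self, src: Endpoint, dst: Endpoint) -> Tuple[Endpoint, Endpoint]:
        key = (src, dst)
        if key not in self.out:
            public = (self.public_ip, self.next_port)
            self.next_port += 1
            self.out[key] = public
            self.back[(public, dst)] = src
        return self.out[key], dst

    def inbound(self, src: Endpoint, dst: Endpoint) -> Optional[Tuple[Endpoint, Endpoint]]:
        lan = self.back.get((dst, src))  # the sender must match the mapped destination
        if lan is None:
            return None
        return src, lan


nat = SymmetricNat("203.0.113.1")
alice1 = ("10.0.0.5", 5000)
server1 = ("198.51.100.7", 9000)
bob_nat1 = ("192.0.2.44", 41000)

alice_nat1, _ = nat.outbound(alice1, server1)   # mapping used towards the server
from_server = nat.inbound(server1, alice_nat1)  # delivered to Alice1
from_bob = nat.inbound(bob_nat1, alice_nat1)    # dropped: no AliceNat1 -> BobNat1 mapping
alice_nat2, _ = nat.outbound(alice1, bob_nat1)  # a different endpoint towards Bob
```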
Keep in mind that if Bob's NAT is full cone, his application would still receive the packet from the endpoint AliceNat2, because packets directed to the BobNat1 endpoint are routed to the Bob1 endpoint. So the application needs to be prepared to see packets from endpoints it doesn't know about, and it should use some kind of authentication to identify peers instead of relying on the endpoint reported by the server.
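One possible way to identify a peer by something other than its source endpoint is to tag every datagram with an HMAC. The sketch below assumes both peers already share a secret key, for example one handed out by the server over a secure channel; that key distribution step is not shown. The helper names are made up, and there is no replay protection.

```python
import hashlib
import hmac
import os
from typing import Optional

TAG_LEN = 32  # size of an HMAC-SHA256 digest in bytes


def sign(key: bytes, payload: bytes) -> bytes:
    """Prefix the payload with an HMAC-SHA256 tag computed with the shared key."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def verify(key: bytes, datagram: bytes) -> Optional[bytes]:
    """Return the payload if the tag matches, otherwise None."""
    tag, payload = datagram[:TAG_LEN], datagram[TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return payload if hmac.compare_digest(tag, expected) else None


shared_key = os.urandom(32)  # assumed to be known to both peers
datagram = sign(shared_key, b"hello from alice")

accepted = verify(shared_key, datagram)       # valid no matter which endpoint it came from
rejected = verify(os.urandom(32), datagram)   # a peer without the key is ignored
```

With this in place, the receiver can accept a valid datagram from AliceNat2 even though the server only reported AliceNat1, and then reply to whatever source endpoint the datagram actually arrived from.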
To summarize: users behind symmetric NATs cannot have incoming connections (except through forwarded ports), so hole punching doesn't work for them. Full cone NAT users can have incoming connections from peers once their public endpoint has been learned from a server. Users with routable IP addresses or forwarded ports can be contacted directly, because their endpoint (the IP and port to connect to) is known in advance.
But there are not only NATs. There are also stateful firewalls that keep track of connections like a symmetric NAT would, but they don't translate the source endpoint; they just pass the packet on. Let's assume this time that Alice and Bob are each behind a firewall like this (common with IPv6). Alice sends a packet to Bob, so Alice's firewall learns the connection (Alice1 → Bob1). Bob's firewall doesn't know about Alice1, so the packet is dropped. When Bob also initiates a connection in the other direction, his firewall learns about (Bob1 → Alice1), and since Alice's firewall already has the matching entry, the packet is received and communication can continue. Packets are discarded until the other side also initiates communication.
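The firewall case can be modelled the same way. This is a deliberately simplified sketch with invented names; real firewalls also expire entries after a timeout, which is left out here.

```python
from typing import Set, Tuple

Endpoint = Tuple[str, int]  # (IP, port)


class StatefulFirewall:
    """Toy stateful firewall: inbound packets pass only if they answer an outbound flow."""

    def __init__(self) -> None:
        self.flows: Set[Tuple[Endpoint, Endpoint]] = set()

    def outbound(self, src: Endpoint, dst: Endpoint) -> None:
        self.flows.add((src, dst))  # remember the flow, no address translation

    def inbound(self, src: Endpoint, dst: Endpoint) -> bool:
        return (dst, src) in self.flows


alice_fw, bob_fw = StatefulFirewall(), StatefulFirewall()
alice1 = ("2001:db8::a", 5000)
bob1 = ("2001:db8::b", 6000)

alice_fw.outbound(alice1, bob1)
first_try = bob_fw.inbound(alice1, bob1)      # False: Bob's firewall has no matching flow yet

bob_fw.outbound(bob1, alice1)
bob_to_alice = alice_fw.inbound(bob1, alice1)  # True: Alice already sent to Bob1

alice_fw.outbound(alice1, bob1)
second_try = bob_fw.inbound(alice1, bob1)     # True: Bob has now sent to Alice1
```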
Upvotes: 0
Reputation: 73081
The thing to know about UDP hole-punching is that many consumer-grade Internet routers/NAT-firewalls have a policy along the lines of "block any incoming UDP packets, except for UDP packets coming from an IP address that the user's local computer has recently sent a UDP packet to"; the idea being that if the local user is sending packets to a particular IP address, then the packets coming back from that same IP address are probably legitimate/desirable.
So in order to get UDP packets flowing between two firewalled/NAT'd computers, you have to get each of the two computers to first send a UDP packet to the other one; which is a bit of a chicken-and-egg problem since they can't know where to send the UDP packet without being able to communicate; the public server is what solves that problem. Since that server is public, both clients can communicate with the server (via UDP or TCP or HTTP or whatever), and that server can tell each client the IP address and port to send its UDP packets to. Once each client has sent some initial packets to the other, it should also (in most cases) then be able to receive UDP packets from the other client as well, at which point the server is no longer necessary as a go-between.
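Because the first packets in either direction may be dropped, a client typically has to keep sending for a short while. A minimal sketch of such a loop follows; the function name `punch`, the `b"punch"` payload, and the attempt count and interval are arbitrary choices for illustration.

```python
import socket
from typing import Optional, Tuple


def punch(sock: socket.socket, peer: Tuple[str, int],
          attempts: int = 10, interval: float = 0.2) -> Optional[Tuple[str, int]]:
    """Send probes to `peer` until any datagram arrives; return its source, or None."""
    sock.settimeout(interval)
    for _ in range(attempts):
        sock.sendto(b"punch", peer)  # opens or refreshes the mapping on our side
        try:
            _, src = sock.recvfrom(1024)
            return src  # may differ from `peer` if the other side is behind a symmetric NAT
        except socket.timeout:
            continue  # nothing yet, the other side may not have sent its probe
        except ConnectionResetError:
            continue  # some platforms report an ICMP "port unreachable" this way
    return None
```

Both clients would call `punch` on the same socket they used to talk to the server, passing the endpoint the server reported for the other side. In a real application the returned source should still be authenticated before it is trusted.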
Upvotes: 32