nodecentral

Reputation: 466

Lua - Extract IP address from any URL format

Is there a single string:match pattern (or an alternative Lua function) that always extracts the IP address, irrespective of the URL/IP address format provided?

Here is the furthest I've got so far, but it does not return the full IP address.

local s1 = "192.168.19.55"
local s2 = "http://192.168.19.55"
local s3 = "http://192.168.219.55:88"
local s4 = "http://192.168.19.55:88/index.html"
local s5 = "https://192.168.119.102/hello.php"
local s6 = "http://admin:[email protected]:88/hello.php"

local ip = s6:match(".+(%d+%.%d+%.%d+%.%d+)")
print(ip)

Upvotes: 1

Views: 542

Answers (2)

koyaanisqatsi

Reputation: 2793

You can put the inputs in a table and loop over it.
Example ( Tested in: https://www.lua.org/cgi-bin/demo )

local ips = {"192.168.19.55",
"http://192.168.19.55",
"http://192.168.219.55:88",
"http://192.168.19.55:88/index.html",
"https://192.168.119.102/hello.php",
"http://admin:[email protected]:88/hello.php"}

for k, v in ipairs(ips) do  -- ipairs guarantees the 1..n order shown below
  print(k, v:match("(.-%d+%.%d+%.%d+%.%d+%:-%d+)"))
end

...that puts out...

1   192.168.19.55
2   http://192.168.19.55
3   http://192.168.219.55:88
4   http://192.168.19.55:88
5   https://192.168.119.102
6   http://admin:[email protected]:88

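The pattern items .- and %:- both use the '-' repetition, which the manual describes as: "a single character class followed by '-', which also matches sequences of zero or more characters in the class." Unlike '*', it matches the shortest possible sequence.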
Source: https://www.lua.org/manual/5.4/manual.html#6.4.1
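
To make the difference between the lazy '-' and the greedy '*' concrete, here is a small sketch. The sample URL comes from the question; the variable names are only illustrative.

```lua
local url = "http://192.168.19.55:88/index.html"

-- '-' is lazy: it stops as soon as the rest of the pattern (one digit) can match
local lazy = url:match("(.-)%d")

-- '*' is greedy: it takes as much as possible while still leaving one digit
local greedy = url:match("(.*)%d")

print(lazy)    --> http://
print(greedy)  --> http://192.168.19.55:8
```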

Play around with pattern items and also test it with more text in front of the url.
Like...

local ips = {"GET 192.168.19.55",
"GET http://192.168.19.55",
"GET http://192.168.219.55:88",
"GET http://192.168.19.55:88/index.html",
"POST https://192.168.119.102/hello.php",
"POST http://admin:[email protected]:88/hello.php"}

for _, v in ipairs(ips) do  -- ipairs keeps the table order
  print(v:match("(%g-%d+%.%d+%.%d+%.%d+%:-%d+)"))
end

(%g requires Lua 5.2 or later. On Lua 5.1 use %w, which matches only alphanumeric characters.)

Upvotes: 0

Oleg V. Volkov

Reputation: 22421

Your example already pretty much works; you just don't need the leading .+, which is greedy and eats extra characters from the front of the IP address, leaving only a single digit for the first group.

local ip = s6:match("(%d+%.%d+%.%d+%.%d+)")
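
As a quick check of what the leading .+ does, the two patterns can be compared side by side. This is only a sketch using the s4 string from the question; the variable names are made up for illustration.

```lua
local s4 = "http://192.168.19.55:88/index.html"

-- Greedy .+ swallows everything it can, so the first %d+ is left with one digit
local with_prefix = s4:match(".+(%d+%.%d+%.%d+%.%d+)")

-- Without the prefix, the match starts at the first digit of the address
local without_prefix = s4:match("(%d+%.%d+%.%d+%.%d+)")

print(with_prefix)     --> 2.168.19.55
print(without_prefix)  --> 192.168.19.55
```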

Still, this pattern is pretty loose and will match many other groups of four numbers separated by dots. You might want to at least limit each group to 3 digits. Decide whether you need a more robust pattern based on how important the script you're writing is, and whether people will try to exploit it by throwing bad data at it.
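
One possible way to tighten it up is sketched below. The helper name extract_ipv4 is hypothetical, and so are the credentials in the last test string. It assumes IPv4 only, and it relies on the frontier pattern %f, which is documented since Lua 5.2. Each group is limited to 1-3 digits and then checked against the 0-255 range.

```lua
-- Hypothetical helper: returns the first dotted quad whose four groups are
-- each 1-3 digits and no larger than 255, or nil if there is none.
local function extract_ipv4(s)
  -- %f[%d] anchors at the start of a digit run, %f[%D] at the end of one,
  -- so "1234.1.1.1" or "1.2.3.4567" cannot match partially.
  local pattern = "%f[%d](%d%d?%d?)%.(%d%d?%d?)%.(%d%d?%d?)%.(%d%d?%d?)%f[%D]"
  for a, b, c, d in s:gmatch(pattern) do
    local ok = true
    for _, part in ipairs({a, b, c, d}) do
      if tonumber(part) > 255 then ok = false end
    end
    if ok then return table.concat({a, b, c, d}, ".") end
  end
  return nil
end

print(extract_ipv4("http://192.168.219.55:88"))           --> 192.168.219.55
print(extract_ipv4("https://192.168.119.102/hello.php"))  --> 192.168.119.102
print(extract_ipv4("http://999.1.1.1/"))                  --> nil
print(extract_ipv4("version 1.2.3.4567"))                 --> nil
print(extract_ipv4("http://admin:secret@192.168.19.55:88/hello.php"))  --> 192.168.19.55
```

This still accepts any dotted quad that appears in the input, for example inside a path, so it is a tighter filter rather than a full URL parser.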

Upvotes: 2
