CheckYourSec

Reputation: 143

Regex not working in curl output

I'm using curl to send a POST request from a Debian Linux terminal, and it works properly. This is the curl command:

curl --data "ping=8.8.8.8" -s http://www.ipvoid.com/ping/

Now I want to capture the content between the <textarea> tags by running this command:

curl --data "ping=8.8.8.8" -s http://www.ipvoid.com/ping/ | grep -ioE '<textarea.*>(.*(\n.*)*)<\/textarea>' 

But it returns nothing. I tested the regex and it works properly:

regex101.com

Is the problem with the regex or with the grep syntax?

Upvotes: 1

Views: 725

Answers (2)

hanshenrik

Reputation: 21463

By default, grep matches the input one line at a time, and your textarea contains newlines, so your regex never matches. However, you can (ab)use the --null-data option: grep then splits the input on NUL bytes instead of newlines, and since there are no NUL bytes in your textarea, it works!

curl --data "ping=8.8.8.8" -s http://www.ipvoid.com/ping/ | grep -ioE '<textarea.*>(.*(\n.*)*)<\/textarea>' --null-data
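
The difference between the two modes can be seen without touching the network. The following is only a minimal sketch, assuming GNU grep; the sample variable is a made-up stand-in for the page's HTML, not the real ipvoid.com response, and the names sample, line_mode and null_mode are purely illustrative.

```shell
# Hypothetical two-line sample standing in for the real HTML response.
sample='<textarea rows="5">PING 8.8.8.8
64 bytes from 8.8.8.8</textarea>'

# Default mode: every input line is matched on its own, so a pattern that
# has to span the newline can never match. grep -c prints 0 and exits 1,
# hence the "|| true".
line_mode=$(printf '%s\n' "$sample" | grep -c 'PING 8.8.8.8[[:space:]]64 bytes' || true)

# --null-data (-z): records end at NUL bytes, so the whole input is one
# record and the newline is just another whitespace character.
null_mode=$(printf '%s\n' "$sample" | grep -zc 'PING 8.8.8.8[[:space:]]64 bytes' || true)

echo "line mode: $line_mode, null-data mode: $null_mode"
```

With GNU grep the first count should be 0 and the second 1, which is why the original command only starts matching once --null-data is added.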

(But I recommend using a proper HTML parser instead. The xmllint approach recommended by @RomanPerekhrest is probably a better solution, if it's available to you.)

Upvotes: 0

RomanPerekhrest

Reputation: 92854

Since the result of the HTTP request is an HTML document, the right way is to use an XML/HTML parser.

xmllint is one such tool:

curl -d "ping=8.8.8.8" -s http://www.ipvoid.com/ping/ \
| xmllint --html --xpath '//textarea/text()' - 2>/dev/null

The output:

PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=61 time=1.12 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=61 time=1.05 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=61 time=1.14 ms

--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 1.052/1.107/1.144/0.039 ms

http://xmlsoft.org/xmllint.html

Upvotes: 2
