Reputation: 143
I'm using curl to send a POST request in a Debian Linux terminal, and it works properly. This is the curl command:
curl --data "ping=8.8.8.8" -s http://www.ipvoid.com/ping/
Now I want to capture the content between the <textarea> tags by executing this command:
curl --data "ping=8.8.8.8" -s http://www.ipvoid.com/ping/ | grep -ioE '<textarea.*>(.*(\n.*)*)<\/textarea>'
But it returns nothing. I tested the regex and it works properly.
Is the problem with the regex or with the grep syntax?
Upvotes: 1
Views: 725
Reputation: 21463
By default, grep processes its input one line at a time, and your textarea content contains newlines, so your regex never sees the opening and closing tags together and cannot match. But you can (ab)use the --null-data parameter: grep then separates its input by NUL bytes instead of newlines, and since there are no NUL bytes in your textarea, it works!
curl --data "ping=8.8.8.8" -s http://www.ipvoid.com/ping/ | grep -ioE '<textarea.*>(.*(\n.*)*)<\/textarea>' --null-data
(But I recommend using a proper HTML parser instead. The xmllint approach recommended by @RomanPerekhrest would probably be a better solution, if it's available to you.)
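To see the difference without hitting the network, here is a minimal sketch that runs both modes against a local sample. The `html` variable and its contents are made up for illustration and only stand in for the page that curl returns; the pattern is a simplified version of the one in the question, and the behavior shown assumes GNU grep.

```shell
# Hypothetical sample standing in for the HTML returned by curl
html='<html><body><textarea rows="3">line one
line two
line three</textarea></body></html>'

# Default mode: grep tests each line separately, and no single line
# contains both the opening and the closing tag, so nothing matches.
line_mode=$(printf '%s\n' "$html" | grep -ioE '<textarea.*</textarea>' || true)

# --null-data (-z): the whole input is one NUL-terminated record, so the
# match may span newlines. tr removes the NUL that grep appends to the match.
null_mode=$(printf '%s\n' "$html" | grep -zioE '<textarea.*</textarea>' | tr -d '\0')

# grep -o prints the tags too; strip them to keep only the inner text.
content=$(printf '%s' "$null_mode" | sed -e 's/<textarea[^>]*>//' -e 's#</textarea>##')

printf 'line mode: [%s]\n' "$line_mode"
printf 'content:\n%s\n' "$content"
```

Note that `.*` is greedy, so on a page with several textareas this would capture everything from the first opening tag to the last closing tag, which is one more reason to prefer a real HTML parser.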
Upvotes: 0
Reputation: 92854
Since the result of this HTTP request is an HTML document, the right way to extract data from it is to use an XML/HTML parser.
xmllint is one such tool:
curl -d "ping=8.8.8.8" -s http://www.ipvoid.com/ping/ \
| xmllint --html --xpath '//textarea/text()' - 2>/dev/null
The output:
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=61 time=1.12 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=61 time=1.05 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=61 time=1.14 ms
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 1.052/1.107/1.144/0.039 ms
http://xmlsoft.org/xmllint.html
Upvotes: 2