Reputation: 27
I'm getting rather desperate trying to work out how to extract the data I want from the output of a curl command.
I need a hand writing a grep command that pulls the text out of the following HTML:
<title> timetable </t itle>< <h3>study table</h3> <p>< strong>biology <div> <table
width='100%' cellpadding='5' cellspacing='0'><tr><th colspan="3">Level 44 Building 1 <tr>
<td >monday</td> <td >1:30 – 2:30</td> <td >< a>Room number 22</a></td> <td > </td>
</tr> <tr><th colspan="2">body> </html>
I would like the output to look like:
timetable
study table
Biology
Level 44 Building 1
Monday
1:30 - 2:30
Room Number 22
Currently I only know how to do a single grep, such as:
grep 'href='
Upvotes: 0
Views: 879
Reputation: 85785
If you have GNU grep:
$ grep -Po '(?<=>) ?\K[^<&>]{2,}(?=<)' file
timetable
study table
biology
Level 44 Building 1
monday
1:30 – 2:30
Room number 22
Disclaimer: You should really use a proper parser for this.
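To make the pattern above easier to follow, here is a minimal sketch that runs it on an inline sample and annotates each part. The variable names html and extracted and the sample string are assumptions made for illustration, and the -P option needs a GNU grep built with PCRE support.

```shell
# Sample input; the variable name "html" and its contents are illustrative only.
html='<h3>study table</h3><tr><td >monday</td> <td >1:30 - 2:30</td></tr>'

# (?<=>)       the text must be preceded by a closing ">"
#  ?\K         skip one optional space, then drop what was matched so far
# [^<&>]{2,}   at least two characters that are not "<", "&" or ">"
# (?=<)        the text must be followed by an opening "<"
extracted=$(printf '%s\n' "$html" | grep -Po '(?<=>) ?\K[^<&>]{2,}(?=<)')

# Print one text fragment per line.
printf '%s\n' "$extracted"
```

For this sample it prints study table, monday and 1:30 - 2:30 on separate lines. The same pattern can be applied directly to curl output with a pipe, for example curl -s "$url" | grep -Po '...', where $url stands in for whatever page you are fetching.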
Upvotes: 1
Reputation: 8819
Assuming your string is in the variable $data, you can:
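# Split only on newlines so each text fragment stays intact in the loop below.
IFS=$'\n'
# Strip HTML entities such as &nbsp; (the g flag removes every occurrence).
result=$(echo "$data" | sed 's/&[^;]*;//g')
# Replace each tag with a newline (\n in the replacement requires GNU sed).
result=$(echo "$result" | sed 's/<[^>]*>/\n/g')
for string in $result; do
    # Skip fragments that are empty or contain only spaces.
    if [[ ! $string =~ ^\ *$ ]]; then
        echo "string=$string."
    fi
done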
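As a variant of the approach above (not the answer's exact code), the two sed substitutions and the blank-line check can be combined into one pipeline, with grep -v doing the filtering instead of a loop. This is a sketch that assumes GNU sed, since it relies on \n in the replacement producing a newline. The sample value of data and the variable name lines are made up for illustration.

```shell
# Hypothetical sample; "data" matches the variable name used above.
data='<h3>study table</h3> <p><strong>biology</strong></p> <td>monday</td><td>&nbsp;</td>'

# 1. Drop HTML entities such as &nbsp;.
# 2. Turn every tag into a newline (GNU sed interprets \n in the replacement).
# 3. Discard lines that are empty or contain only spaces.
lines=$(printf '%s\n' "$data" \
    | sed -e 's/&[^;]*;//g' -e 's/<[^>]*>/\n/g' \
    | grep -v '^ *$')

# Print one text fragment per line.
printf '%s\n' "$lines"
```

For this sample it prints study table, biology and monday on separate lines. Because nothing here changes IFS, the rest of the script keeps its default word splitting.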
Upvotes: 0