Reputation: 1
I have a web page source and I have to extract all the strings containing a series of characters (for example: "http://www.stackoverflow.com") and put them in another file. The "http:// part doesn't change, but the text that follows, up to the closing ", does.
I don't know anything about scripting (sigh). I tried some scripts (including ones from Stack Overflow), but they do different things.
This one does something different; I tried to play with it without success:
#!/bin/bash
# [[ ... ]] is a bash feature, so the script needs bash rather than plain sh.
while IFS=: read -r c1 c2; do
    [[ $c1 == Node ]] && var=$c1
    [[ $c1 == INFO ]] && echo "$var$c2"
done < file.txt
I expect it to scan a web page source and put all the selected strings in a selected file.
Upvotes: 0
Views: 363
Reputation: 24802
You can match text starting with "http:// up to the next double quote by using the following regex:
"http://[^"]*"
The [^"]* part matches any number of non-quote characters.
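As a rough illustration of what the regex does and does not match, here is a small sketch. The sample line and its hostnames are made up for the example, and grep -o is used only to print each match on its own line.

```shell
#!/bin/sh
# Hypothetical sample line: two quoted http:// URLs and one https:// URL.
line='<a href="http://a.example/x">A</a> <a href="http://b.example/y?q=1">B</a> <a href="https://c.example">C</a>'

# [^"]* stops at the first closing quote, so each URL is matched separately.
# The https:// link is skipped because the pattern requires the literal "http://
matches=$(printf '%s\n' "$line" | grep -o '"http://[^"]*"')
printf '%s\n' "$matches"
```

With this input, only the two http:// URLs are printed, each still wrapped in its double quotes.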
To extract such text from a file and write it to another, GNU grep will do the trick:
grep -o '"http://[^"]*"' sourceFile > targetFile
The -o option makes grep output only the matched string rather than the whole line that contains it.
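A minimal end-to-end sketch of that command, assuming a made-up page source. The file names sourceFile and targetFile follow the command above; the temporary directory, the sample HTML, and the quote-stripping step with tr are additions for illustration only.

```shell
#!/bin/sh
# Work in a temporary directory so no real files are touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Hypothetical page source, invented for this example.
cat > sourceFile <<'EOF'
<html><body>
<a href="http://www.stackoverflow.com">Stack Overflow</a>
<p>No link on this line.</p>
<img src="http://example.com/logo.png"> <a href="http://example.com/about">About</a>
</body></html>
EOF

# Same command as above: every match goes on its own line in targetFile.
grep -o '"http://[^"]*"' sourceFile > targetFile

# Optional extra step (an assumption, not part of the answer):
# remove the surrounding double quotes from each extracted string.
tr -d '"' < targetFile > targetFileNoQuotes

cat targetFile
cat targetFileNoQuotes
```

Lines without a match produce no output, and a line with several URLs produces one output line per URL.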
Upvotes: 1