SGResu

Reputation: 1

How to extract strings from a file

I have a web page source and I have to extract all the strings containing a given series of characters (for example: "http://www.stackoverflow.com") and put them in another file. The "http:// part doesn't change, but the text that follows, up to the closing ", does.

I don't know anything about scripting (sigh). I tried some scripts (even ones from Stack Overflow), but they do different things.

This one does something different; I tried to play with it without success:

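#!/bin/bash
# [[ ]] is a bash feature, so the shebang is bash rather than sh

# Split each line of file.txt on ":" into c1 (first field) and c2 (the rest)
while IFS=: read -r c1 c2; do
    [[ $c1 == Node ]] && var=$c1
    [[ $c1 == INFO ]] && echo "$var$c2"
done < file.txt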

I expect it to scan a web page source and put all the selected strings in a selected file.

Upvotes: 0

Views: 363

Answers (1)

Aaron

Reputation: 24802

You can match text starting with "http:// up to the next double quote by using the following regex:

"http://[^"]*"

The [^"]* part matches any number of non-quote characters.
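To see why the negated class matters, here is a small sketch run against a single made-up line (the sample HTML and the variable names `line` and `matches` are assumptions for illustration only). Because [^"]* cannot cross a double quote, each URL on the line should come out as its own match instead of one long match spanning both:

```shell
# Hypothetical sample line containing two quoted URLs (not from the question)
line='<a href="http://example.com/a">A</a> <img src="http://example.com/b.png">'

# grep -o prints each match on its own line; [^"]* stops at the first closing quote
matches=$(printf '%s\n' "$line" | grep -o '"http://[^"]*"')

printf '%s\n' "$matches"
```

A greedy pattern such as "http://.*" would instead run to the last quote on the line, which is usually not what you want here.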

To extract such text from a file and write it to another, GNU grep will do the trick:

grep -o '"http://[^"]*"' sourceFile > targetFile

The -o option is used to make grep only output the matched string rather than the whole line that contains it.
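As a minimal end-to-end sketch of the command above, the snippet below builds a throwaway source file and runs the same grep on it. The directory, file contents, and the extra `tr -d '"'` step (which drops the surrounding quotes from each match) are assumptions added for illustration, not part of the original command:

```shell
# Work in a temporary directory so no real files are touched
workdir=$(mktemp -d)

# Hypothetical page source: one link, one line without links, two links on one line
printf '%s\n' \
  '<a href="http://www.stackoverflow.com">SO</a>' \
  '<p>no link here</p>' \
  '<a href="http://example.com/x">x</a> <a href="http://example.com/y">y</a>' \
  > "$workdir/sourceFile"

# Same grep as above; tr -d removes the double quotes around each match
grep -o '"http://[^"]*"' "$workdir/sourceFile" | tr -d '"' > "$workdir/targetFile"

cat "$workdir/targetFile"
```

If you want to keep the quotes, as the plain grep command does, leave out the `tr` step.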


Upvotes: 1
