Reputation: 1
I have a file with a list of links, but they have unusable leading and trailing text that I need to get rid of.
Specifically, each line looks like <img src="place-holder" />.
The file has 20-odd similar links with all that garbage around them; I just need place-holder.
If this were a single link, that'd be easy:
link="${link#*\"}" && link="${link%\"*}" && echo "$link"
There's probably a way to do this in a single command, but I don't know how, and I haven't worked with lists stored in files before.
So the question is: how do I strip everything outside the quotes, and the quotes themselves, from every line of a list in a file, so that I can then iterate over the results?
So far I haven't got a clue, even though I've been searching for a few hours now. Since the lines are all the same length, I think a sed operation that removes characters at fixed positions might be possible.
Upvotes: 0
Views: 71
Reputation: 9905
This makes some assumptions about the shape of the input:
echo 'asdfasf <img src="placeholder" asfd="asdfasdf" /> frog' \
| sed -E 's/.*< *img[^>]*src="([^"]*)"[^>]*\/>.*/\1/g'
result:
placeholder
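Since the question is about a list in a file, here is a minimal sketch of the same sed expression applied to a whole file and then iterated over. The temporary file, its contents, and the helper name extract_src are made up for illustration; -n together with the p flag is my addition so that lines without a matching tag are skipped instead of printed unchanged.

```shell
# Hypothetical input file; the contents are made up for illustration.
links_file=$(mktemp)
printf '%s\n' '<img src="place-holder" />' '<img src="place2" />' > "$links_file"

# Hypothetical helper: print the src value of every <img ... /> line in a file.
# -n plus the trailing p prints only lines where the substitution succeeded.
extract_src() {
  sed -nE 's/.*< *img[^>]*src="([^"]*)"[^>]*\/>.*/\1/p' "$1"
}

# Iterate over the extracted values, one per line.
extract_src "$links_file" | while IFS= read -r link; do
  printf 'link: %s\n' "$link"
done
```

With the two sample lines this should print one "link: ..." line per tag. In bash the loop body runs in a subshell because of the pipe, so variables set inside it are not visible afterwards.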
Regex breakdown:
.*            stuff before the tag
< *img        tag opening, optional spaces
[^>]*         as many non-> characters as possible
src=          src attribute
"([^"]*)"     capture everything inside the quotes
[^>]*\/>      the rest of the tag
.*            stuff after the tag
If your data is as simple as <img src="place-holder" />, perhaps the following will work for you:
echo '<img src="place-holder" />' | sed -E 's/.*"(.*)".*/\1/g'
output:
place-holder
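One caveat with the short pattern, sketched below: the leading .* is greedy, so if a line ever carries more than one quoted attribute it keeps the last quoted value rather than the first. The sample tag with an alt attribute is an assumption and does not come from the question; the second expression uses [^"]* so the match cannot run past the first pair of quotes.

```shell
# Hypothetical tag with two quoted attributes.
tag='<img src="a" alt="b" />'

# Greedy .* runs ahead to the last pair of quotes on the line.
greedy=$(printf '%s\n' "$tag" | sed -E 's/.*"(.*)".*/\1/')

# [^"]* cannot cross a quote, so this keeps the first quoted value.
first=$(printf '%s\n' "$tag" | sed -E 's/^[^"]*"([^"]*)".*/\1/')

printf 'greedy: %s\nfirst:  %s\n' "$greedy" "$first"
```

For lines with a single quoted value, as in the question, both expressions give the same result.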
sed
If you must:
echo '<img src="place-holder" />' | cut -d'"' -f2
Why not:
echo '<img src="place-holder" />' | awk -v FS='"' '{ print $2 }'
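The field-splitting approach also works on a whole file rather than a single echo, which is closer to what the question asks for. This is only a sketch: the temporary file and the helper name second_quoted_field are hypothetical, and the NF >= 3 guard is my addition so that lines without a pair of quotes are skipped.

```shell
# Hypothetical input file; the contents are made up for illustration.
links_file=$(mktemp)
printf '%s\n' '<img src="place-holder" />' '<img src="place2" />' > "$links_file"

# Hypothetical helper: split each line on double quotes and print field 2.
# Reads the named files, or standard input when no file is given.
second_quoted_field() {
  awk -F'"' 'NF >= 3 { print $2 }' "$@"
}

# Iterate over the extracted values, one per line.
second_quoted_field "$links_file" | while IFS= read -r link; do
  printf 'processing %s\n' "$link"
done
```

The same loop should work with cut -d'"' -f2 "$links_file" in place of the helper, as long as every line contains the quotes.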
I mean, I suppose...
for tag in '<img src="place-holder" />' '<img src="place2" />'; do echo $tag; done | \
sed -E 's/"/\\"\\"/g' | xargs -I {} echo -e \
"Result := RegExReplace(\"{}\`n\",\"^.*\"\"(.*)\"\".*$\",\"\$1\`n\")\n" \
"FileAppend %Result%, *" | pwsh.exe -c "AutoHotkey.exe * | echo"
I... I can't believe I actually got that last one to work...
Upvotes: 1