Dr. Coomer

Reputation: 1

How to strip beginning and end of a line in a file

I have a file with a list of links, but each one has unusable leading and trailing text that I need to get rid of.

Specifically <img src="place-holder" />.

The file contains 20-odd similar links with all that garbage around them; I just need place-holder.

If this were a single link, that would be easy:

link=${link#*\"} && link=${link%\"*} && echo "$link"

There's probably a way to do this in a single command, but I don't know how. I also don't usually work with lists in files.

So the question is: how do I get rid of everything outside of, and including, the quotes for each entry in a list within a file, so that I can then iterate over the results?

So far I haven't got a clue, even though I've been searching for a few hours now. Since these are all of equal length, I think a sed operation that removes characters at fixed positions might be possible.

Upvotes: 0

Views: 71

Answers (1)

xdhmoore

Reputation: 9905

Complex data

This makes some assumptions:

  • tags/attributes are lower case
  • there is only one img tag per line
  • the whole img tag is on one line

echo 'asdfasf <img src="placeholder" asfd="asdfasdf" /> frog' \
     | sed -E 's/.*< *img[^>]*src="([^"]*)"[^>]*\/>.*/\1/g'

result:

placeholder

Regex breakdown:

  • .* stuff before tag
  • < *img tag opening, optional spaces
  • [^>]* as many non-> characters as possible
  • src= src attribute
  • "([^"]*)" capture everything inside the quotes
  • [^>]*\/> the rest of the tag
  • .* stuff after the tag
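To run the same expression over a whole file rather than a single echo, something like the following might work. This is only a sketch: the sample file contents and the extract_src function name are made up for illustration. It swaps the g flag for sed's -n option plus the p flag, so lines without an img tag are dropped instead of being printed unchanged.

```shell
# Hypothetical sample file; the real file name is not given in the question.
tmp=$(mktemp)
printf '%s\n' \
  'asdfasf <img src="placeholder" asfd="asdfasdf" /> frog' \
  'a line with no image tag' \
  '<img alt="x" src="second" />' > "$tmp"

# extract_src is an illustrative helper name.
# -n suppresses sed's default output; the trailing p prints only the
# lines on which the substitution actually matched.
extract_src() {
  sed -nE 's/.*< *img[^>]*src="([^"]*)"[^>]*\/>.*/\1/p' "$1"
}

extract_src "$tmp"
rm -f "$tmp"
```

With the sample lines above, this should print placeholder and second, one per line, and skip the line that has no tag.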

Simple data

If your data is as simple as <img src="place-holder" />, perhaps the following will work for you:

echo '<img src="place-holder" />' | sed -E 's/.*"(.*)".*/\1/g'

output:

place-holder

I hate sed

If you must:

echo '<img src="place-holder" />' | cut -d'"' -f2
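Since the question also asks about iterating over the results from a file, here is a rough sketch of how the cut version could feed a loop. The temp file stands in for the real list (its name is not given in the question), and list_links is just an illustrative helper name. It assumes every line looks like the simple example, with the wanted value between the first pair of double quotes.

```shell
# Stand-in for the real file of links; contents are made up for illustration.
links_file=$(mktemp)
printf '%s\n' '<img src="place-holder" />' '<img src="place2" />' > "$links_file"

# list_links is a hypothetical helper: print the second field of each line,
# using the double quote as the field delimiter.
list_links() {
  cut -d'"' -f2 "$1"
}

# Read the extracted values one line at a time and act on each.
list_links "$links_file" | while IFS= read -r link; do
  printf 'processing: %s\n' "$link"
done

rm -f "$links_file"
```

Replace the printf inside the loop with whatever you actually need to do with each link.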

I could go all day

Why not:

 echo '<img src="place-holder" />' | awk -v FS='"' '{ print $2 }'

AutoHotkey?

I mean, I suppose...

for tag in '<img src="place-holder" />' '<img src="place2" />'; do echo $tag; done | \
    sed -E 's/"/\\"\\"/g' | xargs -I {} echo -e \
    "Result := RegExReplace(\"{}\`n\",\"^.*\"\"(.*)\"\".*$\",\"\$1\`n\")\n" \
    "FileAppend %Result%, *" | pwsh.exe -c "AutoHotkey.exe * | echo"

I... I can't believe I actually got that last one to work...

Upvotes: 1
