Reputation: 501
I have an image tag with a URL like:
<img alt="" src="http://www.example-site.com/folder_with_underscore/folder-with-dash/3635/0/235/NumBerS_and_Uc/image.png" />
I'm using sed with a substitution of the form "s///g".
I'm trying to replace the src value, but that value is different almost every time.
Is there a way to use something like sed "s/src=\" (until first " ) / new url /g"?
Extra info:
I'm using Cygwin on Windows, with PATH=C:\cygwin\bin set in my .bat file.
Upvotes: 0
Views: 2625
Reputation: 212664
Shawn's solution is mostly correct, but it does not handle the case in which a newline appears inside the src URL. sed is really not very good at dealing with such cases, but you can hack a solution:
sed '/src/{
/src="[^"]*"/{
s//src="NEWURL"/
b
}
s/src=".*$/src="NEWURL"/
p
:a
s/.*//;
N
/"/!ba
s/[^"]*"//
}
' input
Note that many of the newlines above are superfluous in some versions of sed, but necessary in others. (In particular, the newlines after :a and after the branch command matter, since some versions of sed terminate a label only at a newline. I believe that versions of sed which allow a label to terminate with a semicolon are not strictly compliant with the standard, but it is a common practice.) This script does the simple replacement where appropriate, but if no closing quote is found after src=" on the same line, it enters a loop deleting lines until a terminating " is seen. This is an ugly solution, and I recommend against using sed for parsing XML.
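If GNU sed is available (Cygwin normally ships it, but that is an assumption about your setup), a different technique from the loop above is the -z option. It is a GNU extension, not POSIX: records are separated by NUL instead of newline, so a file without NUL bytes is handled as one record and [^"]* can span line breaks. A minimal sketch, where replace_src is a hypothetical helper name and the input is made up for illustration:

```shell
# Assumption: GNU sed. The -z option is not available in POSIX or BSD sed.
# With -z the whole (NUL-free) input is one record, so [^"]* may cross newlines.
replace_src() {
    sed -z 's/src="[^"]*"/src="NEWURL"/g'
}

# Hypothetical input with a newline inside the src value.
printf '<img alt="" src="http://www.example-site.com/\nimage.png" />\n' | replace_src
```

The trade-off is that the whole input is held in memory as a single record, which is fine for small HTML files but not something I would rely on for very large ones.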
Upvotes: 1
Reputation: 86974
[^"] will match any character apart from ", so you can use:
sed 's/src="[^"]*"/src="NEWURL"/g'
Example:
[me@home]$ echo '<img alt="" src="http://www.example-site.com/folder_with_underscore/folder-with-dash/3635/0/235/NumBerS_and_Uc/image.png" />' | sed 's/src="[^"]*"/src="http:\/\/stackoverflow.com"/g'
<img alt="" src="http://stackoverflow.com" />
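As a side note, the escaped slashes in the replacement URL can be avoided by picking another delimiter for the s command; sed accepts other characters in place of /, such as |. A small sketch of the same substitution, where the html and result variable names and the shortened input URL are only for illustration:

```shell
# Using | as the delimiter means the slashes in the URL need no escaping.
# The delimiter must not occur unescaped in the pattern or the replacement.
html='<img alt="" src="http://www.example-site.com/image.png" />'
result=$(printf '%s\n' "$html" | sed 's|src="[^"]*"|src="http://stackoverflow.com"|g')
printf '%s\n' "$result"
```

This only works as long as the chosen delimiter does not itself appear in the new URL.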
Note that this will match up to the first occurrence of ", which is probably what you want. If you really want to match up to the last occurrence of ", you could simply do:
sed 's/src=".*"/src="NEWURL"/g'
The regex is greedy and so will take up as many characters as possible, thus matching up to the last occurrence of ". While this will also work in the example above, it will not behave as expected if other content on the same line after the src value also contains ".
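To make the difference concrete, here is a small comparison on a made-up tag in which another attribute follows src (the input and the variable names are hypothetical, chosen only for this illustration):

```shell
# Hypothetical input: an alt attribute follows src on the same line.
html='<img src="old.png" alt="logo" />'

# [^"]* stops at the first closing quote, so only the src value is replaced.
bounded=$(printf '%s\n' "$html" | sed 's/src="[^"]*"/src="NEWURL"/g')

# .* runs on to the last quote on the line, so the alt attribute is swallowed too.
greedy=$(printf '%s\n' "$html" | sed 's/src=".*"/src="NEWURL"/g')

printf '%s\n' "$bounded"   # <img src="NEWURL" alt="logo" />
printf '%s\n' "$greedy"    # <img src="NEWURL" />
```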
Upvotes: 5