Reputation: 1075
I have a Java properties file that looks like the following:
SiteUrlEndpoint=google.com/mySite
I want to use sed -i to replace the URL in place but keep the context path that follows it. For example, if I wanted to change the properties file above to use amazon.com, the result would look like:
SiteUrlEndpoint=amazon.com/mySite
I am having trouble getting sed to replace only the URL while keeping the context path when editing in place.
My attempt:
sed -i 's:^[ \t]*siteUrlEndpoint[ \t]*=\([ \t]*.*\)[/]*$:siteUrlEndpoint = 'amazon.com':' file
Upvotes: 1
Views: 1765
Reputation: 203995
Keep it simple:
$ sed -E 's/(SiteUrlEndpoint=)[^.]+/\1amazon/' file
SiteUrlEndpoint=amazon.com/mySite
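One thing to keep in mind with this form: [^.]+ only replaces the text up to the first dot, so it relies on the old and new hosts sharing the same suffix and on there being no subdomain. A quick sketch with made-up input lines (the www.google.com value is a hypothetical case, not from the question):

```shell
# Pipe sample lines through the command; no file is edited.
out_simple=$(printf 'SiteUrlEndpoint=google.com/mySite\n' |
  sed -E 's/(SiteUrlEndpoint=)[^.]+/\1amazon/')

# Hypothetical value with a subdomain: only "www" sits before the first dot.
out_subdomain=$(printf 'SiteUrlEndpoint=www.google.com/mySite\n' |
  sed -E 's/(SiteUrlEndpoint=)[^.]+/\1amazon/')

echo "$out_simple"      # SiteUrlEndpoint=amazon.com/mySite
echo "$out_subdomain"   # SiteUrlEndpoint=amazon.google.com/mySite
```

If the host may contain more than one dot, matching up to the first / instead of the first dot is probably the safer choice.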
Upvotes: 0
Reputation: 827
Keep a backreference to the part just before the domain, then match and replace the domain. You can add the -i option after verifying the output of the sed command.
url=amazon.com
sed -r 's/\b(SiteUrlEndpoint\s*=\s*)[^/]+/\1'"$url"'/' file
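The preview-first workflow described here can be sketched as follows. This variant is an adaptation, not the command above: it swaps the GNU-specific \b and \s for the POSIX class [[:space:]] and double-quotes the variable. The sample input line is made up, and the sketch assumes the value of url contains no /, & or backslash, since those are special in the replacement.

```shell
url=amazon.com

# Without -i, sed only prints the result, so this is a safe preview.
preview=$(printf 'SiteUrlEndpoint = google.com/mySite\n' |
  sed -E 's/(SiteUrlEndpoint[[:space:]]*=[[:space:]]*)[^/]+/\1'"$url"'/')

echo "$preview"   # SiteUrlEndpoint = amazon.com/mySite
```

Once the preview looks right, the same expression can be run with -i against the real file.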
Upvotes: 0
Reputation: 84579
You can do it with two backreferences, e.g.
sed -i.bak 's|^\(SiteUrlEndpoint=\).*/\(.*\)|\1amazon.com/\2|' file
Note: the match of text up to / is greedy, so .*/ runs to the last / on the line and only the final path component is kept. If you have multiple parts of the path following the domain, you probably want to preserve all path components. To make it non-greedy (stop at the first /), you could use the following instead
sed -i.bak 's|^\(SiteUrlEndpoint=\)[^/]*/\(.*\)|\1amazon.com/\2|' file
(the .bak suffix in -i.bak creates a backup of the original in file.bak)
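The difference between the two forms is easy to see on a value with several path components. A small sketch that pipes one sample line through both expressions without touching any file:

```shell
line='SiteUrlEndpoint=google.com/path/to/mySite'

# Greedy: .*/ consumes everything up to the LAST slash.
greedy=$(printf '%s\n' "$line" |
  sed 's|^\(SiteUrlEndpoint=\).*/\(.*\)|\1amazon.com/\2|')

# [^/]*/ stops at the FIRST slash, so the whole path survives.
first_slash=$(printf '%s\n' "$line" |
  sed 's|^\(SiteUrlEndpoint=\)[^/]*/\(.*\)|\1amazon.com/\2|')

echo "$greedy"        # SiteUrlEndpoint=amazon.com/mySite
echo "$first_slash"   # SiteUrlEndpoint=amazon.com/path/to/mySite
```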
To accomplish the same thing, you can match SiteUrlEndpoint= at the beginning of the line first, and then use a single backreference for the change, e.g.
sed -i.bak '/^SiteUrlEndpoint=/s|=[^/]*\(/.*\)|=amazon.com\1|' file
For example, given a file sites containing:
$ cat sites
SiteUrlEndpoint=google.com/path/to/mySite
SiteUrlSomeOther=google.com/mySite
You can change google.com to amazon.com with (using the non-greedy form of the first example):
$ sed -i.bak 's|^\(SiteUrlEndpoint=\)[^/]*/\(.*\)|\1amazon.com/\2|' sites
Confirming:
$ cat sites
SiteUrlEndpoint=amazon.com/path/to/mySite
SiteUrlSomeOther=google.com/mySite
and
$ cat sites.bak
SiteUrlEndpoint=google.com/path/to/mySite
SiteUrlSomeOther=google.com/mySite
Explanation (first form)
sed -i.bak 's|^\(SiteUrlEndpoint=\) - locate & save SiteUrlEndpoint=
[^/]*/ - match any following characters up to the first / (non-greedy - adjust as needed)
\(.*\) - match and save anything following /
|\1amazon.com/\2|' - full replacement (explanation below)
\1 - first back-reference containing SiteUrlEndpoint=
amazon.com - self-explanatory
/\2 - the '/' followed by the second back-reference of everything that followed.
Look over all the solutions and let me know if you have questions.
Upvotes: 3
Reputation: 8140
Regular expressions are hard, especially with complex regular expressions and/or large input files where unexpected changes are to be avoided.
Therefore I strongly recommend using sed -i.bak to keep a backup of the original file, and then running diff on both files to see what changed.
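As a rough sketch of that backup-then-diff routine, run in a throwaway directory: the file name infile, its contents, and the substitution are placeholders chosen for illustration.

```shell
# Work in a temporary directory so nothing real is modified.
workdir=$(mktemp -d)
printf 'SiteUrlEndpoint=google.com/mySite\nOtherKey=value\n' > "$workdir/infile"

# -i.bak edits infile in place and keeps the original as infile.bak.
sed -i.bak 's|^\(SiteUrlEndpoint=\)[^/]*|\1amazon.com|' "$workdir/infile"

# diff exits with status 1 when the files differ, so its output is
# captured here rather than treating that status as an error.
changes=$(diff "$workdir/infile.bak" "$workdir/infile" || true)
echo "$changes"
```

Only the SiteUrlEndpoint line should show up in the diff; anything else in the output means the expression matched more than intended.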
Assuming that you want to match siteUrlEndpoint (case insensitive) and replace the host with amazon.com while leaving the path intact, I came up with this solution:
sed -i.bak 's;^\([ \t]*siteurlendpoint[ \t]*=[ \t]*\)[^/]*\(.*\);\1amazon.com\2;Ig' infile
I used a semicolon instead of your colon; that's just my preference when I don't want to use / ;)
Then I wrapped both the leading white space and siteurlendpoint, as well as everything from the first / onwards, in escaped parentheses \( \) so that I can reuse them in the replacement with \1 and \2. That way I keep the indentation and the capitalisation of SiteUrlEndpoint intact.
For the search options I added an I to the g to make the search case insensitive. I am not sure how standard this option is; you might have to check whether your sed understands it.
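The I flag is a GNU sed extension, so if it is not available, one possible workaround is to spell out the letters whose case may vary as bracket expressions. The sketch below only tolerates either case for the S, U and E of the key, which is an assumption about which spellings occur, and the sample input line is made up:

```shell
# No I flag: [Ss], [Uu] and [Ee] accept either case at those positions only.
out=$(printf '  siteurlEndpoint = google.com/a/b\n' |
  sed 's;^\([[:space:]]*[Ss]ite[Uu]rl[Ee]ndpoint[[:space:]]*=[[:space:]]*\)[^/]*\(.*\);\1amazon.com\2;')

echo "$out"   # "  siteurlEndpoint = amazon.com/a/b" (indentation and case kept)
```

This is not full case insensitivity; it trades generality for working on a sed without GNU extensions.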
The part I actually want to replace is matched by [^/]*: any run of characters up to, but not including, the next /.
As for your line:
- You search for siteUrlEndpoint with a lower case s. Since in your examples you wrote it with a capital S, it wouldn't have triggered.
- [/]*$ doesn't make any sense at all. It says: "This line can end in zero or more of any of these characters: /."
- You precede [/]*$ with .* which means: zero or more of any character at all.
- 'amazon.com' might interfere with the single quotes around the whole search/replace term. It seems to work, but it is sloppy, and will fail if there are ever any spaces in there. It doesn't seem to serve any purpose anyway (except if you want to replace amazon.com with some environment variable like $NEWSITE) so I don't know why you're doing that.
Upvotes: 0