Reputation: 91
I have 10,000 text files that I need to change.
The first line of every file contains a URL.
By mistake, the URL in a few files is missing 'com'.
eg:
1) http://www.supersonic./psychology
2) http://www.supersonic./social
3) http://www.supersonic.com/science
My task is to check each URL and add 'com' if it is missing.
eg:
1) http://www.supersonic.com/psychology
2) http://www.supersonic.com/social
3) http://www.supersonic.com/science
All URLs are on the same domain (supersonic.com).
Can you suggest a fast and easy approach?
I tried replacing supersonic./ with supersonic.com:
sed -e '1s/supersonic.//supersonic.com/' *
There was no change in the output.
Upvotes: 0
Views: 804
Reputation: 37288
You are very close with your code, but you need to account for the trailing / char after the . char.
Assuming you are using a modern sed with the -i (in-place edit) option, you can do
sed -i '1s@supersonic\./@supersonic.com/@' *
Note that rather than having to escape / inside s/srchpat\/withSlash/replaceStr/, you can use another char after the s command as the delimiter; here I use s@...@...@. If your search pattern had a @ char, then you would have to use a different char.
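To see the alternate delimiter and the escaped dot in action without touching any files, here is a small sketch that pipes the sample first lines from the question through sed. The variable names and the supersonics host are made up for illustration; there is no line address because every input line stands in for a file's first line.

```shell
# Sample first lines taken from the question; the third is already correct.
urls='http://www.supersonic./psychology
http://www.supersonic./social
http://www.supersonic.com/science'

# @ as the delimiter: the / in the pattern needs no escaping.
with_at=$(printf '%s\n' "$urls" | sed 's@supersonic\./@supersonic.com/@')

# Same substitution with the default / delimiter: every / must be escaped.
with_slash=$(printf '%s\n' "$urls" | sed 's/supersonic\.\//supersonic.com\//')

# Unescaped dot: . matches any character, so a made-up host is rewritten too.
loose=$(printf '%s\n' 'http://www.supersonics/x' | sed 's@supersonic./@supersonic.com/@')

printf '%s\n' "$with_at"
```

Both delimiter styles should produce the same three corrected URLs; the @ form is simply easier to read when the pattern contains slashes.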
Some older versions of sed need you to escape the alternate delimiter at its first use, so
sed 's\@srchStr@ReplStr@' file
for those cases.
If you're using a sed that doesn't support the -i option, then you'll need to loop over your files and manage the tmp files yourself, i.e.
for f in *.html ; do
    # write the fixed copy to /tmp, then replace the original only if sed succeeded
    sed '1s@supersonic\./@supersonic.com/@' "$f" > /tmp/"$f".fix \
        && /bin/mv /tmp/"$f".fix "$f"
done
Warning
As you're talking about 10,000+ files, you'll want to do some testing before using either of these solutions. Copy a good random set of those files to a /tmp/mySedTest/ dir and run one of these solutions there to make sure there are no surprises.
You're also likely to blow out the cmd-line MAX_SIZE with 10,000+ files, so read about find and xargs. There are many posts here about [sed] find xargs. Check them out if needed.
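As a rough sketch of the find and xargs route, the following runs the same substitution inside a throwaway directory. It assumes GNU sed (where -i makes the 1 address apply to the first line of each file separately) and a find/xargs pair that supports -print0 and -0; the directory and file names are hypothetical.

```shell
# Hypothetical sandbox so the command can be tried safely.
dir=$(mktemp -d)
printf '%s\n' 'http://www.supersonic./psychology' 'body text' > "$dir/a.txt"
printf '%s\n' 'http://www.supersonic.com/science' 'supersonic./ stays on line 2' > "$dir/b.txt"

# find emits NUL-terminated names; xargs -0 splits them into batches so that
# no single sed invocation exceeds the argument-length limit.
find "$dir" -type f -name '*.txt' -print0 |
  xargs -0 sed -i '1s@supersonic\./@supersonic.com/@'

# Show the first line of every file after the edit.
head -n 1 "$dir"/*.txt
```

Only line 1 of each file is touched, so a supersonic./ that appears later in a file is left alone.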
IHTH
Upvotes: 1
Reputation: 241918
Use -i to change the files in place instead of just printing the result.
Use a delimiter other than / if you want to use / in the regex (or write \/ in the regex).
Use \. to match a dot literally; an unescaped . matches any character.
sed -i~ -e '1s=supersonic\./=supersonic.com/=' *
Some versions of sed don't support -i.
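The ~ after -i is a backup suffix: sed keeps the untouched original next to each edited file. A minimal sketch in a scratch directory (the directory and file name are made up), assuming a sed that supports -i with an attached suffix:

```shell
# Hypothetical scratch file with a broken URL on its first line.
dir=$(mktemp -d)
printf '%s\n' 'http://www.supersonic./social' 'some text' > "$dir/page.txt"

# Edit in place, keeping the original as page.txt~
sed -i~ -e '1s=supersonic\./=supersonic.com/=' "$dir/page.txt"

# diff exits non-zero when the files differ, which is expected here.
diff "$dir/page.txt~" "$dir/page.txt" || true
```

Once the results look right, the backups can be removed; until then they make it easy to restore any file.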
Upvotes: 2