SKR

Reputation: 127

Sed/Awk to search and replace/insert text in files

I am trying to update or insert comments such as copyright headers in all the source files in a directory (Linux). The files are inconsistent: some already have a header, while others have none. I tried using sed to look at the first few lines and replace the header. By "replace" I mean updating files that already have a copyright header to the latest one.

sed -e '1,10 s/Copyright/*Copyright*/g' file

But this does not insert anything if the pattern is not found. How can I achieve this?

The example I gave in the comments, i.e. what I am actually trying to replace or insert, is a typical multi-line copyright header like this:

/*
* Copyright 1234 XXXNAME, XYZPlace 
*  text text text text ...........
* blah blah blah */

It may contain some special characters also.

Upvotes: 3

Views: 11210

Answers (3)

ghoti

Reputation: 46876

If I understand correctly, you want to:

  • Find files without a Copyright notice in the first 10 lines, and
  • Add a Copyright notice to those files.

In addition, you want to:

  • Find files WITH a Copyright notice in the first 10 lines, and
  • Update their notice to your standard text.

It seems to me that these two tasks could be boiled down to a single set:

  • Remove any existing Copyright notice in the first 10 lines, then
  • Insert a new Copyright notice into the file.

If we can safely assume that a shortened version of the sample text you put in a comment on your question is valid, and that it should be inserted at, for example, line 2 of each file, then the following should satisfy the first set of requirements, provided you're using GNU sed:

find . -type f -not -exec grep -q Copyright {} \; -exec sed -i'' '2i/* Copyright */' {} \;

If you're not running GNU sed (i.e. you're in FreeBSD or OSX or Solaris, etc), let us know, because the sed script will be different.

How does this work?

The find command is getting the following options:

  • -type f tells it to look only at files (not directories or devices).
  • -not inverts the following option.
  • -exec grep -q Copyright {} \; tests whether the file contains Copyright anywhere (not just in the first 10 lines); because of the preceding -not, only files without it pass.
  • -exec sed -i'' '2i/* Copyright */' {} \; inserts your copyright notice.
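To see what these options do without touching real sources, here is a small sketch that runs the same command in a scratch directory. The file names has.c and lacks.c are made up for the demonstration, and GNU sed is assumed, as above.

```shell
# Scratch directory: one file with a notice, one without.
demo=$(mktemp -d)
printf '/* Copyright 2010 */\nint a;\n' > "$demo/has.c"
printf '#include <stdio.h>\nint b;\n' > "$demo/lacks.c"

# The command from the answer, limited to the scratch directory.
find "$demo" -type f -not -exec grep -q Copyright {} \; \
  -exec sed -i'' '2i/* Copyright */' {} \;

cat "$demo/lacks.c"   # the notice now sits on line 2
cat "$demo/has.c"     # unchanged, grep found Copyright so sed never ran
```

Only lacks.c is rewritten, because the second -exec runs only when the negated grep test succeeds.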

This solution may run into difficulty if you want your copyright notice to include special characters that would be interpreted by the sed script. But it answers your question. :)
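If the special characters are a concern, one way around them is to keep the notice in its own file and prepend it with cat, so the text never passes through a sed script. This is only a sketch under assumptions: header.txt and src/ are hypothetical names, and the notice goes in at line 1 rather than line 2.

```shell
# Hypothetical layout: header.txt holds the notice, src/ holds the sources.
work=$(mktemp -d)
mkdir "$work/src"
printf '/*\n * Copyright 1234 XXXNAME & Co. \\ "XYZPlace"\n */\n' > "$work/header.txt"
printf 'int main(void) { return 0; }\n' > "$work/src/main.c"

for f in "$work"/src/*; do
  # Skip files that already mention Copyright in their first 10 lines.
  if head -n 10 "$f" | grep -q Copyright; then
    continue
  fi
  # cat copies the header byte for byte, so &, \ and / need no escaping.
  cat "$work/header.txt" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

cat "$work/src/main.c"
```

Note that mv replaces the original with a newly created file, so permissions such as the executable bit are not carried over.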

If instead, we want to handle the revised requirements, i.e. remove existing copyright notices first, then we can do this with two one-liners:

First, we remove existing copyright notices.

find . -type f -exec sh -c 'head "$1" | grep -q Copyright' sh {} \; -exec sed -i '1,10{/Copyright/d;}' {} \;

The head/grep test is a little redundant; the main reason to use find here is that it traverses subdirectories recursively by default. The sed script changes nothing in files that have no Copyright line in their first 10 lines, so if all your files are in one directory, the following works as well:

for file in *;do sed -i '1,10{/Copyright/d;}' "$file"; done

Next, we add new ones back in.

for file in *;do sed -i'' '2i/* Copyright */' "$file"; done

Or, if you want to do this recursively through subdirectories:

find . -type f -exec sed -i'' '2i/* Copyright */' {} \;
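The remove-then-insert sequence can be tried end to end on throwaway files. The sketch below assumes GNU sed and single-line notices; the file names and the "New Owner" text are invented for the example, and the deletion is restricted to lines 1-10 with an address range.

```shell
demo=$(mktemp -d)
printf '/* Copyright 2001 Old Owner */\nint a;\nint b;\n' > "$demo/old.c"
printf 'int x;\nint y;\n' > "$demo/plain.c"

for file in "$demo"/*; do
  # Step 1: delete any line mentioning Copyright, but only within lines 1-10.
  sed -i '1,10{/Copyright/d;}' "$file"
  # Step 2: insert the new notice before line 2.
  sed -i '2i/* Copyright 2012 New Owner */' "$file"
done

cat "$demo/old.c" "$demo/plain.c"
```

Both files end up with exactly one notice on line 2. A file left with fewer than two lines after step 1 would get no notice, because address 2 never matches.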

FINAL UPDATE:

I can't spend more time on this one after this.

find . -type f \
  -exec sh -c 'head "$1" | grep -q Copyright' sh {} \; \
  -exec sed -i -ne '1h;1!H;${;g;s:/\*.*Copyright.*\*/:/* Copyright 1998-2012 */:g;p;}' {} \;

What?

The first -exec searches for the word "Copyright" in the first 10 lines of the file. Just like the first example I posted, above. If grep finds anything, this condition returns true.

The second -exec does the substitution. It reads the entire file into sed's hold buffer. Then when it gets to the end of the file, it (g) considers the hold buffer, and (s) does a multi-line substitution.

Note that this may well need some tuning, and it may not work at all if you have comments elsewhere in the file. sed regular expressions have no non-greedy stars, so .* matches as much as it can, and the substitution can run from the first /* in the file to the last */.

Here's my test:

$ printf 'one\n/* Copyright blah blah\n *\n */\ntwo\n' | sed -n '1h;1!H;${;g;s:/\*.*Copyright.*\*/:/* Copyright 1998-2012 */:g;p;}'
one
/* Copyright 1998-2012 */
two

This doesn't maintain your existing Copyright information, but at least it addresses the multi-line issue.
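The caveat about comments elsewhere in the file is easy to reproduce. In this sketch (GNU sed assumed, sample input invented) a second, unrelated comment follows the notice, and the greedy .* swallows everything up to its closing */:

```shell
# Two comments: the notice, and an unrelated one later in the file.
out=$(printf '/* Copyright old */\nint a; /* keep me */\nint b;\n' |
  sed -n '1h;1!H;${;g;s:/\*.*Copyright.*\*/:/* Copyright 1998-2012 */:g;p;}')

printf '%s\n' "$out"
# /* Copyright 1998-2012 */
# int b;
```

The line "int a;" and the second comment are gone, so this substitution is only safe when the notice is the sole comment, or at least the last one, in the file.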

Upvotes: 9

Lev Levitsky

Reputation: 65821

Edit: the command below won't work if you have file names with spaces, see the first comment.

This can surely be done with sed alone, but the first thing that came to mind is to do the substitution on files where the line is present, and then add the header to the remaining files with something like:

for f in $(grep -L 'Copyright' *); do sed -i '1i *Copyright*' "$f"; done

That will work for all files in the current folder; use the -r option to grep if you need recursion.

P.S. I suggest removing the -i sed option for testing and adding it only when you're sure the command works right.
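For file names with spaces, a possible variant passes the names NUL-separated instead of relying on word splitting. This is a sketch that assumes GNU grep, GNU sed and an xargs that supports -0 and -r; the file names are made up.

```shell
demo=$(mktemp -d)
printf '/* Copyright 2010 */\nint a;\n' > "$demo/has notice.c"
printf 'int b;\n' > "$demo/no notice.c"

# -L lists files with no matching line; -Z ends each name with a NUL byte,
# which xargs -0 reads back, so spaces in names survive intact.
# -r skips the sed call entirely when no file qualifies.
grep -LZ 'Copyright' -- "$demo"/* | xargs -0 -r sed -i '1i/* Copyright */'

cat "$demo/no notice.c"
```

Because -i makes sed treat each file separately, the address 1 refers to the first line of every file passed by xargs, not just the first one.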

Upvotes: 0

William Pursell

Reputation: 212494

To insert the single line containing the text copyright at line 1 of a file only if it isn't already there, you could do:

sed '1{ /copyright/!i\
copyright
}' input-file

To insert multiple lines:

sed '1{ /copyright/!i\
copyright\
second line
}' input-file

It's tempting to use r to read the copyright from a file, but I cannot figure out how to insert it before line 1 rather than after line 1, e.g.:

sed '1{ /copyright/! { x; r copyright-file
G}}' input-file

Seems like it ought to do the trick, but the text from the copyright-file winds up starting at line 2.
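Since the question also mentions awk, one way to get the file's text in front of line 1 is to let awk print it before the first input line. This is a sketch, not a sed solution: it reuses the copyright-file and input-file names from above, writes to a hypothetical output-file, and the sample contents are invented.

```shell
demo=$(mktemp -d)
printf '/*\n * copyright 1234 XXXNAME\n */\n' > "$demo/copyright-file"
printf 'int main(void) { return 0; }\n' > "$demo/input-file"

# On the first input line, if it lacks "copyright", print the header file
# line by line before the line itself; every input line is then printed.
awk -v hdr="$demo/copyright-file" '
  NR == 1 && !/copyright/ {
    while ((getline line < hdr) > 0) print line
    close(hdr)
  }
  { print }
' "$demo/input-file" > "$demo/output-file"

cat "$demo/output-file"
```

Like the sed versions above, this only checks line 1 for the word copyright, and an empty input file produces no output at all because the NR == 1 rule never fires.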

Upvotes: 0
