ev350

Reputation: 439

Performing a recursive find and replace with sed only changes first file

I am trying to recursively search the current directory, performing a sed replace on the first line of each .txt file found.

Running either of these 2 commands, on MacOS:

find . -name "*.txt" -exec sed -i '' '1 s/([^()]*)//g' {} + 
find . -name '*.txt' -print0 | xargs -0 sed -i '' '1 s/([^()]*)//g'

leads to the same result: only the "first" file found has the sed operation performed on it. This appears to be caused by the 1 in sed -i '' '1 s/([^()]*)//g'. The odd part is that although only the first file is touched, the replacement is still applied only to the first line of that file, as it should be.

If I change the command to this

find . -name '*.txt' -print0 | xargs -0 sed -i '' '2 s/([^()]*)//g'

it is still only the first file that is changed, but now the replacement is made on the second line. My question, then, is why this only appears to affect the first file returned by:

find . -name '*.txt' -print0

Edit for Clarification

I should clarify exactly what I mean by 'only the "first" file has the sed operation performed on it' by recreating the problem step by step.

This is the folder hierarchy (note the space in "folder 1"):

.
├── folder\ 1
│   └── test1.txt
├── folder2
│   └── test2.txt
├── folder3
│   └── test3.txt
└── folder4
    └── test4.txt

Each .txt file contains exactly this, and only this, one line:

This should stay (this should go)

When running either of the commands above, it is the file test2.txt that is changed, and the problem is that it is the only file that is changed!

So now, the files contain the following:

test1.txt: This should stay (this should go)

test2.txt: This should stay

test3.txt: This should stay (this should go)

test4.txt: This should stay (this should go)

I believe this is because the first part of the command, for example

find . -name '*.txt' -print0

gives the following (each separated by a \0 null character)

./folder2/test2.txt./folder3/test3.txt./folder4/test4.txt./folder 1/test1.txt

By changing the folder and file names around randomly, it is clear that it is always the first file in the above \0 delimited list that is changed.

So the question remains, what is it about the call to sed that prevents it being called on ALL of the files?

Thanks!

Upvotes: 2

Views: 247

Answers (1)

tshiono

Reputation: 22022

I suppose the question about the 1st command has been answered by Beta, so let me answer the 2nd one.

Try adding the -t (trace) option to xargs to see how the command line is expanded:

find . -name '*.txt' -print0 | xargs -0 -t sed -i '' '1 s/([^()]*)//g'

It will output something like:

sed -i '' 1 s/([^()]*)//g ./test1.txt ./test2.txt ./test3.txt ./test4.txt

By default, xargs executes the specified command (sed in this case) once, with all the arguments read from standard input.
In addition, sed doesn't reset line numbering across multiple input files, so the address 1 matches only the first line of the first file and the s command above is applied to that file only.
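
A minimal sketch of that line-numbering behavior, assuming a POSIX shell and two throwaway files, a.txt and b.txt (illustrative names, not from the question). It leaves out -i so that it behaves the same with BSD and GNU sed:

```shell
# Hypothetical scratch directory with two 2-line files.
demo=$(mktemp -d)
printf 'a1\na2\n' > "$demo/a.txt"
printf 'b1\nb2\n' > "$demo/b.txt"

# One sed process, two files: the inputs are read as one continuous
# stream, so only the first line of a.txt is ever "line 1".
joined=$(sed -n '1p' "$demo/a.txt" "$demo/b.txt")
echo "$joined"    # prints: a1

# The first line of b.txt is line 3 of that combined stream.
third=$(sed -n '3p' "$demo/a.txt" "$demo/b.txt")
echo "$third"     # prints: b1
```

This is why a single sed invocation that receives every file name applies the 1 address only once.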

You can make xargs run sed once per file with the -n 1 option, which passes one file name per invocation:

find . -name '*.txt' -print0 | xargs -0 -n 1 -t sed -i '' '1 s/([^()]*)//g'

Output:

sed -i '' 1 s/([^()]*)//g ./test1.txt
sed -i '' 1 s/([^()]*)//g ./test2.txt
sed -i '' 1 s/([^()]*)//g ./test3.txt
sed -i '' 1 s/([^()]*)//g ./test4.txt

Then sed will work as expected.
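
The per-file approach can be tried safely on a scratch copy of the layout from the question. This is only a sketch under a few assumptions: the directory is a hypothetical temporary one created with mktemp, xargs is given -n 1 (the POSIX option for one argument per invocation), and -i.bak is used instead of -i '' because the attached-suffix form is accepted by both BSD and GNU sed (it leaves a .bak backup next to each file):

```shell
# Hypothetical scratch tree mirroring part of the question's layout.
work=$(mktemp -d)
mkdir -p "$work/folder 1" "$work/folder2" "$work/folder3"
for f in "folder 1/test1.txt" "folder2/test2.txt" "folder3/test3.txt"; do
    printf 'This should stay (this should go)\n' > "$work/$f"
done

# -print0 / -0 keep "folder 1" intact; -n 1 starts one sed per file,
# so line 1 means the first line of each file.
find "$work" -name '*.txt' -print0 |
    xargs -0 -n 1 sed -i.bak '1 s/([^()]*)//g'

# Every file should now have lost its parenthesized part.
cat "$work/folder 1/test1.txt" "$work/folder2/test2.txt" "$work/folder3/test3.txt"
```

Ending the find action with \; instead of +, as in find . -name '*.txt' -exec sed -i '' '1 s/([^()]*)//g' {} \;, likewise runs sed once per file, at the cost of one process per file.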

Upvotes: 2
