kopelkan

Reputation: 1195

sed fails on multiple big files (37kB each)

I've tried this on multiple small files and everything works fine. But when I run sed on multiple files of 37kB each, only one file is processed completely, while the others end up badly mangled.

Below is the code I'm running:

find ./ -type f -name '*.html' | xargs sed -i 's/<title>/sblmtitle\n<title>/g' &&
find ./ -type f -name '*.html' | xargs sed -i '1,/sblmtitle/d' &&
find ./ -type f -name '*.html' | xargs sed -i 's/<div class="entry entry-cont"/\n<div class="entry entry-cont"/g' &&
find ./ -type f -name '*.html' | xargs sed -i -n '/<div class="entry entry-cont"/q;p' &&
find ./ -type f -name '*.html' | xargs sed -i 's/<\/title>/<\/title>\nslpstitle/g' &&
find ./ -type f -name '*.html' | xargs sed -i 's/<h1><a href="/sblmurl\n<link>/g' &&
find ./ -type f -name '*.html' | xargs sed -i '/slpstitle/,/sblmurl/d' &&
find ./ -type f -name '*.html' | xargs sed -i '/<link>/s/">/<\/link>\nslpsurl/g' &&
find ./ -type f -name '*.html' | xargs sed -i 's/<div id="down" class="entry entry-cont">/sblmkonten\n<div id="down" class="entry entry-cont">\ndeldlmkonten/g' &&
find ./ -type f -name '*.html' | xargs sed -i '/slpsurl/,/sblmkonten/d' &&
find ./ -type f -name '*.html' | xargs sed -i '/deldlmkonten/,/<iframe/d' &&
find ./ -type f -name '*.html' | xargs sed -i 's/<div id="down" class="entry entry-cont">/<description>/g' &&
find ./ -type f -name '*.html' | xargs sed -i '$s/$/<\/description>/' &&
find ./ -type f -name '*.html' | xargs sed -i 's%​%%g' &&
find ./ -type f -name '*.html' | xargs sed -i '/^$/d'

Is there anything I'm missing?

Upvotes: 2

Views: 287

Answers (1)

anubhava

Reputation: 785128

I would say that this is pretty inefficient. You are finding the same set of *.html files every time and running a single sed command on them. Why don't you combine the multiple sed commands into one big sed command, like this:

sed -e 's/<title>/sblmtitle\n<title>/g' -e '1,/sblmtitle/d' ....

And do all the processing in a single find command like this:

find ./ -type f -name '*.html' | xargs sed -i.bak ....
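
As a rough sketch of what that could look like, here is a small self-contained demo. The scratch directory and the two sample files are made up for illustration; the three expressions are taken from the end of the question's pipeline. It also swaps the `find | xargs` pipe for `find -exec ... {} \;`, which is my substitution rather than something from the question: it starts one sed per file, so addresses like `$` and commands like `q` only ever act on that one file. Note that passes which rely on a newline inserted by an earlier pass (such as the `sblmtitle` steps) may behave differently once merged into one invocation, so those might still need to stay separate.

```shell
# Hypothetical scratch directory with two sample files (illustration only).
workdir=$(mktemp -d)
printf '<div id="down" class="entry entry-cont">\n\nfirst body\n'  > "$workdir/a.html"
printf '<div id="down" class="entry entry-cont">\n\nsecond body\n' > "$workdir/b.html"

# One find, one sed process per file, several -e expressions applied in order.
# -i.bak edits in place and keeps each original as <name>.bak.
find "$workdir" -type f -name '*.html' -exec sed -i.bak \
    -e 's/<div id="down" class="entry entry-cont">/<description>/g' \
    -e '$s/$/<\/description>/' \
    -e '/^$/d' {} \;

# Show the edited files.
cat "$workdir/a.html" "$workdir/b.html"
```

The `.bak` copies make it easy to compare each file before and after, and to restore the originals if an expression turns out to be wrong.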

Upvotes: 2
