Reputation: 4653
I am looking at finding this pattern
<!-- Footer part at bottom of page-->
<div id="footer">
<div class="row col-md-2 col-md-offset-5">
<p class="text-muted">© 2014. Core Team</p>
</div>
<div id="downloadlinks">
<!-- downloadlinks go here-->
</div>
</div>
and replacing it with this pattern for a number of .html files
<!-- Footer part at bottom of page-->
<div id="footer">
<div class="row col-md-2 col-md-offset-5">
<?php
$year = date("Y");
echo "<p class='text-muted'>© $year. Core Team</p>";
?>
</div>
<div id="downloadlinks">
<!-- downloadlinks go here-->
</div>
</div>
Note the difference is that this
<p class="text-muted">© 2014. Core Team</p>
is replaced with
<?php
$year = date("Y");
echo "<p class='text-muted'>© $year. Core Team</p>";
?>
I was looking at doing it with sed, but after an initial attempt my difficulty is knowing which characters I might or might not have to escape. I would also like the tabs and newlines in the PHP code to appear exactly as shown here.
There are a number of files to change, so I would like to automate it, although it might be quicker to just do it manually (copy and paste). Maybe sed is the wrong approach in this instance. Can someone kindly point me in the right direction? At this stage I am open to other languages (e.g. PHP, Python, bash) to find a solution.
I would then plan to rename each .html file to .php with the following:
for i in *.html; do mv "$i" "${i%.*}.php"; done;
Based on the awk answer below, I can get it to work under this version:
$ awk -Wversion 2>/dev/null || awk --version
GNU Awk 4.1.1, API: 1.1 (GNU MPFR 3.1.2, GNU MP 6.0.0)
Copyright (C) 1989, 1991-2014 Free Software Foundation.
However, on this version I get different output. It seems to print out all 3 files: old, new and file. Is this easily rectified in this version?
root@4461f768e343:/github/find_pattern# awk -Wversion 2>/dev/null || awk --version
mawk 1.3.3 Nov 1996, Copyright (C) Michael D. Brennan
root@4461f768e343:/github/find_pattern# awk -v RS='^$' -v ORS= 'ARGIND==1{old=$0;next} ARGIND==2{new=$0;next} s=index($0,old){ $0 = substr($0,1,s-1) new substr($0,s+length(old))} 1' old new file
<!-- Footer part at bottom of page-->
<div id="footer">
<div class="row col-md-2 col-md-offset-5">
<p class="text-muted">© 2014. Core Team</p>
</div>
<div id="downloadlinks">
<!-- downloadlinks go here-->
</div>
</div><!-- Footer part at bottom of page-->
<div id="footer">
<div class="row col-md-2 col-md-offset-5">
<?php
$year = date("Y");
echo "<p class='text-muted'>© $year. Core Team</p>";
?>
</div>
<div id="downloadlinks">
<!-- downloadlinks go here-->
</div>
</div>some pile of text
or other
<!-- Footer part at bottom of page-->
<div id="footer">
<div class="row col-md-2 col-md-offset-5">
<p class="text-muted">© 2014. Core Team</p>
</div>
<div id="downloadlinks">
<!-- downloadlinks go here-->
</div>
</div>
and more maybe.root@4461f768e343:/github/find_pattern#
Upvotes: 0
Views: 75
Reputation: 203995
sed is for simple substitutions on individual lines so your task is certainly not a job for sed. You could use awk if your files are all that well formatted:
$ cat old
<!-- Footer part at bottom of page-->
<div id="footer">
<div class="row col-md-2 col-md-offset-5">
<p class="text-muted">© 2014. Core Team</p>
</div>
<div id="downloadlinks">
<!-- downloadlinks go here-->
</div>
</div>
$ cat new
<!-- Footer part at bottom of page-->
<div id="footer">
<div class="row col-md-2 col-md-offset-5">
<?php
$year = date("Y");
echo "<p class='text-muted'>© $year. Core Team</p>";
?>
</div>
<div id="downloadlinks">
<!-- downloadlinks go here-->
</div>
</div>
$ cat file
some pile of text
or other
<!-- Footer part at bottom of page-->
<div id="footer">
<div class="row col-md-2 col-md-offset-5">
<p class="text-muted">© 2014. Core Team</p>
</div>
<div id="downloadlinks">
<!-- downloadlinks go here-->
</div>
</div>
and more maybe.
$ awk -v RS='^$' -v ORS= 'ARGIND==1{old=$0;next} ARGIND==2{new=$0;next} s=index($0,old){ $0 = substr($0,1,s-1) new substr($0,s+length(old))} 1' old new file
some pile of text
or other
<!-- Footer part at bottom of page-->
<div id="footer">
<div class="row col-md-2 col-md-offset-5">
<?php
$year = date("Y");
echo "<p class='text-muted'>© $year. Core Team</p>";
?>
</div>
<div id="downloadlinks">
<!-- downloadlinks go here-->
</div>
</div>
and more maybe.
The above uses GNU awk for multi-char RS and ARGIND. If you want to do it for many files you could use:
find . -type f -name '*.php' -exec awk -i inplace -v RS='^$' -v ORS= 'ARGIND==1{old=$0;print;next} ARGIND==2{new=$0;print;next} s=index($0,old){ $0 = substr($0,1,s-1) new substr($0,s+length(old))} 1' old new {} \;
or similar.
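For awk builds without ARGIND (mawk, for example), the same idea can be sketched in Python instead: read each file whole, find the first literal occurrence of the old block, and splice in the new one. Nothing needs escaping because no regular expressions are involved. The function names below are hypothetical, and UTF-8 encoding is an assumption (the files contain a © character); treat this as a rough sketch rather than a tested tool.

```python
from pathlib import Path


def replace_first_block(text: str, old: str, new: str) -> str:
    """Replace the first literal occurrence of `old` in `text` with `new`."""
    start = text.find(old)
    if start == -1:
        return text  # block not present: leave the text untouched
    return text[:start] + new + text[start + len(old):]


def rewrite_file(path: Path, old: str, new: str) -> bool:
    """Rewrite `path` in place; return True only if its content changed."""
    # Assumption: the files are UTF-8 encoded.
    text = path.read_text(encoding="utf-8")
    updated = replace_first_block(text, old, new)
    if updated == text:
        return False
    path.write_text(updated, encoding="utf-8")
    return True


def rewrite_tree(root: str, old_file: str, new_file: str) -> list:
    """Apply the replacement to every .html file under `root`."""
    old = Path(old_file).read_text(encoding="utf-8")
    new = Path(new_file).read_text(encoding="utf-8")
    changed = []
    for html_path in sorted(Path(root).rglob("*.html")):
        if rewrite_file(html_path, old, new):
            changed.append(html_path)
    return changed
```

Calling rewrite_tree(".", "old", "new") would use the same old and new files as the awk example. Like index() in the awk version, it only matches when the block in each file is byte-for-byte identical to old, including whitespace and the trailing newline.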
Upvotes: 2
Reputation: 1046
You can use Python's str.replace for this:
html_files = ['a.html', ...]
copyright = '<p class="text-muted">© 2014. Core Team</p>'
new_copyright = """ <?php
$year = date("Y");
echo "<p class='text-muted'>© $year. Core Team</p>";
?>"""

for html_file_path in html_files:
    # Read the whole file so the match is a plain substring test.
    with open(html_file_path) as html_file:
        html = html_file.read()
    if copyright in html:
        # Write the result next to the original, with a .php extension.
        php_file_path = html_file_path.replace('.html', '.php')
        with open(php_file_path, "w") as php_file:
            php = html.replace(copyright, new_copyright)
            php_file.write(php)
Note that this will not overwrite your .html files, which is useful if the script has an error.
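If the list of files is long, the same approach could discover them automatically instead of hard-coding html_files. The sketch below is only an illustration under a few assumptions: the names OLD_LINE, NEW_SNIPPET, php_name and convert_tree are made up for this example, the files are assumed to be UTF-8, and the PHP snippet is written without the original indentation. It uses with_suffix so that only the final extension changes, even if ".html" also appears elsewhere in the path.

```python
from pathlib import Path, PurePosixPath

OLD_LINE = '<p class="text-muted">© 2014. Core Team</p>'
NEW_SNIPPET = """<?php
$year = date("Y");
echo "<p class='text-muted'>© $year. Core Team</p>";
?>"""


def php_name(html_path):
    """Swap only the final .html extension for .php."""
    return html_path.with_suffix(".php")


def convert_tree(root="."):
    """Write a .php copy of every .html file under `root` containing OLD_LINE."""
    written = []
    for html_path in sorted(Path(root).rglob("*.html")):
        # Assumption: the files are UTF-8 encoded.
        html = html_path.read_text(encoding="utf-8")
        if OLD_LINE not in html:
            continue  # nothing to replace, so no .php copy is written
        target = php_name(html_path)
        target.write_text(html.replace(OLD_LINE, NEW_SNIPPET), encoding="utf-8")
        written.append(target)
    return written
```

As with the script above, the original .html files are left in place, so the output can be checked before anything is deleted.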
Upvotes: 2