Reputation: 1494
I am working on a project that converts a few dozen HTML files into text files, and I have composed the replace-regexp formulae that do the job. The question is how to apply all six of them consecutively, and then how to do so for each of the dozens of files in the directory. I've appended my org explanation that includes the regexps, but keep in mind that those aren't the problem; they do their job (after translating the ^J, etc). The question is just how to programmatically apply all six of them to each (HTML) file in the directory.
* 1. Delete all until >General Conference<
\(.*^J\)*.*?General Conference
* 2. Delete all <p class="copyright"> and after
^.*<p class="copy\(.*^J\)*
* 3. Strip all tags
\(<.*?>\)*
* 4. Remove whitespace lines
^\s-*^J
* 5. Remove ugly numeric identifier
^\s-*[0-9].*^J
* 6. Remove amp
&amp; -> &
Upvotes: 4
Views: 311
Reputation: 4804
It seems you are only one step away: write a function and apply it to a list of files.
Here's a draft to start from:
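(defun my-replacements ()
  "Apply the regexp replacements to the whole buffer."
  (interactive "*")
  (save-restriction
    (widen)
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward "FIRST-REGEXP" nil t 1)
        (replace-match "FIRST-REPLACEMENT"))
      ;; Further goto-char/while/replace-match groups go here.
      )))
Repeat the goto-char, while, and replace-match lines, inside save-excursion, once per remaining regexp until all the forms are covered.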
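One way to avoid copying those lines six times is to keep the regexps in a list and loop over it. The following is only a minimal sketch: the names my-replacement-pairs and my-apply-replacements are hypothetical, and only two of the six patterns from the question are filled in (with ^J written as \n). It also skips empty matches, on the assumption that a pattern such as the tag-stripping one can match the empty string and would otherwise loop forever.

```elisp
;; Hypothetical names; only steps 4 and 6 from the question are filled in.
(defvar my-replacement-pairs
  '(("^\\s-*\n" . "")   ; step 4: remove whitespace-only lines
    ("&amp;" . "&"))    ; step 6: decode the ampersand entity
  "List of (REGEXP . REPLACEMENT) pairs, applied in order.")

(defun my-apply-replacements (pairs)
  "Apply each (REGEXP . REPLACEMENT) in PAIRS to the whole current buffer."
  (save-restriction
    (widen)
    (save-excursion
      (dolist (pair pairs)
        ;; Every pair starts again from the top of the buffer.
        (goto-char (point-min))
        (while (and (not (eobp))
                    (re-search-forward (car pair) nil t))
          (if (= (match-beginning 0) (match-end 0))
              ;; Empty match: step over one character instead of replacing,
              ;; so the loop always makes progress.
              (unless (eobp) (forward-char 1))
            ;; FIXEDCASE is t so the replacement text is inserted as written.
            (replace-match (cdr pair) t)))))))
```

Calling (my-apply-replacements my-replacement-pairs) in a buffer then runs every pair in sequence; the remaining patterns would be added to the list in the order they should be applied.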
Upvotes: 0
Reputation: 34324
In a dired buffer, use m (dired-mark) to mark each file individually, or some other mechanism in the Mark menu in the menu bar, like * . html RET (dired-mark-extension) to mark all files with an html extension.
Then run dired-do-query-replace-regexp to replace any examples of regex with nothing. You can use Ωmega's regex for this.
Upvotes: 4
Reputation: 17707
It wouldn't be hard to do this programmatically. But the idiomatic Emacs solution is to record 2 keyboard macros.
1. Perform each of your regexp replacements with replace-regexp in a single buffer.
2. In a dired buffer,
You would then run (2) with an absurd repeat count, C-u 1000 or something.
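For comparison, the programmatic route might look roughly like the sketch below. It is an assumption-laden illustration, not a replacement for the macros: my-process-html-files is a hypothetical name, FUNCTION stands for whatever command performs the replacements in the current buffer, and writing the result to a .txt file beside each source file is a guess at the desired output.

```elisp
(defun my-process-html-files (directory function)
  "Call FUNCTION in a buffer holding each .html file in DIRECTORY.
Each result is written next to its source with a .txt extension."
  ;; FULL is t so each entry is an absolute file name.
  (dolist (file (directory-files directory t "\\.html\\'"))
    (with-temp-buffer
      (insert-file-contents file)
      ;; FUNCTION is expected to edit the current buffer in place.
      (funcall function)
      (write-region (point-min) (point-max)
                    (concat (file-name-sans-extension file) ".txt")))))
```

It would be called with a directory and a function of no arguments, for example a lambda that runs the regexp replacements; the original HTML files are left untouched.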
Upvotes: 1