WorldsEndless

Reputation: 1494

Applying a set of consecutive regular expressions to multiple files (emacs)

I am working on a project that converts a few dozen HTML files into text files, and I have composed the replace-regexp expressions that do the job. I've appended my Org notes, which include the regexps, but keep in mind that those aren't the problem; they do their job (after translating the ^J, etc.). The question is how to programmatically apply all six of them, consecutively, to each HTML file in the directory.

* 1. Delete all until >General Conference<
\(.*^J\)*.*?General Conference
* 2. Delete all <p class="copyright"> and after
^.*<p class="copy\(.*^J\)*
* 3. Strip all tags
\(<.*?>\)*
* 4. Remove whitespace lines
^\s-*^J
* 5. Remove ugly numeric identifier
^\s-*[0-9].*^J
* 6. Remove amp 
&amp; -> &

Upvotes: 4

Views: 311

Answers (3)

Andreas Röhler

Reputation: 4804

It seems you are just one step away: write a function and apply it to a list of files.

Here's a draft to start from:

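(defun my-replacements ()
  "Apply the regexp replacements to the whole current buffer."
  (interactive "*")
  (save-restriction
    (widen)
    (save-excursion
      ;; One goto-char/while block per regexp, in order.
      (goto-char (point-min))
      (while (re-search-forward "FIRST-REGEXP" nil t 1)
        (replace-match "FIRST-REPLACEMENT")))))

Repeat the goto-char/while block inside save-excursion for each remaining regexp, until all the forms are covered.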
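One way to avoid repeating the block by hand is to keep the patterns in a list and loop over it. The following is only a sketch under some assumptions: every name in it (my-html-replacements, my-apply-replacements, my-convert-html-directory) is hypothetical, the patterns are transcribed from the question with ^J written as \n, and the output is written to a .txt file next to each .html file. Pattern 3 uses + where the question has *, because a regexp that can match the empty string would make this bare while loop spin forever.

```elisp
;; Hypothetical names throughout. Patterns are transcribed from the
;; question, with ^J written as \n and backslashes doubled for Lisp strings.
(defvar my-html-replacements
  '(("\\(.*\n\\)*.*?General Conference" . "") ; 1. delete up to the heading
    ("^.*<p class=\"copy\\(.*\n\\)*" . "")    ; 2. delete copyright and after
    ("\\(<.*?>\\)+" . "")                     ; 3. strip tags (+ instead of *)
    ("^\\s-*\n" . "")                         ; 4. remove whitespace-only lines
    ("^\\s-*[0-9].*\n" . "")                  ; 5. remove numeric identifier lines
    ("&amp;" . "&"))                          ; 6. unescape ampersands
  "Ordered list of (REGEXP . REPLACEMENT) pairs.")

(defun my-apply-replacements (pairs)
  "Apply each (REGEXP . REPLACEMENT) in PAIRS to the current buffer, in order."
  (save-excursion
    (dolist (pair pairs)
      (goto-char (point-min))
      (while (re-search-forward (car pair) nil t)
        ;; FIXEDCASE and LITERAL are both t: insert the replacement verbatim.
        (replace-match (cdr pair) t t)))))

(defun my-convert-html-directory (dir)
  "Write a cleaned .txt file next to every .html file in DIR."
  (dolist (file (directory-files dir t "\\.html\\'"))
    (with-temp-buffer
      (insert-file-contents file)
      (my-apply-replacements my-html-replacements)
      (write-region (point-min) (point-max)
                    (concat (file-name-sans-extension file) ".txt")))))
```

After evaluating these definitions, something like M-: (my-convert-html-directory "/path/to/html/") RET would process the whole directory; the path here is a placeholder. The original .html files are left untouched, so a bad pattern can be fixed and the conversion rerun.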

Upvotes: 0

Michael Hoffman

Reputation: 34324

  1. Open the directory with Dired: C-x d directory RET
  2. Mark the files you want to change, either by pressing m (dired-mark) to mark each one individually, or by some other mechanism in the Mark menu in the menu bar, like * . html RET (dired-mark-extension) to mark all files with an html extension.
  3. Q regex RET RET (dired-do-query-replace-regexp) to replace every match of regex with nothing. You can use Ωmega's regex for this.
  4. You can then either replace individual matches with SPC, or replace all of them without further questions with !.
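The Dired steps above handle one regexp per pass. A non-interactive variant that applies several patterns to the marked files could look like the sketch below. It is an assumption-laden illustration, not part of Dired itself: my-dired-apply-replacements is a hypothetical name, replacements are inserted literally, each file is overwritten in place, and the patterns must not match the empty string or the loop will never finish.

```elisp
(require 'dired)

(defun my-dired-apply-replacements (pairs)
  "Apply PAIRS, a list of (REGEXP . REPLACEMENT), to each marked Dired file.
Call this from a Dired buffer.  Every file is rewritten in place."
  (dolist (file (dired-get-marked-files))
    (with-temp-buffer
      (insert-file-contents file)
      (dolist (pair pairs)
        (goto-char (point-min))
        (while (re-search-forward (car pair) nil t)
          ;; Literal replacement: no \& or \N expansion, no case conversion.
          (replace-match (cdr pair) t t)))
      (write-region (point-min) (point-max) file))))
```

With the files marked, this could be run from the Dired buffer with M-: (my-dired-apply-replacements '(("&amp;" . "&"))) RET. Because the files are overwritten, it is safer to try it on copies first.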

Upvotes: 4

event_jr

Reputation: 17707

It wouldn't be hard to do this programmatically. But the idiomatic Emacs solution is to record two keyboard macros.

  1. Perform each of your regexp replacements with replace-regexp in a single buffer.

  2. In a dired buffer,

    1. move to the next HTML file (with C-s)
    2. open it in the other window
    3. run macro (1) in the other window, then switch back to the Dired buffer.

You would then run macro (2) with an absurdly large repeat count, C-u 1000 or something.

Upvotes: 1
