aquadhere

Reputation: 1

Split a batch of text files using pattern

I have a directory of almost a thousand HTML files. Each file needs to be split into multiple text files at a recurring pattern (a heading). I am on a Windows machine, using the GnuWin32 tools.

I've found a way to do this, for a single file:

csplit 1.html -b "%04d.txt" /"Words in heading"/ {*}

But I don't know how to repeat this operation over the entire set of HTML files. This:

csplit *.html -b "%04d.txt" /"Words in heading"/ {*}

doesn't work, and neither does this:

for %i in (*.html) do csplit *.html -b "%04d.txt" /"Words in heading"/ {*}

Both result in an invalid pattern error. Help would be much appreciated!

Upvotes: 0

Views: 2537

Answers (1)

Micah Elliott

Reputation: 10274

The order of options and arguments matters with csplit, and it accepts only a single input file. Its help output shows the expected form:

% csplit --help
Usage: csplit [OPTION]... FILE PATTERN...

I’m surprised your first example works for the single file. It really should be changed to:

% csplit  -b "%04d.txt"  1.html  "/Words in heading/" "{*}"
          ^^^^^^^^^^^^^  ^^^^^^  ^^^^^^^^^^^^^^^^^^^^^^^^^^
            OPTS/ARGS     FILE    PATTERNS

Notice also that I changed your quoting so that it wraps each whole argument. You probably also need to quote the final "{*}".

I’m not sure what shell you’re using, but if that for-loop syntax is appropriate, then the fixed command should work in the loop, provided you pass the loop variable (%i) as the FILE argument instead of *.html.
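
For reference, here is a rough sketch of the same loop in a POSIX-style shell, assuming GNU csplit is on the PATH. One thing the single-file command does not address: every run writes to the same default output names, so pieces from one input would overwrite those from the previous one. The sketch uses -f to give each input its own prefix. The function name split_all and the "<name>_NNNN.txt" naming scheme are my own choices for illustration, not anything csplit requires.

```shell
# Split every .html file in a directory at each line matching a pattern.
# split_all is a hypothetical helper name; the output naming is an assumption.
split_all() {
    dir=$1        # directory holding the .html files
    pattern=$2    # regex for the heading; must not contain an unescaped "/"
    for f in "$dir"/*.html; do
        [ -e "$f" ] || continue       # glob matched nothing: skip
        base=${f%.html}               # dir/1.html -> dir/1
        # -s  suppress the byte counts csplit normally prints
        # -z  drop empty pieces (e.g. when the first line is a heading)
        # -f  per-input prefix, so outputs from different files don't collide
        # -b  printf-style suffix format, as in the original command
        csplit -s -z -f "${base}_" -b "%04d.txt" "$f" "/$pattern/" "{*}"
    done
}
```

Calling split_all . "Words in heading" in the directory of HTML files should leave pieces named like 1_0000.txt, 1_0001.txt, and so on next to each input. In cmd.exe the same idea would apply (a per-file -f prefix built from the loop variable), but I have not tested the quoting there.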

Upvotes: 1
