raininglemons

Reputation: 460

Split large file with awk, syntax error with regex

I have a large file containing many XML documents concatenated together.

I'm trying to split them with the following command:

awk '/<\?xml/{g++} { print $0 > "ipg130101-"g".txt"}' ipg130101.xml

But I keep getting this error back:

 context is
/<\?xml/{g++} { print $0 > >>>  "ipg130101-"g <<< ".txt"}
awk: illegal statement at source line 1

Any help much appreciated!!

Upvotes: 0

Views: 346

Answers (3)

Ed Morton

Reputation: 203807

The problem is that this statement:

print $0 > "ipg130101-"g".txt"

is ambiguous. It can mean:

(print $0 > "ipg130101-" g); ".txt"

or

(print $0 > "ipg130101-"); g ".txt"

or any other variation. For portability you MUST parenthesize whatever's on the right side of output redirection, i.e. explicitly write:

print $0 > ("ipg130101-"g".txt")

You don't need the $0, by the way, this would work fine:

print > ("ipg130101-"g".txt")
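To see the parenthesized form in action, here is a minimal sketch on a made-up two-document sample. The names sample.xml and sample-N.txt are placeholders for illustration, not anything from the question:

```shell
# Build a tiny sample with two concatenated XML documents (made-up content).
printf '%s\n' '<?xml version="1.0"?>' '<a/>' \
              '<?xml version="1.0"?>' '<b/>' > sample.xml

# Parenthesize the whole filename expression so the redirection target
# is parsed the same way by every awk.
awk '/<\?xml/ { g++ } { print > ("sample-" g ".txt") }' sample.xml
```

Each line goes to the file numbered after the most recent <?xml header, so this should leave sample-1.txt and sample-2.txt with one document each.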

Upvotes: 2

Gilles Quénot

Reputation: 185434

One solution is to use gawk explicitly instead of awk (the latter is the default on Mac OS X).

So finally:

gawk '/<\?xml/{g++} { print $0 > "ipg130101-"g".txt"}' ipg130101.xml

Upvotes: 3

raininglemons

Reputation: 460

Found a solution: it looks like the awk on Mac doesn't accept the redirection unless you assign the filename to a variable first.

Splitting a file using AWK on Mac OS X

awk '/<\?xml/{g++} {filename = "ipg130101-"g".txt"; print >filename}' ipg130101.xml
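With a very large input, one further thing may matter: some awk implementations limit how many output files can be open at once, so a split into thousands of files can fail partway through. A hedged variant of the same idea closes each file when the next document starts. The names docs.xml and docs-N.txt are made up for the example:

```shell
# Made-up sample: three concatenated XML documents.
printf '%s\n' '<?xml version="1.0"?>' '<a/>' \
              '<?xml version="1.0"?>' '<b/>' \
              '<?xml version="1.0"?>' '<c/>' > docs.xml

awk '
/<\?xml/ {
    if (filename != "") close(filename)   # done with the previous document
    filename = "docs-" (++g) ".txt"       # next output file name
}
filename != "" { print > filename }       # ignore lines before the first header
' docs.xml
```

Because every output name is used only once, closing a file never causes it to be reopened and truncated later.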

Upvotes: 2
