ashish_k

Reputation: 1581

How to search for values in a .CSV file using the awk command

I have a CDR file (.CSV) which contains around 150 columns and is very large. I'm trying to get the rows where the 31st column has the value "13".

I'm trying the command below:

awk -F',' '$31~/^13/' report_1.csv > report_2.csv

But I'm getting the following error:

awk: record `,1402786,535,1,47432...' has too many fields
 record number 1

Any help?

Upvotes: 0

Views: 2140

Answers (3)

randomir

Reputation: 18697

The limit on the number of fields shouldn't be as low as 150, so I'm guessing you're not parsing your CSV file properly.

In particular, you should not split on just any comma - you should avoid splitting on "," within quoted fields ("like,this").

If you're using GNU awk, proper CSV parsing is pretty simple via FPAT (according to this excellent answer by @Ed Morton):

awk -v FPAT='[^,]*|"[^"]+"' '$31 ~ /^13/' file

or, for an exact match:

awk -v FPAT='[^,]*|"[^"]+"' '$31 == "13"' file

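As a quick illustration (with a made-up input line, not data from the original file), FPAT keeps a quoted field containing a comma as a single field:

$ echo 'a,"b,c",13' | gawk -v FPAT='[^,]*|"[^"]+"' '{print NF, $2, $3}'
3 "b,c" 13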
If you're not using GNU awk, refer to the cited answer for an alternative parsing method; a rough sketch of the same idea is shown below.
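This is my own simplification, not the cited answer verbatim; it assumes there are no escaped quotes or embedded newlines inside fields. The idea is to re-join the pieces of a quoted field that splitting on every comma tore apart:

awk -F',' '{
    # re-join pieces of a quoted field that splitting on "," tore apart
    n = 0
    for (i = 1; i <= NF; i++) {
        f = $i
        while (f ~ /^"/ && f !~ /"$/ && i < NF) { i++; f = f "," $i }
        field[++n] = f
    }
    if (field[31] == "13") print   # use field[31] == "\"13\"" if the value is quoted in the file
}' report_1.csv > report_2.csv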

Upvotes: 1

hek2mgl

Reputation: 158060

Some awk implementations have a limit on the maximum number of fields, mawk for example. You can test this easily by assigning to NF, like this:

$ mawk 'BEGIN{NF=32768}'
mawk: program limit exceeded: maximum number of fields size=32767
        FILENAME="" FNR=0 NR=0

To work around this, you can use GNU awk (gawk), which does not have such an explicit limit.

$ gawk 'BEGIN{NF=32768}'
$ gawk 'BEGIN{NF=1000000}'

It is still limited by the amount of available memory, but that should allow at least millions of fields on a normal PC.

PS: You might need to install gawk, and of course processing such large files might be slow.
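For example, on Debian/Ubuntu-based systems (assuming the apt package manager is available), gawk can be installed with:

$ sudo apt-get install gawk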

Upvotes: 0

Cyrus

Reputation: 88684

I suggest:

awk -F',' '$31 == "13"' report_1.csv > report_2.csv

Upvotes: 1
