wtrdk

Reputation: 141

Find which fields in CSV are over X characters

I have a CSV file that I parse with a self-written bash script. The second column, for example, must not contain more than 50 characters. How can I find the fields that exceed this limit and list them, including their line numbers? And can I trim them to 50 characters?

For example:

100245;this field may not contain more than fifty characters;12;Y

should be shortened to

100245;this field may not contain more than fifty charact;12;Y

Thank you for your help.

Upvotes: 1

Views: 96

Answers (3)

anubhava

Reputation: 785491

You can use:

awk -v len=50 'BEGIN{FS=OFS=";"} length($2)>len {$2=substr($2, 1, len)} 1' file

This selects every line whose second field is longer than len (50) and truncates that field to its first 50 characters with substr. The trailing 1 is an always-true pattern, so every line is printed, changed or not.
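
The question also asks for a list of the offending fields with their line numbers. A minimal sketch of that part, assuming the same ; separator and second column; the temporary file and its two sample rows are made up for illustration:

```shell
# Hypothetical sample data written to a temporary file.
tmp=$(mktemp)
printf '%s\n' \
  '100245;this field may not contain more than fifty characters;12;Y' \
  '100246;short field;7;N' > "$tmp"

# For each line whose second field exceeds len, print
# "line number: field length: field contents".
report=$(awk -F';' -v len=50 \
  'length($2) > len { print NR ": " length($2) ": " $2 }' "$tmp")

printf '%s\n' "$report"
rm -f "$tmp"
```

For the sample above this should report line 1 with a length of 53. Note that length() counts characters or bytes depending on the awk implementation and locale, which only matters for non-ASCII data.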

Upvotes: 2

fedorqui

Reputation: 290025

Use sprintf with a precision of 50:

$ awk 'BEGIN{FS=OFS=";"} {$2=sprintf("%.50s", $2)}1' file
100245;this field may not contain more than fifty charact;12;Y
100245;this field may not ters;12;Y

From awk's guide - Modifiers for printf Formats:

.prec

    %s

        Maximum number of characters from the string that should print. 

Other examples:

$ echo "asdfasdf" | awk '{printf "%.10s\n", $1}'
asdfasdf
$ echo "asdfasdf" | awk '{printf "%.5s\n", $1}'
asdfa

Upvotes: 1

Avinash Raj

Reputation: 174776

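With sed:

$ sed 's/^\([^;]*;[^;]\{50\}\)[^;]*/\1/' file
100245;this field may not contain more than fifty charact;12;Y

The group captures the first field plus the first 50 characters of the second field, and the rest of the second field is dropped. Lines whose second field has fewer than 50 characters do not match and are printed unchanged.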

Upvotes: 0
