Reputation: 8025
I have a CSV file with 3 columns:
id,text,date
123,hi 你好吗?,2016-01-01
246,this is stackoverflow 我需要帮忙,2016-02-01
I want to edit only column 2, removing the English characters and keeping the Chinese ones. The other columns should remain untouched.
Desired output:
id,text,date
123,你好吗?,2016-01-01
246,我需要帮忙,2016-02-01
Is there a better way to do it than this:
cat myfile.csv|cut -d, -f2|sed 's/[a-zA-Z]*//g' > tmp.csv
paste -d, myfile.csv tmp.csv|awk -F, '{OFS=",";print $1,$4,$3}' >tmp2.csv
Upvotes: 4
Views: 197
Reputation: 19733
awk 'NR==1{print;}NR>1{gsub(/[a-zA-Z ]+/,"");print;}' your_file
id,text,date
123,你好吗?,2016-01-01
246,我需要帮忙,2016-02-01
Upvotes: 0
Reputation: 203229
If the script you posted at the bottom of your question works for you then so will this:
awk 'BEGIN{FS=OFS=","} NR>1{gsub(/[a-zA-Z]/,"",$2)} 1' file
You said "characters" though, not "letters", so YMMV.
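If "letters" is what you meant, here is a slightly fuller sketch of the same approach. It also trims the spaces that are left behind once the letters are gone, which the sample output seems to expect. The function name clean_col2 is made up for illustration, and it assumes no field contains an embedded comma.

```shell
# Sketch: strip ASCII letters from column 2 only, then trim leftover spaces.
# clean_col2 is a hypothetical helper name; pass the CSV path as $1.
clean_col2() {
  awk 'BEGIN { FS = OFS = "," }
       NR > 1 {
           gsub(/[a-zA-Z]/, "", $2)   # drop ASCII letters only
           gsub(/^ +| +$/, "", $2)    # trim spaces at either end of the field
       }
       1' "$1"
}
```

Call it as clean_col2 myfile.csv > cleaned.csv. The header line is passed through unchanged because of NR > 1.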
Upvotes: 2
Reputation: 1456
awk -F, '{ s=split($2,t," "); sub($2, t[s]); print }' file
id,text,date
123,你好吗?,2016-01-01
246,我需要帮忙,2016-02-01
Upvotes: 0
Reputation: 12772
awk -F, 'BEGIN {OFS=","} { if (NR>1) {gsub(/[\x00-\x7F]/, "", $2)}; print }' test.txt
NR>1: don't operate on the first line
gsub(/[\x00-\x7F]/, "", $2): get rid of ASCII characters in column 2
Upvotes: 3
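A more portable variant of the ASCII-range idea, as a sketch only: hex escapes such as \x7F are an extension rather than part of POSIX awk, so this version uses the printable-ASCII range from space to tilde under LC_ALL=C instead. The function name strip_ascii_col2 is made up for illustration. Be aware that any ASCII-range approach also removes ASCII punctuation such as "?", which the desired output in the question keeps.

```shell
# Sketch: remove every printable ASCII byte from column 2, leaving other columns alone.
# strip_ascii_col2 is a hypothetical helper name; pass the CSV path as $1.
strip_ascii_col2() {
  # LC_ALL=C makes the range [ -~] byte-based: space (0x20) through tilde (0x7E).
  LC_ALL=C awk 'BEGIN { FS = OFS = "," }
       NR > 1 { gsub(/[ -~]/, "", $2) }   # printable ASCII only, column 2 only
       1' "$1"
}
```

Call it as strip_ascii_col2 test.txt. UTF-8 encoded Chinese characters consist entirely of bytes above 0x7F, so they pass through untouched.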