Reputation: 19
I have a .csv file in which each row contains three key/value elements that I want to split further. The rows in the file look like this:
gene_id "ENSDARG00000104632", gene_version "2", gene_name "RERG"
gene_id "ENSDARG00000104632", gene_version "2", transcript_id "ENSDART00000166186"
gene_id "ENSDARG00000104632", gene_version "2", transcript_id "ENSDART00000166186"
I want to pull the strings inside the double quotes out into their own elements, separated by commas.
Basically, I want it to look like this:
gene_id, ENSDARG00000104632, gene_version, 2, gene_name, RERG
gene_id, ENSDARG00000104632, gene_version, 2, transcript_id, ENSDART00000166186
gene_id, ENSDARG00000104632, gene_version, 2, transcript_id, ENSDART00000166186
I had thought to do it like this:
awk 'BEGIN{OFS=",";FS="""};{print $1,$2,$3,$4,$5,$6}'
However, it seems awk does not recognize " as a delimiter when I write it this way. Does anyone have a recommendation for how to achieve this?
Upvotes: 0
Views: 51
Reputation: 203209
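Treat any run of spaces, double quotes and commas as the field separator. Strip the trailing quote first so it does not produce an empty last field, then assign $1 to itself so awk rebuilds the record with the new OFS:
$ awk -F'[ ",]+' -v OFS=', ' '{sub(/"$/,""); $1=$1} 1' file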
gene_id, ENSDARG00000104632, gene_version, 2, gene_name, RERG
gene_id, ENSDARG00000104632, gene_version, 2, transcript_id, ENSDART00000166186
gene_id, ENSDARG00000104632, gene_version, 2, transcript_id, ENSDART00000166186
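As for the attempt in the question: awk can split on a double quote, but the quote has to be escaped inside the program. In FS=""" awk most likely reads "" as an empty string followed by the start of an unterminated string, so the program fails to parse. Below is a minimal sketch of the quote-as-delimiter approach, assuming keys never contain spaces or commas and every value is quoted. The function name split_on_quotes is made up for illustration.

```shell
# split_on_quotes is a hypothetical helper name, not part of awk.
split_on_quotes() {
    # -F'"' sets the field separator to a single double quote. Inside an
    # awk program the equivalent assignment would be FS = "\"".
    awk -F'"' '{
        out = ""
        # Odd-numbered fields hold the keys (plus stray commas and spaces),
        # even-numbered fields hold the values that were inside the quotes.
        for (i = 1; i < NF; i += 2) {
            key = $i
            gsub(/[, ]/, "", key)   # drop the separators around the key
            out = out (i > 1 ? ", " : "") key ", " $(i + 1)
        }
        print out
    }' "$@"
}

# Usage with one row from the question; a file name also works as an argument.
printf '%s\n' 'gene_id "ENSDARG00000104632", gene_version "2", gene_name "RERG"' | split_on_quotes
# prints: gene_id, ENSDARG00000104632, gene_version, 2, gene_name, RERG
```

Because only the quote is a separator here, a value that contains spaces or commas would stay in one piece, which the [ ",]+ separator above would not guarantee.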
Upvotes: 2