Reputation: 1437
I have an csv file representing bank transactions that looks like this:
"Date","Description","Original Description","Amount","Transaction Type","Category","Account Name","Labels","Notes"
"10/18/2012","Amazon","AMAZON.COM","27.60","debit","Shopping","CHASE COLLEGE","",""
"10/19/2012","Virgin America","VIRGIN AMER","155.90","debit","Air Travel","CREDIT CARD","",""
"10/20/2012","Airport Express","AIR EXP","16.00","credit","Credit Card Payment","CREDIT CARD","",""
I am trying to convert it so that the value in column 4 is +/- depending on the value of column 5. If 5 says "debit" the value of col 4 should be "-", if it says "credit", the value of col 4 should be "+"
So the output would be something like this:
"Date","Description","Original Description","Amount","Transaction Type","Category","Account Name","Labels","Notes"
"10/18/2012","Amazon","AMAZON.COM","-27.60","debit","Shopping","CHASE COLLEGE","",""
"10/19/2012","Virgin America","VIRGIN AMER","-155.90","debit","Air Travel","CREDIT CARD","",""
"10/20/2012","Airport Express","AIR EXP","16.00","credit","Credit Card Payment","CREDIT CARD","",""
What would be the best way to do this? I had considered writing a MATLAB program with If statements to read in the file, but I would like to know if there is a simple way to do it from terminal, such as using AWK or RegEx in Vim!
Upvotes: 1
Views: 1060
Reputation: 378
Can actually be done using VIM regex, but little complicated.
First, we identify the rows containing de word "debit", with a character like k
for example:
%s/\(^.*debit.*$\)/k\1
Second, apply minus sign to the 4th column
%s/^k\(.\{-},.\)\{3}\zs\(.\{-},\)/-\2
Finally, reverse changes to identified rows
%s/^k//
Upvotes: 0
Reputation: 56059
Technically, you could do it with a regex (or at least a pair of them), but awk
is much better suited. In general, awk isn't great with quoted fields, but since every field is quoted and we don't need to work with the first field, we can work around that.
awk 'BEGIN{FS=OFS="\",\""}$5=="credit"{$4="+"$4}$5=="debit"{$4="-"$4}1' file.csv
Explanation
awk '
BEGIN {
# Set the input and output field separators to ",", *with* the quotes.
FS=OFS="\",\""
}
# On every line where field 5 is "credit" ...
$5 == "credit" {
# ... Prepend "+" to the fourth field.
$4="+"$4
}
# On every line where the fifth field is "debit" ...
$5 == "debit" {
# ... Prepend "-" to the fourth field.
$4="-"$4
}
# Print the line
1
' test.in
Upvotes: 3