Abhik Paul
Abhik Paul

Reputation: 49

Reading and Writing CSV file in R

I want to update specific rows in a CSV file that has dates with a data frame that I created in R.

01/04/20, Asset, Position, Price, Mark-to-Market
0, PORTFOLIO, NA, NA, 1000000
1, CASH, NA, NA, 1000000

02/04/20, Asset, Position, Price, Mark-to-Market, Position prior, Transaction, TC spent
0, PORTFOLIO, NA, NA, 999231, NA, NA, NA
1, CASH, NA, NA, 509866, NA, NA, NA
2, FUTURES, 500, 2516, 1258250, 0, 500, 629
3, VXc1, -5931, 47, -279795, 0, -5931, 140
 , Total, Buys:, 1, Sells:, 1, TC spent:, 769

There are approximately 1000+ rows.

However, I am unable to read this CSV file using the following codes. Can anyone help me with this?

df4 <- read.csv("filename.csv")

Further, I have to add two columns (2 and 3) from df3 mentioned below in the rows of df4 that have dates (except the first row). Can anyone help me with this as well?

The code to get df3 is as follows. However, I don't know how to add the rows to df4 selectively in R.

df1 <- read.csv("filename1.csv")
df2 <- read.csv("filename2.csv")
df3 <- cbind(df2[,c(1)], df1[,c(3)], df2[,c(3)])

Upvotes: 0

Views: 56

Answers (1)

r2evans
r2evans

Reputation: 161110

I'm not sure what you need for your second question, but to address the first:

txt <- readLines("filename.csv")
# Warning in readLines("filename.csv") :
#   incomplete final line found on 'filename.csv'
multidf <- by(txt, cumsum(!grepl("\\S", txt)),
              FUN = function(x) read.csv(text = x, strip.white = TRUE))
multidf
# cumsum(!grepl("\\S", txt)): 0
#   X01.04.20     Asset Position Price Mark.to.Market
# 1         0 PORTFOLIO       NA    NA        1000000
# 2         1      CASH       NA    NA        1000000
# --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
# cumsum(!grepl("\\S", txt)): 1
#   X02.04.20     Asset Position Price Mark.to.Market Position.prior Transaction TC.spent
# 1         0 PORTFOLIO     <NA>    NA         999231             NA        <NA>       NA
# 2         1      CASH     <NA>    NA         509866             NA        <NA>       NA
# 3         2   FUTURES      500  2516        1258250              0         500      629
# 4         3      VXc1    -5931    47        -279795              0       -5931      140
# 5        NA     Total    Buys:     1         Sells:              1   TC spent:      769

The multidf object is technically a "by"-class object, but that's really just a glorified list:

str(multidf)
# List of 2
#  $ 0:'data.frame':    2 obs. of  5 variables:
#   ..$ X01.04.20     : int [1:2] 0 1
#   ..$ Asset         : chr [1:2] "PORTFOLIO" "CASH"
#   ..$ Position      : logi [1:2] NA NA
#   ..$ Price         : logi [1:2] NA NA
#   ..$ Mark.to.Market: int [1:2] 1000000 1000000
#  $ 1:'data.frame':    5 obs. of  8 variables:
#   ..$ X02.04.20     : int [1:5] 0 1 2 3 NA
#   ..$ Asset         : chr [1:5] "PORTFOLIO" "CASH" "FUTURES" "VXc1" ...
#   ..$ Position      : chr [1:5] NA NA "500" "-5931" ...
#   ..$ Price         : int [1:5] NA NA 2516 47 1
#   ..$ Mark.to.Market: chr [1:5] "999231" "509866" "1258250" "-279795" ...
#   ..$ Position.prior: int [1:5] NA NA 0 0 1
#   ..$ Transaction   : chr [1:5] NA NA "500" "-5931" ...
#   ..$ TC.spent      : int [1:5] NA NA 629 140 769

From here, you can keep it as a list (can be good, see https://stackoverflow.com/a/24376207/3358227) or try to combine into a single frame (the same link has info for that, too).

Upvotes: 1

Related Questions