Reputation: 17
I have imported a .csv file into R. The file has several columns (I simplified it to 4), and two of these columns, assigned and completed, should be dates. However, they are coming in as "character". I need them to be read as dates.
I have spent several hours searching and trying different things but cannot find a solution. This is what the data looks like (first 3 rows; I have about 5K rows in total):
id assigned completed score
1: 54 11/10/16 11/10/16 0
2: 54 11/21/16 11/21/16 7
3: 54 1/26/17 1/26/17 11
> summary(data_subset)
id assigned completed
Min. : 54 Length:5991 Length:5991
1st Qu.: 1375 Class :character Class :character
Median : 1910 Mode :character Mode :character
Mean : 2145
3rd Qu.: 2199
Max. :10410
score
Min. : 0.00
1st Qu.: 4.00
Median : 7.00
Mean : 8.33
3rd Qu.:12.00
Max. :27.00
NA's :1
I tried lubridate on the assigned column, but it overwrote all the values with NA.
library(lubridate)
data_subset$assigned <- mdy(data_subset$assigned)
id assigned completed score
1: 54 <NA> 11/10/16 0
2: 54 <NA> 11/21/16 7
3: 54 <NA> 1/26/17 11
I am looking for a way to make assigned and completed be read as dates, whether that happens during the .csv import or through data manipulation after the data is already in R.
Upvotes: 0
Views: 685
Reputation: 1979
Approach that converts the columns after importing:
data_subset$assigned <- as.Date(data_subset$assigned, format = '%m/%d/%y') # This uses base R
data_subset$completed <- as.Date(data_subset$completed, format = '%m/%d/%y') # '%m/%d/%y' specifies the format of your dates
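If you would rather convert both columns in one step and confirm that nothing failed to parse, something like the sketch below may help. The toy data frame only mirrors the three rows shown in the question, and date_cols is a name made up for illustration. It also assumes a plain data.frame; the "1:" row labels in the question suggest a data.table, where column assignment is written differently.

```r
# Toy data mirroring the first three rows of the question (illustrative only)
data_subset <- data.frame(
  id        = c(54, 54, 54),
  assigned  = c("11/10/16", "11/21/16", "1/26/17"),
  completed = c("11/10/16", "11/21/16", "1/26/17"),
  score     = c(0, 7, 11),
  stringsAsFactors = FALSE
)

# Columns to convert (hypothetical helper variable)
date_cols <- c("assigned", "completed")

# Apply the same base R conversion to every column listed in date_cols
data_subset[date_cols] <- lapply(data_subset[date_cols], as.Date, format = "%m/%d/%y")

# Count values per column that could not be parsed with this format
colSums(is.na(data_subset[date_cols]))

# The two columns should now be of class Date
str(data_subset)
```

A non-zero count from colSums() would point to rows whose text does not match month/day/two-digit-year, which is worth checking before relying on the converted columns.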
Side note: I have been working on a similar problem, and lubridate has been doing weird things lately. I suspect the reason may be due in part to the version of R: lubridate seems to work better on R 3.3.3 than on r-microsoft 3.3.3, where I have found certain functions from the package missing. Perhaps some underlying function is missing, which causes everything to go to NA. Again, this is just speculation, but maybe it leads to an answer.
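To help narrow this down, a rough check along the following lines could show which R build and lubridate version are in use, and whether base R can parse the values at all. The vector x is only a stand-in for data_subset$assigned, and this does not confirm the speculation above.

```r
# Report the R build in use
cat(R.version.string, "\n")

# Report the lubridate version, but only if the package is installed
if (requireNamespace("lubridate", quietly = TRUE)) {
  cat("lubridate", as.character(packageVersion("lubridate")), "\n")
}

# Stand-in for data_subset$assigned (illustrative values from the question)
x <- c("11/10/16", "11/21/16", "1/26/17")

# Flag values that base R cannot parse with the month/day/year format
bad <- is.na(as.Date(x, format = "%m/%d/%y"))

# Show the offending strings, if any
x[bad]
```

If base R parses every value, the strings themselves are probably fine, and the package or R installation is the more likely place to look.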
Upvotes: 2