Yi Wen Edwin Ang
Yi Wen Edwin Ang

Reputation: 55

R - reshape using gather on dates

I have the below in csv file and being read into R.

    1/10/2018   1/11/2018
AUD 42984.41    42984.41
HKD 214412.85   214412.85
INR 1862292.95  1862292.95
KRW 86.00   86.00
NZD 37144.18    37144.18
PHP 409894.32   409894.32
SGD 241245.08   241245.08
THB 148725.77   148725.77
TWD 2169925.00  2169925.00
USD 873297.93   873297.93


> cash_forecast = read.csv("cash_forecast.csv", header=TRUE, stringsAsFactors=TRUE)
> cash_forecast
     X X1.10.2018 X1.11.2018
1  AUD   42984.41   42984.41
2  HKD  214412.85  214412.85
3  INR 1862292.95 1862292.95
4  KRW      86.00      86.00
5  NZD   37144.18   37144.18
6  PHP  409894.32  409894.32
7  SGD  241245.08  241245.08
8  THB  148725.77  148725.77
9  TWD 2169925.00 2169925.00
10 USD  873297.93  873297.93

I do not know why there is a X being added to each column, i.e. X X1.10.2018 X1.11.2018

I did a reshape using gather and the column, Day has X appended to the dates. Any idea why is this so?

> no_of_col <- ncol(cash_forecast) #no of columns
> long_cash_forecast <- cash_forecast %>% gather(Day, Balance, 2:no_of_col)
> long_cash_forecast
     X        Day    Balance
1  AUD X1.10.2018   42984.41
2  HKD X1.10.2018  214412.85
3  INR X1.10.2018 1862292.95
4  KRW X1.10.2018      86.00
5  NZD X1.10.2018   37144.18
6  PHP X1.10.2018  409894.32
7  SGD X1.10.2018  241245.08
8  THB X1.10.2018  148725.77
9  TWD X1.10.2018 2169925.00
10 USD X1.10.2018  873297.93
11 AUD X1.11.2018   42984.41
12 HKD X1.11.2018  214412.85
13 INR X1.11.2018 1862292.95
14 KRW X1.11.2018      86.00
15 NZD X1.11.2018   37144.18
16 PHP X1.11.2018  409894.32
17 SGD X1.11.2018  241245.08
18 THB X1.11.2018  148725.77
19 TWD X1.11.2018 2169925.00
20 USD X1.11.2018  873297.93

Upvotes: 1

Views: 318

Answers (1)

Gregor Thomas
Gregor Thomas

Reputation: 145755

It's recommended that column names follow the same rules as object names - don't start with a number, no special symbols that could be confused with math operations (-, /, *, +, etc.), no spaces....

When you use read.csv (or any base importing functions), by default it runs a function called make.names to fix up the names. You can stop that by setting check.names = FALSE in read.csv. See the help page ?read.csv for details.

That said, you don't need to skip the names check. Almost surely, your next step should be to convert your column to a proper Date class, which is as easy to do with the X as without the X:

# with the X
as.Date(long_cash_forecast$Day, format = "X%d.%M.%Y")

# without the X, and with / instead of . 
as.Date(long_cash_forecast$Day, format = "%d/%M/%Y")

Upvotes: 2

Related Questions