Peter Chen
Peter Chen

Reputation: 1484

R append new rows based on original rows

I am new in R and I want to add lots of new rows based on original one.
First, I show my part of data:

Type    Time            PM
Ac      2016-01         7710
Ac      2016-02         27500
Ac      2016-03         12840
Ac      2016-04         20060
Ac      2016-05         9430
Ac      2016-06         9830
Ac      2016-07         3870
Ac      2016-08         12040
Ac      2016-09         5590
Ac      2016-10         4300
Ac      2016-11         3870
Ac      2016-12         2580
As      2016-01         315052
As      2016-02         478542
As      2016-03         491472
As      2016-04         662238
As      2016-05         464728
As      2016-06         530424
As      2016-07         443122
As      2016-08         603512
As      2016-09         564438
As      2016-10         440068
As      2016-11         462846
As      2016-12         525394
HW      2015-12           7250

Second, what I want is like following(I subset Type Ac, Time 2016-01 & 2016-03 & 2016-10 as example):

Type    Time    Time2   LapseMonth  PM      What I want
Ac  2016-01     2016-01     0       7710    315.9836066
Ac  2016-01     2016-02     1       0       610.9016393
Ac  2016-01     2016-03     2       0       653.0327869
Ac  2016-01     2016-04     3       0       631.9672131
Ac  2016-01     2016-05     4       0       653.0327869
Ac  2016-01     2016-06     5       0       631.9672131
Ac  2016-01     2016-07     6       0       653.0327869
Ac  2016-01     2016-08     7       0       653.0327869
Ac  2016-01     2016-09     8       0       631.9672131
Ac  2016-01     2016-10     9       0       653.0327869
Ac  2016-01     2016-11     10      0       631.9672131
Ac  2016-01     2016-12     11      0       653.0327869
Ac  2016-03     2016-03     0       12840   526.2295082
Ac  2016-03     2016-04     1       0       1052.459016
Ac  2016-03     2016-05     2       0       1087.540984
Ac  2016-03     2016-06     3       0       1052.459016
Ac  2016-03     2016-07     4       0       1087.541
Ac  2016-03     2016-08     5       0       1087.541
Ac  2016-03     2016-09     6       0       1052.459
Ac  2016-03     2016-10     7       0       1087.541
Ac  2016-03     2016-11     8       0       1052.459
Ac  2016-03     2016-12     9       0       1087.541
Ac  2016-10     2016-10     0       4300    176.2295082
Ac  2016-10     2016-11     1       0       352.4590164
Ac  2016-10     2016-12     2       0       364.2076503

I explain the result I want:
1. Time2 is a new column which is the month equal and after Time.
2. LapseMonth is also a new column which is equal to month(Time2) - month(Time). Thus, it generates as above.
3. Column What I want is complicated.
Look at row1(Time:2016-01, Time2:2016-01, LapseMonth:0, PM:7710, What I want:315.9836066)
When Time = Time1, the formula of What I want is 15/366*PM, which PM=7710
And the following is the day of that month/366*PM (i.e. row2:610.9016393 = 29/366*7710, riow3:653.0327869 = 31/366*7710)

I add each row by row, but it is inefficient when there are many type.
I think maybe I can use for-loop or ifelse to figure out. However, I still have no idea to begin.

I add my code below. Since my data contains multiple excel, so I read it at one time:

library(readxl)
library(data.table)
library(lubridate)
file.list <- dir(path = "filename", pattern='\\.xlsx', full.names = T)
df.list <- lapply(file.list, read_excel)
df <- rbindlist(df.list)
df<-as.data.table(df)
df[,Time:=as.Date(Time,"%Y-%m")]  

The Time format changes to Date but the value all missing. What's going on? [SOLVED]

df$Time <- ymd( paste( df$Time, 01, sep = "-"))

then see @Erdem Akkas's answer and run the code.

There are somethimg very special.
Look at my last obs. of data: HW 2015-12 7250
When Time is 2015, the result I need is different with Time in 2016.
What I want is like following:

Type    Time    Time2   LapseMonth  PM      What I want
HW   2015-12    2015-12      0      7250    297.1311475
HW   2015-12    2016-01      1      0       614.0710382
HW   2015-12    2016-02      2      0       574.4535519
HW   2015-12    2016-03      3      0       614.0710382
HW   2015-12    2016-04      4      0       594.2622951
HW   2015-12    2016-05      5      0       614.0710382
HW   2015-12    2016-06      6      0       594.2622951
HW   2015-12    2016-07      7      0       614.0710382
HW   2015-12    2016-08      8      0       614.0710382
HW   2015-12    2016-09      9      0       594.2622951
HW   2015-12    2016-10      10     0       614.0710382
HW   2015-12    2016-11      11     0       594.2622951
HW   2015-12    2016-12      12     0       316.9398907

Above is when Time in 2015, What I want I actually need.
I know it is very complex, so I explain it in the following.
1. When Time is 2015, no matter what month, LapseMonth is not the same when Time is 2016. LapseMonth must has 0~12 when Time in 2015.(see above)
2. Column What I want is the same calculation when Time in 2016. However, there is one special case. When LapseMonth is equal to 12, the value of What I want is original value minus the value of What I want when LapseMonth=0. Here is weird.
original value in this example is 614.0710382, because Dec has 31 days. Thus, 614.0710382 = PM*31/366 = 7250*31/366.
The value of What I want when LapseMonth is equal to 12 is 614.0710382 - 297.1311475.

Upvotes: 1

Views: 101

Answers (1)

Erdem Akkas
Erdem Akkas

Reputation: 2070

With data.table:

library(data.table)
library(lubridate)

df<-read.table(text="Type    Time            PM
           Ac      2016/1/1    7710
           Ac      2016/2/1    27500
           Ac      2016/3/1    12840
           Ac      2016/4/1    20060
           Ac      2016/5/1    9430
           Ac      2016/6/1    9830
           Ac      2016/7/1    3870
           Ac      2016/8/1    12040
           Ac      2016/9/1    5590
           Ac      2016/10/1   4300
           Ac      2016/11/1   3870
           Ac      2016/12/1   2580
           As      2016/1/1    315052
           As      2016/2/1    478542
           As      2016/3/1    491472
           As      2016/4/1    662238
           As      2016/5/1    464728
           As      2016/6/1    530424
           As      2016/7/1    443122
           As      2016/8/1    603512
           As      2016/9/1    564438
           As      2016/10/1   440068
           As      2016/11/1   462846
           As      2016/12/1   525394",header=T)

dt<-as.data.table(df)
dt[,Time:=as.Date(Time,"%Y/%m/%d")]
dtapp<-dt[rep(1:.N,12-month(Time)+1)]
dtapp[,LapseMonth := seq_len(.N)-1, by =.(Type,Time,PM) ]
dtapp[,Time2:=Time-days(mday(Time)-1)+months(LapseMonth)]
dtapp[,`What I want`:=ifelse(Time==Time2,PM*15/366,PM*days_in_month(Time2)/366)]

dtapp
Type       Time     PM LapseMonth      Time2 What I want
1:   Ac 2016-01-01   7710          0 2016-01-01     315.984
2:   Ac 2016-01-01   7710          1 2016-02-01     610.902
3:   Ac 2016-01-01   7710          2 2016-03-01     653.033
4:   Ac 2016-01-01   7710          3 2016-04-01     631.967
5:   Ac 2016-01-01   7710          4 2016-05-01     653.033
---                                                         
152:   As 2016-10-01 440068          1 2016-11-01   36071.148
153:   As 2016-10-01 440068          2 2016-12-01   37273.519
154:   As 2016-11-01 462846          0 2016-11-01   18969.098
155:   As 2016-11-01 462846          1 2016-12-01   39202.803
156:   As 2016-12-01 525394          0 2016-12-01   21532.541

Upvotes: 2

Related Questions