Creation of timeseries depending on value of vector of df1 with timeseries of column of xts2 in R

Question

I've got two datasets. I want to create a list of new timeseries depending on different character given in df1$col5. Therefore I have another xts 2 , which columnname (header) fits to the character of df1$col5. Each column of xts 2 includes a timeseries. The name of each ts for the list is given in df1/col1. In Addition to that I have to edit each new timeseries with a value from df1/col3.

It sounds rather chaotic, but it's the best way to discribe my problem.

How do I match a character of df1/col5 with one header of xts2?

I have searched quite a while but didn't find any solution.

df1 Looks like this:

        MKZ Elbekilometer MNW91-2010       Fix   Pegel Distanz_Pegel_m
1  43420072         178.2      70.70  70.37300  Torgau      18816.2011
2  43435074         172.0      72.09  71.81900  Torgau      11926.9932
3  43435084         177.0      70.93  70.61100  Torgau      15620.9210
4  43435086         171.8      72.14  71.87100  Torgau      11478.2310
5  44425470         172.2      72.05  71.76767  Torgau      17400.7983
6  44425476         172.2      72.05  71.76767  Torgau      14448.9073
7  45426142         154.0      75.91  75.72300  Torgau      21065.9449
8  45440655         125.4      84.18  83.94567   Riesa      17019.6066
9  46431146         117.0      87.17  87.04500   Riesa      23594.2523
10 46440130         116.0      87.45  87.32600   Riesa      10471.5078

MKZ shall be the name of each new timeseries in my list. The dependig character is given in Pegel, which fits to the columnname of xts2 which Looks like this:

            Schöna   Pirna Dresden  Meißen  Riesa Torgau
1980-01-01 119.583 110.576 105.742  98.522 92.036 79.038
1980-01-02 119.523 111.426 105.652  98.412 91.926 78.908
1980-01-03 119.413 111.316 105.592  98.362 91.876 78.848
1980-01-04 119.123 111.126 105.382  98.222 91.756 78.748
1980-01-05 119.103 110.956 105.282  98.032 91.536 78.488
1980-01-06 118.823 110.786 105.062  97.802 91.396 78.348
1980-01-07 118.783 110.726 104.972  97.722 91.276 78.128
1980-01-08 118.923 110.866 105.102  97.852 91.336 78.088

In the end it shall look like this:

data<-list()

data[[1]]
file[1] <- 43420072

           Torgau   newcol
1980-01-01  79.038  79.038+df1/col3
1980-01-02  78.908  78.908+df1/col3
1980-01-03  78.848  78.848+df1/col3
1980-01-04  78.748  78.748+df1/col3
1980-01-05  78.488  78.488+df1/col3
1980-01-06  78.348  78.348+df1/col3
1980-01-07  78.128  78.128+df1/col3
1980-01-08  78.088  78.088+df1/col3

UPDATE

Basically the answer works fine exept for all timeseries based on Pegel Schöna. My Code Looks like this now:

library(dplyr)
library(tidyr)
library(tibble)

pegeluntereinander <- Pegelganglinien %>% 
  rownames_to_column('date') %>% 
  gather('Pegel','value', -date)

zus <- pegeluntereinander %>%
  left_join(Werte) %>% #WITHOUT running df1 %>% filter(MKZ == 43420072)
  select(date, Pegel, value, MKZ, `MNW91.2010`) %>% #Added MKZ as column for clarity
  mutate(new.col = value + `MNW91.2010`)

files<-Werte$MKZ
workspace<-...
for (i in 1:length(files)){
  filename<-Werte[i,][1,1]
  zeilen<-which(zus$MKZ==filename)
  mkz<-new.df[zeilen, ]
  filename<-as.character(filename)
  dateinamen_csv=paste(filename, "csv", sep = ".")
  speicherpfad_inkl_namen=paste(workspace,dateinamen_csv, sep = "/")
  write.csv(mkz, file = speicherpfad_inkl_namen, row.names = F)
}

to undestand the names of the dataframe:

Werte=df1
Pegelganglinien=df2
pegeluntereinander=df2.new
zus=df.new

CodeNoob · Accepted Answer

1. Load your data

df1 <- read.table(text = '        MKZ Elbekilometer MNW91-2010       Fix   Pegel Distanz_Pegel_m
1  43420072         178.2      70.70  70.37300  Torgau      18816.2011
2  43435074         172.0      72.09  71.81900  Torgau      11926.9932
3  43435084         177.0      70.93  70.61100  Torgau      15620.9210
4  43435086         171.8      72.14  71.87100  Torgau      11478.2310
5  44425470         172.2      72.05  71.76767  Torgau      17400.7983
6  44425476         172.2      72.05  71.76767  Torgau      14448.9073
7  45426142         154.0      75.91  75.72300  Torgau      21065.9449
8  45440655         125.4      84.18  83.94567   Riesa      17019.6066
9  46431146         117.0      87.17  87.04500   Riesa      23594.2523
10 46440130         116.0      87.45  87.32600   Riesa      10471.5078')

df2 <- read.table(text = '            Schöna   Pirna Dresden  Meißen  Riesa Torgau
  1980-01-01 119.583 110.576 105.742  98.522 92.036 79.038
1980-01-02 119.523 111.426 105.652  98.412 91.926 78.908
1980-01-03 119.413 111.316 105.592  98.362 91.876 78.848
1980-01-04 119.123 111.126 105.382  98.222 91.756 78.748
1980-01-05 119.103 110.956 105.282  98.032 91.536 78.488
1980-01-06 118.823 110.786 105.062  97.802 91.396 78.348
1980-01-07 118.783 110.726 104.972  97.722 91.276 78.128
1980-01-08 118.923 110.866 105.102  97.852 91.336 78.088')

We first format the second dataframe such that we can easily join it using left_join from dplyr. Then we filter the first dataframe on the MKZ you gave (43420072) and combine the data:

2. Format the second dataframe (df2) such that we can easily join it

library(dplyr)
library(tidyr)
library(tibble)

df2.new <- df2 %>% 
rownames_to_column('date') %>% 
gather('Pegel','value', -date)

3. Build the new dataframe

# We pick the correct data from the first dataframe
df1 <- df1 %>% filter(MKZ == 43420072)

# Build the new df
new.df <- df2.new %>%
  left_join(df1) %>%
  select(date, Pegel, value, `MNW91.2010`) %>%
  mutate(new.col = value + `MNW91.2010`)

Output
This new dataframe will look how you described (filtered for Plegel == Torgau):

       date  Pegel  value MNW91.2010 new.col
1 1980-01-01 Torgau 79.038       70.7 149.738
2 1980-01-02 Torgau 78.908       70.7 149.608
3 1980-01-03 Torgau 78.848       70.7 149.548
4 1980-01-04 Torgau 78.748       70.7 149.448
5 1980-01-05 Torgau 78.488       70.7 149.188
6 1980-01-06 Torgau 78.348       70.7 149.048
7 1980-01-07 Torgau 78.128       70.7 148.828
8 1980-01-08 Torgau 78.088       70.7 148.788

EDIT
Output without filtering for MKZ

new.df <- df2.new %>%
  left_join(df1) %>% #WITHOUT running df1 %>% filter(MKZ == 43420072)
  select(date, Pegel, value, MKZ, `MNW91.2010`) %>% #Added MKZ as column for clarity
  mutate(new.col = value + `MNW91.2010`)

This produces the following data frame for Torgau:

         date  Pegel  value      MKZ MNW91.2010 new.col
1  1980-01-01 Torgau 79.038 43420072      70.70 149.738
2  1980-01-01 Torgau 79.038 43435074      72.09 151.128
3  1980-01-01 Torgau 79.038 43435084      70.93 149.968
4  1980-01-01 Torgau 79.038 43435086      72.14 151.178
5  1980-01-01 Torgau 79.038 44425470      72.05 151.088
6  1980-01-01 Torgau 79.038 44425476      72.05 151.088
7  1980-01-01 Torgau 79.038 45426142      75.91 154.948
8  1980-01-02 Torgau 78.908 43420072      70.70 149.608
9  1980-01-02 Torgau 78.908 43435074      72.09 150.998
10 1980-01-02 Torgau 78.908 43435084      70.93 149.838
11 1980-01-02 Torgau 78.908 43435086      72.14 151.048
12 1980-01-02 Torgau 78.908 44425470      72.05 150.958
13 1980-01-02 Torgau 78.908 44425476      72.05 150.958
14 1980-01-02 Torgau 78.908 45426142      75.91 154.818
15 1980-01-03 Torgau 78.848 43420072      70.70 149.548
16 1980-01-03 Torgau 78.848 43435074      72.09 150.938
17 1980-01-03 Torgau 78.848 43435084      70.93 149.778
18 1980-01-03 Torgau 78.848 43435086      72.14 150.988
19 1980-01-03 Torgau 78.848 44425470      72.05 150.898
20 1980-01-03 Torgau 78.848 44425476      72.05 150.898
21 1980-01-03 Torgau 78.848 45426142      75.91 154.758
22 1980-01-04 Torgau 78.748 43420072      70.70 149.448
23 1980-01-04 Torgau 78.748 43435074      72.09 150.838
24 1980-01-04 Torgau 78.748 43435084      70.93 149.678
25 1980-01-04 Torgau 78.748 43435086      72.14 150.888
26 1980-01-04 Torgau 78.748 44425470      72.05 150.798
27 1980-01-04 Torgau 78.748 44425476      72.05 150.798
28 1980-01-04 Torgau 78.748 45426142      75.91 154.658
29 1980-01-05 Torgau 78.488 43420072      70.70 149.188
30 1980-01-05 Torgau 78.488 43435074      72.09 150.578
31 1980-01-05 Torgau 78.488 43435084      70.93 149.418
32 1980-01-05 Torgau 78.488 43435086      72.14 150.628
33 1980-01-05 Torgau 78.488 44425470      72.05 150.538
34 1980-01-05 Torgau 78.488 44425476      72.05 150.538
35 1980-01-05 Torgau 78.488 45426142      75.91 154.398
36 1980-01-06 Torgau 78.348 43420072      70.70 149.048
37 1980-01-06 Torgau 78.348 43435074      72.09 150.438
38 1980-01-06 Torgau 78.348 43435084      70.93 149.278
39 1980-01-06 Torgau 78.348 43435086      72.14 150.488
40 1980-01-06 Torgau 78.348 44425470      72.05 150.398
41 1980-01-06 Torgau 78.348 44425476      72.05 150.398
42 1980-01-06 Torgau 78.348 45426142      75.91 154.258
43 1980-01-07 Torgau 78.128 43420072      70.70 148.828
44 1980-01-07 Torgau 78.128 43435074      72.09 150.218
45 1980-01-07 Torgau 78.128 43435084      70.93 149.058
46 1980-01-07 Torgau 78.128 43435086      72.14 150.268
47 1980-01-07 Torgau 78.128 44425470      72.05 150.178
48 1980-01-07 Torgau 78.128 44425476      72.05 150.178
49 1980-01-07 Torgau 78.128 45426142      75.91 154.038
50 1980-01-08 Torgau 78.088 43420072      70.70 148.788
51 1980-01-08 Torgau 78.088 43435074      72.09 150.178
52 1980-01-08 Torgau 78.088 43435084      70.93 149.018
53 1980-01-08 Torgau 78.088 43435086      72.14 150.228
54 1980-01-08 Torgau 78.088 44425470      72.05 150.138
55 1980-01-08 Torgau 78.088 44425476      72.05 150.138
56 1980-01-08 Torgau 78.088 45426142      75.91 153.998

Edit to write the data to files

I gues the writing can be done simpler like so:

write.mkz  = function(df, workspace) {
  write.csv(df,paste0(workspace,'/',unique(df$MKZ),".csv"), row.names = F)
  return(df)
}

workspace = ...
new.df %>% 
  group_by(MKZ) %>% 
  do(write.mkz(., workspace))

Creation of timeseries depending on value of vector of df1 with timeseries of column of xts2 in R

Answers (1)

Related Questions