How to match data with two df in double loop in R?

Question

I am having trouble matching data by the following key:

The first data frame contains the columns ID and Distance. As below.

ID	Distance
A	30
B	50
C	300
D	120
E	50

The second data frame contains numbers that change in years (Year) and are different from the data included in Distance column of the first data frame. As below.

Year	30	50	120	300
2020	0.12	0.20	0.31	0.45
2021	0.14	0.23	0.33	0.46
2022	0.16	0.26	0.35	0.47
2023	0.18	0.29	0.37	0.48
2024	0.20	0.32	0.39	0.49

My question is. How do effectively assign these data in the third empty data frame, which contains different IDs characterized by different Distance and years for which the rate from Distance is varying. The third final df with examples of expected data is above. I am tryfing to use double loop but i am faild. Basically I am newbie in R and programming.

Year	A	B	C	D	E
2020		ex. 0.20
2021			ex. 0.33
2022					ex. 0,26
2023	ex. 0.18
2024				ex. 0.39

Many thanks for any suggestions and solutions.

Gregor Thomas · Accepted Answer

If all of the DF1$Distance values are present in the DF2 column names (and vice versa), then this should work:

# select DF2 columns by the DF1 distance values
result = DF2[c("Year", as.character(DF1$Distance))] 
# change the names to the ID values
names(result)[-1] = DF1$ID)

In the small dput() sample, however, there are some Distance values that are not columns in DF2. Just in case this could happen in your real data, here is a slightly more complicated solution where the matching columns are created in the result and the missing columns are added as NA values:

DF1$in_df2 = DF1$Distance %in% names(DF2)

result = DF2[c("Year", as.character(DF1$Distance[DF1$in_df2]))]
names(result)[-1] =  DF1$ID[DF1$in_df2]
result[DF1$ID[!DF1$in_df2]] = NA
result
#   Year    A    B    E  C  D
# 1 2017 0.40 0.54 0.54 NA NA
# 2 2018 0.42 0.56 0.56 NA NA
# 3 2019 0.44 0.58 0.58 NA NA
# 4 2020 0.46 0.60 0.60 NA NA
# 5 2021 0.48 0.62 0.62 NA NA

Using this sample data:

DF1 = read.table(text = 'ID     Distance
A   30
B   50
C   300
D   120
E   50', header = T)

DF2 = structure(list(Year = c(2017, 2018, 2019, 2020, 2021), `20` = c(0.26,
 0.28, 0.30, 0.32, 0.34), `30` = c(0.4, 0.42, 0.44, 0.46, 0.48), `50` = 
c(0.54, 0.56, 0.58, 0.6, 0.62), `70` = c(0.67, 0.68, 0.69, 0.70, 0.71 )), 
row.names = c(NA, 5L), class = "data.frame")

How to match data with two df in double loop in R?

Answers (1)

Related Questions