Reputation: 59
I have the following data.
name x1 x2 x3 x4
1 V1_3 1 0 999 999
2 V2_3 1.12 0.044 25.4 0
3 V3_3 0.917 0.045 20.4 0
4 V1_15 1 0 999 999
5 V2_15 1.07 0.036 29.8 0
6 V3_15 0.867 0.039 22.5 0
7 V1_25 1 0 999 999
8 V2_25 1.07 0.034 31.1 0
9 V3_25 0.917 0.037 24.6 0
10 V1_35 1 0 999 999
11 V2_35 1.05 0.034 31.2 0
12 V3_35 0.994 0.037 26.6 0
13 V1_47 1 0 999 999
14 V2_47 1.03 0.031 33.6 0
15 V3_47 0.937 0.034 27.4 0
16 V1_57 1 0 999 999
17 V2_57 1.13 0.036 31.9 0
18 V3_57 1.03 0.037 28.1 0
I want to convert this data to the following data. Can someone give me some suggestion, please?
name est_3 est_15 est_25 est_35 est_47 est_57
1 V2 1.12 1.07 1.07 1.05 1.03 1.13
2 V3 0.917 0.867 0.917 0.994 0.937 1.03
Upvotes: 0
Views: 87
Reputation: 23574
Here is one approach for you. Your data is called mydf
here. First, you want to choose necessary columns (i.e., name
and x1
) using select()
. Then, you want to subset rows using filter()
. You want to grab rows that begin with V2
or V3
in strings. grepl()
checks if each string has the pattern. Then, you want to split the column, name
and create two columns (i.e., name
and est
). Finally, you want to convert the data to a long-format data using pivot_wider()
.
library(dplyr)
library(tidyr)
select(mydf, name:x1) %>%
filter(grepl(x = name, pattern = "^V[2|3]")) %>%
separate(col = name, into = c("name", "est"), sep = "_") %>%
pivot_wider(names_from = "est",values_from = "x1", names_prefix = "est_")
# name est_3 est_15 est_25 est_35 est_47 est_57
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 V2 1.12 1.07 1.07 1.05 1.03 1.13
#2 V3 0.917 0.867 0.917 0.994 0.937 1.03
For your reference, when you ask questions, you want to provide a minimal sample data and code. If you can do that, SO users can easily help you out. Please read this question.
DATA
mydf <- structure(list(name = c("V1_3", "V2_3", "V3_3", "V1_15", "V2_15",
"V3_15", "V1_25", "V2_25", "V3_25", "V1_35", "V2_35", "V3_35",
"V1_47", "V2_47", "V3_47", "V1_57", "V2_57", "V3_57"), x1 = c(1,
1.122, 0.917, 1, 1.069, 0.867, 1, 1.066, 0.917, 1, 1.048, 0.994,
1, 1.03, 0.937, 1, 1.133, 1.032), x2 = c(0, 0.044, 0.045, 0,
0.036, 0.039, 0, 0.034, 0.037, 0, 0.034, 0.037, 0, 0.031, 0.034,
0, 0.036, 0.037), x3 = c(999, 25.446, 20.385, 999, 29.751, 22.478,
999, 31.134, 24.565, 999, 31.18, 26.587, 999, 33.637, 27.405,
999, 31.883, 28.081), x4 = c(999, 0, 0, 999, 0, 0, 999, 0, 0,
999, 0, 0, 999, 0, 0, 999, 0, 0)), row.names = c(NA, -18L), class = c("tbl_df",
"tbl", "data.frame"))
Upvotes: 1