I have this data table with two columns: gender - male or female score - between 0 and 100 I want create a new table with 2 columns: male female and the lines are the score how can I do this in R?

Lilach

Reputation: 11

How to create new table in R from an existing table in R

I have this data table with two columns:

gender - male or female
score - between 0 and 100

I want create a new table with 2 columns:

male
female and the lines are the score

how can I do this in R?

enter image description here

Upvotes: 0

Answers (2)

akrun

Reputation: 887118

We can use unstack in base R

unstack(df1, score ~ gender)
  female male
1     90  100
2     98   80
3    100   75

data

df1 <- structure(list(score = c(100L, 90L, 98L, 80L, 75L, 100L), gender = c("male", 
"female", "female", "male", "male", "female")), class = "data.frame", row.names = c(NA, 
-6L))

Upvotes: 2

r2evans

Reputation: 160447

Up front: the rn column I add is to provide an "id" of sorts, where the results will have only one row per "id". There might be ways around it, but it really simplifies things, both in execution and in visualizing what is happening (when you look at the long and wide versions with rn still present). I remove the column in both answers ([,-1] and %>% select(-rn)) since it's just a transient column.

base R

dat$rn <- ave(seq_len(nrow(dat)), dat$gender, FUN = seq_along)
reshape(dat, timevar = "gender", idvar = "rn", direction = "wide", v.names = "score")[,-1]
#   score.male score.female
# 1        100           90
# 3         80           98
# 5         75          100

dplyr

library(dplyr)
library(tidyr) # pivot_wider
dat %>%
  group_by(gender) %>%
  mutate(rn = row_number()) %>%
  pivot_wider(rn, names_from = gender, values_from = score) %>%
  select(-rn)
# # A tibble: 3 x 2
#    male female
#   <int>  <int>
# 1   100     90
# 2    80     98
# 3    75    100

Data

dat <- structure(list(score = c(100L, 90L, 98L, 80L, 75L, 100L), gender = c("male", "female", "female", "male", "male", "female"), rn = c(1L, 1L, 2L, 2L, 3L, 3L)), row.names = c(NA, -6L), class = "data.frame")