PrzeM
PrzeM

Reputation: 221

Data in several rows into one row and columns

I have a table, where data is stored in a following pattern:

ID Account
1  A
1  B
2  B
3  A
4  A
4  B

I would like to get one row per ID in a following way:

ID Account_A Account_B
1  1         1
2  0         1
3  1         0
4  1         1

0s and 1s don't matter that much, can be As, Bs etc. The main problem is to get data into one row per ID so I can merge it with another table.

I am not too familiar with R, so simpler solutions without too many new libraries are preferred.

Upvotes: 1

Views: 938

Answers (2)

zx8754
zx8754

Reputation: 56239

We need table(), then prettify to match the expected output.

Note: I am using new example data to match OP's request regarding duplicated rows.

PrzeM commented: One more question - is it possible to combine this code with counting number of occurences? In another data frame I have a similar layout, but for given example Account "A" can occur multiple times for same ID, then I would like to get a similar summary, but with a result similar to using COUNT() function in Excel.

# based on OP comments new example data
df1 <- read.table(text = "ID Account
1  A
1  A
9  B
9  B
3  A
4  A
4  B", header = TRUE)

# get table and convert to a dataframe
myTable <- table(df1)
res <- as.data.frame.matrix(myTable)
res
#   A B
# 1 2 0
# 3 1 0
# 4 1 1
# 9 0 2

# update column names for Accounts
colnames(res) <- paste("Account", colnames(res), sep = "_")

# ID is in rownames, add it back as new column
res$ID <- rownames(res)

# reorder columns to get desired output
res <- res[, c(3, 1, 2)]
rownames(res) <- NULL
res
#   ID Account_A Account_B
# 1  1         2         0
# 2  3         1         0
# 3  4         1         1
# 4  9         0         2

Upvotes: 1

Sven Hohenstein
Sven Hohenstein

Reputation: 81733

Here is a solution with the base R function reshape.

The data:

dat <- data.frame(ID = c(1, 1, 2, 3, 4, 4), Account = c("A", "B", "B", "A", "A", "B"))

Transform the data to the wide format with reshape.

dat_wide <- reshape(dat, direction = "wide", v.names = "Account", idvar = "ID",
                    timevar = "Account", sep = "_")

dat_wide
#     ID Account_A Account_B
#   1  1         A         B
#   3  2      <NA>         B
#   4  3         A      <NA>
#   5  4         A         B

The values can be replaced with 1 and 0:

dat_wide[-1] <- as.integer(!is.na(dat_wide[-1]))
#   ID Account_A Account_B
# 1  1         1         1
# 3  2         0         1
# 4  3         1         0
# 5  4         1         1

Upvotes: 2

Related Questions