RanonKahn
RanonKahn

Reputation: 862

How to make a column values unique based on values in other column(s) in R?

I have a large data set similar to as shown in the example below.

df <- structure(list(FCN = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 
3L), .Label = c("010.X91116.3D3.A8", "010.X91116.6B7.F9", "010.X91116.6C6.C12"
), class = "factor"), DOM = structure(c(1L, 2L, 2L, 1L, 2L, 1L, 
2L, 2L), .Label = c("VH", "VK"), class = "factor"), FN = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "OM", class = "factor"), 
    RV = c(49257.4, 23571.2, 24115.6, 49351.4, 24102.6, 49641.8, 
    23226.2, 23408.2)), .Names = c("FCN", "DOM", "FN", "RV"), class = "data.frame", row.names = c(NA, 
-8L))

I am trying to create a table by making FN column values unique with a suffix based on the values in FCN, RV and DOM. I am able to use a for loop and process the data as shown in the figure. However, it takes time when processing thousands of rows.

At the end I want to pivot the data making the values in FN as columns and values in RV. Please guide me how to achieve the results elegantly using some library function if possible.

library(reshape2)
pivot_df <- dcast(df, FCN + DOM ~ FN)

Upvotes: 1

Views: 66

Answers (2)

RanonKahn
RanonKahn

Reputation: 862

Adopting @akrun's advice:

library(reshape2)
df <- structure(list(FCN = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 3L,  3L), .Label = c("010.X91116.3D3.A8", "010.X91116.6B7.F9", "010.X91116.6C6.C12"), class = "factor"), DOM = structure(c(1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L), .Label = c("VH", "VK"), class = "factor"), FN = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "OM", class = "factor"), RV = c(49257.4, 23571.2, 24115.6, 49351.4, 24102.6, 49641.8, 23226.2, 23408.2)), .Names = c("FCN", "DOM", "FN", "RV"), class = "data.frame", row.names = c(NA, -8L))
df$FN <- with(df, paste0(FN, ave(seq_along(FN), FCN, DOM, FUN = seq_along)))
pivot_df <- dcast(df, FCN + DOM ~ FN)

Upvotes: 1

akrun
akrun

Reputation: 887078

We can use ave to do this

df$FN <- with(df, paste0(FN, ave(seq_along(FN), FCN, DOM, FUN = seq_along)))

If we need to reshape to 'wide', then rowid from data.table can be used with dcast

library(data.table)
dcast(setDT(df), FCN + DOM ~FN + rowid(DOM), value.var = "RV")

Upvotes: 2

Related Questions