Reputation: 106
Input:
  Col_A                 Col_B
1 Samsung_Note             10
2 Samsung_Notebook         20
3 Samsung_Tablet_Device    30
4 Note                     40
Expected output:
Col_A             Col_B
Samsung              10
Note                 10
Samsung_Note         10
Samsung              20
Notebook             20
Samsung_Notebook     20
Samsung              30
Tablet               30
Device               30
Samsung_Tablet       30
Tablet_Device        30
Samsung_Device       30
Note                 40
I would like to transform my data into the expected output shown above. Please suggest an efficient way to do this.
For this purpose, assume that the order within a pair does not matter, i.e. x_z = z_x.
Upvotes: 0
Views: 40
Reputation: 159
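There may be an easier way, but this should work for the single elements plus the full name (it does not build the intermediate pairs such as Samsung_Tablet):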
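# Split each name into its underscore-separated elements
elements <- strsplit(df$Col_A, "_")
# Append the full name to its own elements
elementsAll <- lapply(seq_along(elements), function(i) append(elements[[i]], df$Col_A[i]))
dfTemp <- data.frame(
  V1 = unlist(elementsAll),
  # Repeat the full name (last entry of each vector) once per entry
  V2 = rep(unlist(lapply(elementsAll, function(x) x[length(x)])),
           unlist(lapply(elementsAll, length)))
)
# Drop duplicates, e.g. the single-element "Note", which appears twice
dfTemp <- dfTemp[!duplicated(dfTemp), ]
# Attach Col_B by joining on the full name
desiredDF <- merge(df, dfTemp, by.x = "Col_A", by.y = "V2")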
Here df denotes the input data frame. Make sure that Col_A is a character vector, not a factor, since strsplit() requires character input.
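If the intermediate pairs from the expected output are needed as well, one possible sketch uses combn() from base R. This is only an illustration under a few assumptions: the input is rebuilt from the question's table, the helper name expand_name is made up here, and because x_z = z_x only one ordering of each pair is generated. The rows within each group may come out in a different order than in the expected output.

```r
# Input data rebuilt from the question (Col_A kept as character, not factor)
df <- data.frame(
  Col_A = c("Samsung_Note", "Samsung_Notebook", "Samsung_Tablet_Device", "Note"),
  Col_B = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns the single elements of one name
# followed by all unordered pairs of those elements
expand_name <- function(name) {
  parts <- strsplit(name, "_", fixed = TRUE)[[1]]
  # A name with a single element has no pairs
  if (length(parts) < 2) return(parts)
  pairs <- combn(parts, 2, FUN = paste, collapse = "_")
  c(parts, pairs)
}

expanded <- lapply(df$Col_A, expand_name)

# Repeat each Col_B value once per generated name
desiredDF <- data.frame(
  Col_A = unlist(expanded),
  Col_B = rep(df$Col_B, lengths(expanded)),
  stringsAsFactors = FALSE
)
```

For names with exactly two elements the only pair is the full name itself, so Samsung_Note and Samsung_Notebook still appear; for longer names the full name is not included, which matches the expected output above.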
Upvotes: 1