SumitArya

Reputation: 121

How to look up unique values from one column in another column and assign the matching ids

I have a toy example to explain what I am trying to do:

aski = data.frame(x=c("a","b","c","a","d","d"),y=c("b","a","d","a","b","c"))

I managed to assign unique ids to column y, and now the output looks like:

aski2 = data.frame(x=c("a","b","c","a","d","d"),y=c("1","2","3","2","1","4"))

As you can see, "b" is present in both columns x and y; it was assigned id = 1 in column y, "a" was assigned id = 2, and so on. These values also occur in column x. For example, the first element of column x is "a", which has id = 2 in column y, so I want to assign id = 2 to "a" in column x as well. What I am trying to do next is look up each value of column x in column y and, if it occurs there, assign it the same id.

The final data frame should look like:

aski3 = data.frame(x=c("2","1","4","2","3","3"),y=c("1","2","3","2","1","4"))

Upvotes: 3

Views: 74

Answers (3)

Jaap

Reputation: 83215

Without the need to create aski2 as an intermediate, a possible solution is to use match with lapply to get the numeric representations of the letters:

# create a vector of the unique values in the order
# in which you want them assigned to '1' till '4'
v <- unique(aski$y) 

# convert both columns to integer values with 'match' and 'lapply'
aski[] <- lapply(aski, match, v)

which gives:

> aski
  x y
1 2 1
2 1 2
3 4 3
4 2 2
5 3 1
6 3 4

If you want the numbers as characters, you can additionally do:

aski[] <- lapply(aski, as.character)
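
One caveat may be worth checking before relying on this. The sketch below uses a made-up variant of the toy data (`aski_z`, with a value "z" that is not in the original question) to show that `match` returns `NA` for any value of x that never occurs in y, so such a value would end up without an id.

```r
# Hypothetical variant of the toy data: "z" appears in x but never in y
aski_z <- data.frame(x = c("a", "z", "d"),
                     y = c("b", "a", "d"),
                     stringsAsFactors = FALSE)

# ids follow the order of first appearance in y
v <- unique(aski_z$y)

# match() returns NA where a value of x has no counterpart in y
aski_z[] <- lapply(aski_z, match, v)
aski_z
```

If every value of x is guaranteed to occur in y, as in the question's example, this case does not arise.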

Upvotes: 2

www

Reputation: 39154

A solution using dplyr. We can first create a vector, vec, that maps each letter to its index with unique(aski$y). After this step, you can use Jaap's lapply solution, or you can use mutate_all from dplyr as follows.

# Create the vector showing the relationship of index and letter 
vec <- unique(aski$y)
# View vec
vec
[1] "b" "a" "d" "c"

library(dplyr)

# Modify all columns (funs() is deprecated since dplyr 0.8.0, so a formula is used)
aski2 <- aski %>% mutate_all(~ match(., vec))
# View the results
aski2
  x y
1 2 1
2 1 2
3 4 3
4 2 2
5 3 1
6 3 4

Data

aski <- data.frame(x = c("a","b","c","a","d","d"),
                   y = c("b","a","d","a","b","c"),
                   stringsAsFactors = FALSE)

Upvotes: 1

Kota Mori

Reputation: 6740

First, convert both columns to character vectors. Then, collect all unique values from the two columns to use as levels of a factor.

Finally, convert both columns to factors and then to numeric.

aski = data.frame(x=c("a","b","c","a","d","d"),y=c("b","a","d","a","b","c"))

aski$x <- as.character(aski$x)
aski$y <- as.character(aski$y)

lev <- unique(c(aski$y, aski$x))
aski$x <- factor(aski$x, levels=lev)
aski$y <- factor(aski$y, levels=lev)

aski$x <- as.numeric(aski$x)
aski$y <- as.numeric(aski$y)
aski
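
As a quick check, the same steps can be written more compactly and compared with the desired result from the question. This is only a sketch: `to_id` and `res` are names introduced here for illustration, and the ids are kept as integers rather than the character values shown in aski3.

```r
aski <- data.frame(x = c("a", "b", "c", "a", "d", "d"),
                   y = c("b", "a", "d", "a", "b", "c"),
                   stringsAsFactors = FALSE)

# levels in order of first appearance: y first, then anything found only in x
lev <- unique(c(aski$y, aski$x))

# hypothetical helper: factor with the shared levels, then its integer codes
to_id <- function(col) as.integer(factor(col, levels = lev))

res <- aski
res[] <- lapply(aski, to_id)

# compare with the ids listed in the question (aski3, as integers)
stopifnot(identical(res$x, c(2L, 1L, 4L, 2L, 3L, 3L)),
          identical(res$y, c(1L, 2L, 3L, 2L, 1L, 4L)))
res
```

Because lev also collects values that occur only in x, such values would receive ids after those taken from y.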

Upvotes: 1
