Reputation: 1873
It is straightforward to obtain the unique values of a single column using unique(). However, I am looking to do the same for multiple columns in a data frame and store the results in a list, all using base R. Importantly, I do not need unique combinations, just the unique values of each individual column. I currently have the below:
# dummy data
df = data.frame(a = LETTERS[1:4],
                b = 1:4)
# for loop
cols = names(df)
unique_values_by_col = list()
for (i in cols)
{
x = unique(i)
unique_values_by_col[[i]] = x
}
The problem comes when displaying unique_values_by_col, as it shows as empty. I believe the problem is that i is being passed to the loop as text, not as a variable.
Any help would be greatly appreciated. Thank you.
Upvotes: 3
Views: 1095
Reputation: 1067
Alternatively, you can use apply(), which is designed to run a function over the rows or columns of a table (a margin of 2 means columns):
apply(df,2,unique)
result:
> apply(df,2,unique)
a b
[1,] "A" "1"
[2,] "B" "2"
[3,] "C" "3"
[4,] "D" "4"
Though if you want a list, lapply() returns one, so it may be the better choice.
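To make the difference concrete, here is a minimal sketch using the question's df. The stringsAsFactors = FALSE argument is an assumption added here so that the result does not depend on the R version. It illustrates that apply() converts the data frame to a matrix first, so every value becomes character, while lapply() works column by column and keeps each column's type.

```r
# Same dummy data as in the question; stringsAsFactors = FALSE is an
# assumption made here so that column a is character in every R version.
df <- data.frame(a = LETTERS[1:4], b = 1:4, stringsAsFactors = FALSE)

# apply() coerces the data frame to a matrix, so the integers become strings.
via_apply <- apply(df, 2, unique)
class(via_apply[, "b"])   # "character"

# lapply() treats the data frame as a list of columns and keeps their types.
via_lapply <- lapply(df, unique)
class(via_lapply$b)       # "integer"
```

Also worth noting: apply() only returns a matrix when every column has the same number of unique values, so the shape of its result depends on the data.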
Upvotes: 2
Reputation: 13319
Could this be what you're trying to do?
Map(unique,df)
Result:
$a
[1] A B C D
Levels: A B C D
$b
[1] 1 2 3 4
Upvotes: 1
Reputation: 33603
Your for loop is almost right; it just needs one fix to work:
# for loop
cols = names(df)
unique_values_by_col = list()
for (i in cols) {
x = unique(df[[i]])
unique_values_by_col[[i]] = x
}
unique_values_by_col
# $a
# [1] A B C D
# Levels: A B C D
#
# $b
# [1] 1 2 3 4
i is just a character string, the name of a column within df, so unique(i) returns that name rather than the column's unique values.
Anyhow, the most standard way to do this task is lapply(), as shown by demirev.
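The difference between the column name and the column itself can be checked directly. A small sketch, assuming the df from the question:

```r
df <- data.frame(a = LETTERS[1:4], b = 1:4)

i <- "a"          # what the loop variable holds on the first iteration
unique(i)         # the column name itself: "a"
unique(df[[i]])   # the values stored in column a
```

Indexing with df[[i]] is what turns the name into the actual column, which is the one change made to the loop above.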
Upvotes: 1
Reputation: 195
Why not avoid the for loop altogether by using lapply():
lapply(df, unique)
Resulting in:
$a
[1] A B C D
Levels: A B C D
$b
[1] 1 2 3 4
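Since the question's dummy data has no repeated values, the effect is easier to see on data with duplicates. The sketch below uses a hypothetical df2 (not from the question) whose columns have different numbers of unique values, which a list handles without trouble.

```r
# Hypothetical data with repeated values (not from the question).
df2 <- data.frame(a = c("A", "B", "A", "B"),
                  b = c(1L, 1L, 2L, 3L),
                  stringsAsFactors = FALSE)

unique_values_by_col <- lapply(df2, unique)
unique_values_by_col$a   # "A" "B"
unique_values_by_col$b   # 1 2 3

# Number of distinct values per column, as a named integer vector.
lengths(unique_values_by_col)
```

To restrict this to a subset of columns, the same call can be applied to df2[cols], where cols is a character vector of column names.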
Upvotes: 2