Reputation: 4941
My data looks like this (example):
ID Col1 Col2
1232 ABCSD abd
2342 ABCSD esw
7643 ABCSD rty
9821 ETHS fvc
I have 2845428 such rows. I want to find out how correlated each pair in Col1 and Col2 is. For example:
ABCSD abd 0.64
ETHS fvc 0.23
How can I go about it using R? Thanks
Upvotes: 1
Views: 992
Reputation: 893
I assume that by correlation you mean something like "what portion of the ABCSD observations have abd in Col2..."
If your data are in a data frame named df:
#get the absolute frequency
freqs <- ftable(df[,2:3])
#convert to relative frequency
freqs <- freqs/rowSums(freqs)
#then to get the format you want
library(reshape)
freqs <- melt(freqs)
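For comparison, here is a minimal base-R sketch of the same idea (the share of each Col1 value that falls in each Col2 value), which avoids the reshape dependency. It is not the code above, just one possible alternative. The toy df and the names counts, shares, result and the column name share are my own assumptions for illustration; only the Col1/Col2 layout comes from the question.

```r
# Toy data mirroring the example rows in the question (illustrative only).
df <- data.frame(
  ID   = c(1232, 2342, 7643, 9821),
  Col1 = c("ABCSD", "ABCSD", "ABCSD", "ETHS"),
  Col2 = c("abd", "esw", "rty", "fvc"),
  stringsAsFactors = FALSE
)

# Count how often each (Col1, Col2) combination occurs.
counts <- table(Col1 = df$Col1, Col2 = df$Col2)

# Divide every row by its row total, so each cell becomes the share of
# that Col1 value's observations that have the given Col2 value.
shares <- prop.table(counts, margin = 1)

# Convert to long format: one row per (Col1, Col2) pair.
result <- as.data.frame(shares, responseName = "share",
                        stringsAsFactors = FALSE)

# Keep only the pairs that actually occur in the data.
result <- result[result$share > 0, ]
print(result)
```

Note that table() builds a dense Col1-by-Col2 grid, so with millions of rows and many distinct values in either column it may need a lot of memory.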
Upvotes: 1