user3388256

Reputation: 11

Merging two data frames with different sizes and missing values

I'm having a problem merging two data frames in R.

The first one has 103731 observations of 6 variables. The variable I need to merge on has 77111 unique values; the remaining rows are NAs, stored as 0. The second one holds the frequency of each of those values plus the frequency of the NAs, so it has 77112 observations of 2 variables.

The result I need is the first data frame with the frequency of the merging variable added: a data frame of 103731 observations in which every row carries the frequency of its merging value (so the frequency is repeated when freq > 1, and the NA (or 0) rows get one too).

Can anybody help me?

The result I'm getting now is a data frame of 1 894 919 observations. I used:

tot <- merge(df1, df2, by = "mergingVar", all = FALSE, sort = FALSE)

I also tried several settings of 'all=', and none of them gave the right data frame.

Upvotes: 1

Views: 411

Answers (1)

Paulo E. Cardoso

Reputation: 5856

Why don't you just take the frequency table of your first data frame?

a <- data.frame(a = c(NA, NA, 2,2,3,3,3))
data.frame(table(a, useNA = 'ifany'))

     a Freq
1    2    2
2    3    3
3 <NA>    2
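
To get those counts back onto every row of the original data frame, one option is a match() lookup instead of merge(). This is only a minimal sketch on toy data: df1, df2, mergingVar and freq follow the names in the question, while the `other` column and all the values are made up for illustration. It also assumes the missing keys are real NA values rather than 0.

```r
# Toy stand-in for df1; the `other` column and all values are illustrative.
df1 <- data.frame(mergingVar = c(NA, NA, 2, 2, 3, 3, 3),
                  other = letters[1:7])

# Lookup table with one row per distinct key, NA included.
counts <- table(df1$mergingVar, useNA = "ifany")
df2 <- data.frame(mergingVar = as.numeric(names(counts)),
                  freq = as.vector(counts))

# A duplicated key in the lookup multiplies rows in merge(), so check first.
stopifnot(anyDuplicated(df2$mergingVar) == 0)

# match() returns one position per row of df1 and pairs NA with NA,
# so the row count and row order of df1 stay unchanged.
df1$freq <- df2$freq[match(df1$mergingVar, df2$mergingVar)]
```

If the duplicate check fails on your real df2, that would be one plausible explanation for the much larger merge result, since every duplicated key multiplies the matching rows.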

Or use mutate through ddply from plyr:

library(plyr)
ddply(a, .(a), mutate, freq = length(a))

   a freq
1  2    2
2  2    2
3  3    3
4  3    3
5  3    3
6 NA    2
7 NA    2
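
If you would rather not load plyr, the same per-row count can be sketched in base R. This is an alternative I'm suggesting, not something plyr does internally: addNA() turns NA into an explicit factor level, and tabulate() counts each level. The helper name `key` is hypothetical.

```r
a <- data.frame(a = c(NA, NA, 2, 2, 3, 3, 3))

# Make NA its own factor level so it is counted like any other value.
key <- addNA(factor(a$a))

# Count each level, then look the count up for every row via its level code.
a$freq <- tabulate(key, nbins = nlevels(key))[as.integer(key)]
```

Unlike the ddply result, the rows stay in their original order here, so the NA rows remain at the top.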

Upvotes: 1
