Reputation: 143
Is there a function that can invert the number of occurrences of each value in a data.table (so that the least frequent value becomes the most frequent and vice versa), as opposed to just sorting by frequency? E.g., say I have this:
install.packages('data.table')
require(data.table)
initially = data.table(initially = c('a,a','b,b','b,b','c,c','c,c','c,c'))
View(initially)
And wish to produce this:
required.inversion = data.table(required.inversion = c('a,a','a,a','a,a','b,b','b,b', 'c,c'))
View(required.inversion)
The way I was thinking of doing this was to produce a frequency table:
initial.frequencies = initially[, .N ,by = initially]
View(initial.frequencies)
Sort it to ensure it's in ascending frequency order:
initial.frequencies = initial.frequencies[order(N)]
View(initial.frequencies)
Store the order of those initial values:
inversion.key = initial.frequencies$initially
View(inversion.key)
Re-sort the data.table so it's in descending frequency order:
initial.frequencies = initial.frequencies[order(-N)]
View(initial.frequencies)
Then insert the original order back into the table:
initial.frequencies$inversion.key = inversion.key
View(initial.frequencies)
I now have a 'key' showing me how many times each initial value needs to occur in order to invert the frequencies, i.e. 'a,a' needs to occur three times, 'b,b' twice and 'c,c' once.
I'm not sure how to actually replicate the values in the original table, and this seems like a bad approach anyway, as it would also double the length of the table in memory:
this.approach.would.yield.this.in.the.ram = data.table(this.approach.would.yield.this.in.the.ram = c('a,a','b,b','b,b','c,c','c,c','c,c', 'a,a','a,a','a,a','b,b','b,b', 'c,c'))
View(this.approach.would.yield.this.in.the.ram)
Upvotes: 3
Views: 164
Reputation: 886938
If we follow the OP's approach, we can just replicate the rows of the frequency table by the reverse of 'N' and then assign 'N' to NULL to drop it:
initially[, .N, by = initially][rep(seq_len(.N), rev(N))][, N := NULL][]
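Note that the one-liner above reverses `N` by position, so it relies on the values already appearing in ascending order of frequency, as they do in the example. A minimal sketch of a variant that sorts the counts first, for input in arbitrary order (the names `dt`, `counts`, `inverted.N` and `inverted` are made up for illustration and are not from the question):

```r
library(data.table)

# Illustrative input: same values as the question, but NOT sorted by frequency
dt <- data.table(initially = c('c,c', 'c,c', 'c,c', 'a,a', 'b,b', 'b,b'))

# Count each value, then sort the counts in ascending order of frequency
counts <- dt[, .N, by = initially][order(N)]

# Pair each value with the count from the opposite end of the sorted table
counts[, inverted.N := rev(N)]

# Repeat each value inverted.N times to build the inverted table
inverted <- counts[, .(required.inversion = rep(initially, times = inverted.N))]
```

Because the reversed counts sum to the same total as the original counts, `inverted` has the same number of rows as `dt`, and the original table is never appended to.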
Upvotes: 2