Reputation: 43
I am trying to run the following code:
X <- data.frame(X)
Y <- data.frame(Y)
library(dplyr)
df_v1 <- X %>%
  dplyr::count(v1)
vector_v1 <- df_v1[,2]
vector_v2 <- Y$v2
Result <- rep(vector_v2, each=vector_v1)
print(Result)
I have a total of 1,048,575 observations in v1, and by using dplyr::count(v1) I am trying to see how many times each value is repeated (the values are integers in ascending order, from 1 to 10,571). The output that I get is n: 45, 68, 37, 41 ...
Then, by using rep(vector_v2, each=vector_v1), I am trying to repeat each of the 10,571 observations that I have in v2 as many times as the corresponding value occurs in v1, and store the result in the variable Result.
The output that I get is, as follows:
first element used of 'each' argument
[1] 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716
[18] 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716
[35] 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 2184 2184 2184 2184 2184 2184
[52] 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184
[69] 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184
[86] 2184 2184 2184 2184 2184 558 558 558 558 558 558 558 558 558 558 558 558
[103] 558 558 558 558 558 558 558 558 558 558 558 558 558 558 558 558 558
[120] 558 558 558 558 558 558 558 558 558 558 558 558 558 558 558 558 2254
[137] 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254
[154] 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254
[171] 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 4719 4719 4719 4719 4719 4719 4719
[188] 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719
[205] 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719
This is all fine by me, except that I obtain a total of 475,695 observations instead of the desired 1,048,575.
Does anyone know what I am doing wrong here?
Also, I would ideally like to have the output in the form of one single column instead of a matrix. Any idea on how this could be achieved?
Any help would be greatly appreciated.
Many thanks in advance!
Upvotes: 1
Views: 70
Reputation: 39697
I'm not sure what the output should look like, so here is a simple example showing what each option returns.
v1 <- 4:6
v2 <- 1:3
rep(v1, v2)
#[1] 4 5 5 6 6 6
rep_len(v1, sum(v2))
#[1] 4 5 6 4 5 6
#gives a WARNING, that only the first element of v2 is used
rep(v1, each=v2)
#[1] 4 5 6
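Applied to the question, this would explain the short result: each only uses the first count (45), so every one of the 10,571 values in v2 is repeated 45 times, and 10,571 * 45 = 475,695. Below is a minimal sketch of the times variant on toy data. The names counts and values are hypothetical stand-ins for df_v1[,2] and Y$v2, the numbers are just the first few shown in the question, and it assumes both vectors have the same length (count() only returns rows for values that actually occur in v1).

```r
# Toy stand-ins for the question's data (hypothetical names, first few values only)
counts <- c(45L, 68L, 37L, 41L)      # like df_v1[, 2]: one count per distinct v1
values <- c(1716, 2184, 558, 2254)   # like Y$v2: one value per distinct v1

# 'times' accepts a vector: values[i] is repeated counts[i] times
result <- rep(values, times = counts)
length(result) == sum(counts)
#[1] TRUE

# 'each' expects a single number, so only counts[1] is used (hence the warning)
wrong <- suppressWarnings(rep(values, each = counts))
length(wrong) == length(values) * counts[1]
#[1] TRUE

# A single column instead of the wrapped console print of the vector
result_df <- data.frame(Result = result)
dim(result_df)
#[1] 191   1
```

Note that Result in the question is already a plain vector, not a matrix. The bracketed numbers such as [18] are only the index of the first element on each printed line, so wrapping it in data.frame() is one way to view it as a column.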
Upvotes: 1