ArtUr693

Reputation: 43

rep(x, each=y) gives incomplete output in R

I am trying to run the following code:

X <- data.frame(X)
Y <- data.frame(Y)

library(dplyr)

df_v1 <- X %>%
  dplyr::count(v1)

vector_v1 <- df_v1[,2]

vector_v2 <- Y$v2

Result <- rep(vector_v2, each=vector_v1)

print(Result)

I have a total of 1,048,575 observations in v1. These are integers in ascending order, from 1 to 10,571, and with dplyr::count(v1) I am counting how many times each of them is repeated. The output that I get is n: 45, 68, 37, 41 ... .

Then, with rep(vector_v2, each=vector_v1), I am trying to repeat each of the 10,571 observations in v2 as many times as the corresponding value occurs in v1, and to store the result in the variable Result.

The output that I get is as follows:

first element used of 'each' argument
   [1] 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716
  [18] 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716
  [35] 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 1716 2184 2184 2184 2184 2184 2184
  [52] 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184
  [69] 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184 2184
  [86] 2184 2184 2184 2184 2184  558  558  558  558  558  558  558  558  558  558  558  558
 [103]  558  558  558  558  558  558  558  558  558  558  558  558  558  558  558  558  558
 [120]  558  558  558  558  558  558  558  558  558  558  558  558  558  558  558  558 2254
 [137] 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254
 [154] 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254
 [171] 2254 2254 2254 2254 2254 2254 2254 2254 2254 2254 4719 4719 4719 4719 4719 4719 4719
 [188] 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719
 [205] 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719 4719

which is fine, except that I obtain a total of 475,695 observations instead of the desired 1,048,575.

Does anyone know what I am doing wrong here?

Also, I would ideally like to have the output as a single column instead of a matrix. Any idea how this could be achieved?

Any help would be greatly appreciated.

Many thanks in advance!

Upvotes: 1

Views: 70

Answers (1)

GKi

Reputation: 39697

I don't understand what the output should look like, so please add a simple example showing the expected output. In the meantime, here is how the variants of rep behave:

v1 <- 4:6
v2 <- 1:3

rep(v1, v2)
#[1] 4 5 5 6 6 6

rep_len(v1, sum(v2))
#[1] 4 5 6 4 5 6

# gives a warning: only the first element of v2 is used
rep(v1, each=v2)
#[1] 4 5 6
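
Applied to the question, this would explain the shortfall: each= only uses the first count, and 45 * 10571 = 475695, which matches the reported length. Below is a minimal sketch of the times= variant, assuming the counts line up one-to-one with the values to repeat (same length, same order). The names counts, values and result_df and all the numbers in the vectors are toy stand-ins, not the asker's data.

```r
# Hypothetical stand-ins: counts plays the role of vector_v1 (the n column),
# values plays the role of vector_v2.
counts <- c(2L, 1L, 4L)
values <- c(1716, 2184, 558)

# each= takes a single number; with a vector, only counts[1] is used
# (R warns about this), so every value is repeated counts[1] times.
by_each <- suppressWarnings(rep(values, each = counts))
length(by_each)   # counts[1] * length(values)

# times= accepts one count per element of values.
by_times <- rep(values, times = counts)
length(by_times)  # sum(counts)

# Store the result as a one-column data frame.
result_df <- data.frame(Result = by_times)
```

The bracketed numbers such as [1] and [18] in the printed output are only the console's index labels for a wrapped vector; the result of rep is already a plain vector, not a matrix. If the number of counts differs from the number of values (for example, if some values never occur in v1), rep with times= will fail, and the two would need to be matched by key first.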

Upvotes: 1
