Reputation: 356
I have been trying to summarise a set of data using dplyr in R. This is the code I have been using, and it had been working fine until recently.
library(tidyverse); library(curl)
data <- read.csv(curl("https://raw.githubusercontent.com/megaraptor1/mydata/main/data.csv"))
data2 <- data %>%
  group_by(e.taxon) %>%
  summarise(across(c(e.hbl, e.bm), weighted.mean, e.N),
            N = sum(e.N))
"Error: Problem with summarise() input ..1.
x 'x' and 'w' must have the same length
i Input ..1 is (function (.cols = everything(), .fns = NULL, ..., .names = NULL) ....
i The error occurred in group 2: e.taxon = "Abrocoma_bennettii"."
Now I know the purported reason for this error: two of the columns don't have the same length or have missing values. However, when I check to see which of the columns is producing the error, it says that all of the variables have the same number of entries (i.e., no missing data).
length(data$e.taxon)
length(data$e.hbl)
length(data$e.bm)
length(data$e.N)
I tried searching for this error message to see if there is more information behind it that I could use, but I could not find anything. What's really strange is this code was working fine until some unknown change, and due to the way the file is set up I cannot easily identify where the new changes are that might have produced this (the example is part of a larger shared dataset). I am trying to figure out why R is returning this error when all of the data have complete cases.
Upvotes: 0
Views: 1787
Reputation: 887118
It does work with the new version of dplyr (1.0.6, tested on R 4.1.0).
library(dplyr)
data %>%
group_by(e.taxon) %>%
summarise(across(c(e.hbl,e.bm), weighted.mean, e.N), N = sum(e.N))
# A tibble: 2,004 x 4
   e.taxon               e.hbl    e.bm     N
   <chr>                 <dbl>   <dbl> <int>
 1 Abrawayomys_ruschii   126.     54.7     3
 2 Abrocoma_bennettii    190.    200       9
 3 Abrocoma_cinerea      149.     86.3     5
 4 Abrothrix_andinus      83.7    16.7    34
 5 Abrothrix_illuteus    121.     42      11
 6 Abrothrix_longipilis  105.     32.3    62
 7 Abrothrix_olivaceus    87.0    19.4    45
 8 Acinonyx_jubatus     1278.  52163.      7
 9 Acomys_cahirinus      105      41.1     2
10 Acomys_sp.             98.5    67       2
# … with 1,994 more rows
As we are passing arguments instead of a lambda function, it may be better to name the argument explicitly, i.e. w = e.N (though it wouldn't matter here, as the second argument of weighted.mean is w).
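For comparison, here is a minimal sketch of the lambda form mentioned above, with the weight passed by name. The `toy` data frame and its values are made up for illustration and only stand in for the real dataset. As far as I know, passing extra arguments through `...` in `across()` is deprecated from dplyr 1.1.0 onwards, so this form may also be the safer one going forward.

```r
library(dplyr)

# Hypothetical toy data with the same column names as the question
toy <- data.frame(
  e.taxon = c("A", "A", "B"),
  e.hbl   = c(10, 20, 30),
  e.bm    = c(1, 3, 5),
  e.N     = c(1, 3, 2)
)

res <- toy %>%
  group_by(e.taxon) %>%
  # .x is the current column; e.N is looked up within the current group,
  # so x and w always have the same length
  summarise(across(c(e.hbl, e.bm), ~ weighted.mean(.x, w = e.N)),
            N = sum(e.N))

res
```

Because the weights are subset per group inside the lambda, each call to `weighted.mean` receives an `x` and a `w` of matching length.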
Upvotes: 2