Sss

Reputation: 447

Error when subsetting in R - empty data frame

I am subsetting a data frame using the function subset without success:

This is my data frame:

> dput(df)
structure(list(Station = c("S09489500", "S09498500", "S09510200", 
"S09494000", "S09497500", "S09492400", "S09504500", "S09503700"
), location = c("back", "back", "ahead", "ahead", "back", "ahead", 
"ahead", "ahead"), length_years = c(36L, 75L, 33L, 34L, 75L, 
35L, 49L, 34L), begin = c(1985, 1946, 1962, 1959, 1946, 1958, 
1949, 1964), end = c(2020, 2020, 1994, 1992, 2020, 1992, 1997, 
1997), Utest_Q7min = c(26.3618474823095, 119.756524166147, 12.749016687539, 
20.3410125011518, 92.9397377831962, 19.8329511433346, 18.5949830661337, 
34.4767872640756), Significance_Qtest_Q7min = c("No Independent", 
"No Independent", "No Independent", "No Independent", "No Independent", 
"No Independent", "No Independent", "No Independent"), PT_U = c(124, 
567, 98, 181, 646, 158, 94, 238), ChangePoint_Q7min = c(26L, 
19L, 9L, 17L, 29L, 23L, 30L, 16L), p_Q7min = c(0.292065629512458, 
0.0219501976437697, 0.42182506012988, 0.0155276557662876, 0.00571923921900464, 
0.0669830682326448, 1.28599300599023, 0.000449734648357696), 
    Significance_ChangePoint_Q7min = c("No significant", "Significant", 
    "No significant", "Significant", "Significant", "No significant", 
    "No significant", "Significant"), Man_Kendall = c("category 2", 
    "category 1", "category 1", "category 3", "category 1", "category 3", 
    "category 1", "category 3")), class = "data.frame", row.names = c(NA, 
-8L))

I am subsetting with the following code:

  df2 <- subset(df,df$Significance_ChangePoint_Q7min=="Significant" && df$Significance_Qtest_Q7min == "No Independent")

but I get an empty data frame as the result.

Does someone know why the subsetting is not working in this case?

Upvotes: 0

Views: 118

Answers (4)

user8769986

library(dplyr)
df2 <- df %>% filter(Significance_ChangePoint_Q7min == "Significant" & Significance_Qtest_Q7min == "No Independent")



> df2
    Station location length_years begin  end Utest_Q7min Significance_Qtest_Q7min PT_U ChangePoint_Q7min      p_Q7min Significance_ChangePoint_Q7min Man_Kendall
1 S09498500     back           75  1946 2020   119.75652           No Independent  567                19 0.0219501976                    Significant  category 1
2 S09494000    ahead           34  1959 1992    20.34101           No Independent  181                17 0.0155276558                    Significant  category 3
3 S09497500     back           75  1946 2020    92.93974           No Independent  646                29 0.0057192392                    Significant  category 1
4 S09503700    ahead           34  1964 1997    34.47679           No Independent  238                16 0.0004497346                    Significant  category 3

Upvotes: 0

Michael Barrowman

Reputation: 1181

This doesn't work because of the double &&. You need a single & in this case.

You want to compare the two columns row by row and get a vector of TRUE or FALSE values, one per row, indicating whether both conditions hold for that row. The single & does this.

The double && is meant for single values and returns a single TRUE or FALSE, not a vector. In R versions before 4.3.0 it only checks the first value of each variable (i.e. the first row of your data frame); from R 4.3.0 onwards, giving it a vector longer than one is an error. Here the first row is "No significant", so the result is a single FALSE, and subset then drops every row.
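To make the difference concrete, here is a minimal sketch on toy vectors. The names sig, ind, keep, first_only and toy are made up for illustration and are not part of the question's data. It applies && only to single values, so it should run the same way on old and new R versions.

```r
# Toy vectors standing in for the two columns (hypothetical values)
sig <- c("No significant", "Significant", "Significant")
ind <- c("No Independent", "No Independent", "No Independent")

# `&` works element by element and returns one logical value per row
keep <- sig == "Significant" & ind == "No Independent"
print(keep)

# `&&` is intended for single values, e.g. inside if (); here it is
# applied explicitly to the first elements only
first_only <- (sig[1] == "Significant") && (ind[1] == "No Independent")
print(first_only)

toy <- data.frame(sig = sig, ind = ind)

# A single FALSE as the condition selects no rows at all
rows_scalar <- nrow(subset(toy, first_only))

# A full-length logical vector keeps exactly the matching rows
rows_vector <- nrow(subset(toy, keep))
```

With the single FALSE the result has no rows, which mirrors the empty data frame in the question; with the full-length vector only the matching rows are kept.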

Upvotes: 0

akrun

Reputation: 887088

Using filter from dplyr:

library(dplyr)
df %>% 
     filter(Significance_ChangePoint_Q7min == "Significant" &
              Significance_Qtest_Q7min == "No Independent")

Upvotes: 1

Leonardo

Reputation: 2485

Try this:

df2 <- subset(df, df$Significance_ChangePoint_Q7min == "Significant" & 
                df$Significance_Qtest_Q7min == "No Independent")

with a single &.
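As a side note, subset evaluates its condition inside the data frame, so the df$ prefix is optional. The sketch below uses a small made-up data frame (toy, with hypothetical columns id, change and qtest) rather than the question's data, and also shows the equivalent bracket indexing.

```r
# Small hypothetical data frame, not the one from the question
toy <- data.frame(
  id     = c("a", "b", "c"),
  change = c("No significant", "Significant", "Significant"),
  qtest  = c("No Independent", "No Independent", "Independent"),
  stringsAsFactors = FALSE
)

# Column names can be used directly inside subset()
res <- subset(toy, change == "Significant" & qtest == "No Independent")

# Equivalent base R bracket indexing with an explicit logical vector
res_bracket <- toy[toy$change == "Significant" & toy$qtest == "No Independent", ]

print(res)
```

Both forms rely on the vectorised &; which one to use is mostly a matter of taste, although bracket indexing is often suggested for code inside functions.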

Upvotes: 1
