How to exclude rows from a data frame (with a condition)?

Question

Here is my dataframe

dput(head(df, 20))
structure(list(Varietes = c("EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45","EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45"), Varietes.rep = c("EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2","EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2"), Valeur = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), ligne.rep = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), Pied = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20),Date.floraison.mâle = structure(c(1625356800, 1626480000, 1626393600, NA,1626739200,1626048000, 1626220800, 1626220800, 1626220800, 1626220800, 1625702400, 1626566400,1626048000, 1626739200, NA, 1626048000, 1626307200, 1626480000,1626220800,1626048000), tzone = "UTC", class = c("POSIXct", "POSIXt")), Date.floraison.femelle = structure(c(1625702400, NA, NA, NA, 1626825600, 1625875200, 1626739200, 1626220800, 1626480000, 1626048000, 1626739200, NA, 1626307200, 1627171200, NA, 1626220800, 1626739200, 1626825600, 1626566400, 1626307200), tzone = "UTC", class = c("POSIXct", "POSIXt")), ASIi = c(4, -44394, -44393, 0, 1, -2, 6, 0, 3, -2, 12, -44395, 3, 5, 0, 2, 5, 4, 4, 3), Com = c(NA, NA, NA, "N", NA, NA, NA, NA, NA, NA, NA, "Nch", NA, NA, "N", NA, NA, NA, NA, NA), Detruit = c(NA_character_, NA_character_, NA_character_, NA_character_,NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_)), row.names = c(NA, -20L), class = c("tbl_df", "tbl", "data.frame"))

I would like to exclude from my data frame all rows that have a comment (column 'Com') such as N or Nch.

So for example:

in:

  Com               
1.NA             
3.NA

So in the end, the column 'Com' should contain only NA values.

I have a second issue. The column "Varietes.rep" contains different varieties, and I would like to know, for each of them, how many rows have 'NA', 'N', 'Nch', and so on in the column "Com".

Do you have any idea how I can do this?

Thank you very much.

Zoe · Accepted Answer

Using only base R, first part (getting all rows where column "Com" is NA):

df[is.na(df$Com),]
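
As a small illustration of why filtering on is.na() is the safe choice here, the sketch below uses a made-up data frame called toy (its values are assumptions, not the asker's data). It also shows an equivalent filter written as "drop N and Nch", which relies on %in% returning FALSE for NA.

```r
# Toy data standing in for the question's df (values are illustrative only)
toy <- data.frame(
  Varietes.rep = c("EPS45_2", "EPS45_2", "EPS45_2", "EPS45_3"),
  Com = c(NA, "N", "Nch", NA),
  stringsAsFactors = FALSE
)

# Keep only the rows whose comment is missing
kept <- toy[is.na(toy$Com), ]

# Same result phrased as "drop N and Nch":
# %in% yields FALSE (not NA) for missing values, so NA rows survive the negation
kept2 <- toy[!(toy$Com %in% c("N", "Nch")), ]
```

Note that a comparison such as toy$Com != "N" returns NA for rows with a missing comment, and indexing with NA produces rows filled with NA, so is.na() or %in% is usually the better fit when the column contains missing values.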

second part (getting counts of unique values in column "Varietes.rep"):

summary(as.factor(df$Varietes.rep))
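
The line above counts rows per variety. If the goal is the count of each "Com" value within each variety, as the question describes, one possible base R approach is a two-way table() with useNA = "ifany" so that missing comments get their own column. This is a sketch on made-up data (toy is hypothetical), not part of the accepted answer.

```r
# Toy data standing in for the question's df (values are illustrative only)
toy <- data.frame(
  Varietes.rep = c("EPS45_2", "EPS45_2", "EPS45_2", "EPS45_3", "EPS45_3"),
  Com = c(NA, "N", "Nch", NA, NA),
  stringsAsFactors = FALSE
)

# Rows are varieties, columns are comment values;
# useNA = "ifany" adds a column counting the NA comments
counts <- table(toy$Varietes.rep, toy$Com, useNA = "ifany")
print(counts)
```

On the real data, the same call would be table(df$Varietes.rep, df$Com, useNA = "ifany").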
