Reputation: 17
Here is my dataframe
dput(head(df, 20))
structure(list(Varietes = c("EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45","EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45", "EPS45"), Varietes.rep = c("EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2","EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2", "EPS45_2"), Valeur = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), ligne.rep = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), Pied = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20),Date.floraison.mâle = structure(c(1625356800, 1626480000, 1626393600, NA,1626739200,1626048000, 1626220800, 1626220800, 1626220800, 1626220800, 1625702400, 1626566400,1626048000, 1626739200, NA, 1626048000, 1626307200, 1626480000,1626220800,1626048000), tzone = "UTC", class = c("POSIXct", "POSIXt")), Date.floraison.femelle = structure(c(1625702400, NA, NA, NA, 1626825600, 1625875200, 1626739200, 1626220800, 1626480000, 1626048000, 1626739200, NA, 1626307200, 1627171200, NA, 1626220800, 1626739200, 1626825600, 1626566400, 1626307200), tzone = "UTC", class = c("POSIXct", "POSIXt")), ASIi = c(4, -44394, -44393, 0, 1, -2, 6, 0, 3, -2, 12, -44395, 3, 5, 0, 2, 5, 4, 4, 3), Com = c(NA, NA, NA, "N", NA, NA, NA, NA, NA, NA, NA, "Nch", NA, NA, "N", NA, NA, NA, NA, NA), Detruit = c(NA_character_, NA_character_, NA_character_, NA_character_,NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_, NA_character_)), row.names = c(NA, -20L), class = c("tbl_df", "tbl", "data.frame"))
I would like to exclude from my dataframe all rows that have a comment in column 'Com', such as N or Nch.
So for example:
Com
1. NA
2. N
3. NA
4. Nch
becomes:
Com
1. NA
3. NA
So in the end, my column 'Com' should contain only NA values.
I have a second issue. The column "Varietes.rep" contains different varieties, and I would like to know, for each of them, how many rows have 'NA', 'N', 'Nch' and so on in column "Com".
Do you have any idea how I can do this?
Thank you very much.
Upvotes: 0
Views: 82
Reputation: 1000
Using only base R, first part (getting all rows where column "Com" is NA):
df[is.na(df$Com),]
second part (counting the rows for each unique value in column "Varietes.rep"):
summary(as.factor(df$Varietes.rep))
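The summary() call counts rows per variety only. If the counts should also be broken down by the value of "Com", a cross-tabulation with table() is one option. A minimal sketch, assuming a small made-up data frame `toy` (hypothetical, not the asker's data) that reuses the two column names from the question:

```r
# Hypothetical toy data; only the column names come from the question
toy <- data.frame(
  Varietes.rep = c("EPS45_2", "EPS45_2", "EPS45_2", "EPS46_1", "EPS46_1"),
  Com          = c(NA, "N", NA, "Nch", NA),
  stringsAsFactors = FALSE
)

# First part: keep only the rows without a comment
toy_clean <- toy[is.na(toy$Com), ]

# Second part: cross-tabulate variety against comment;
# useNA = "ifany" keeps NA as its own column instead of dropping it
com_counts <- table(toy$Varietes.rep, toy$Com, useNA = "ifany")
print(com_counts)
```

Without useNA, table() silently drops the NA entries, so the rows without a comment would not be counted.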
Upvotes: 0
Reputation: 3256
Answer to first question:
library(dplyr)
df %>% filter(is.na(Com))
Answer to the second question:
df %>% group_by(Varietes.rep) %>% count(Com)
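To try both steps without the full dataset, here is a small self-contained sketch. The tibble `toy` is made up for illustration (hypothetical values; only the column names come from the question), and count(Varietes.rep, Com) is used as a shorthand that should give the same counts as the group_by() version:

```r
library(dplyr)

# Hypothetical toy data; only the column names come from the question
toy <- tibble(
  Varietes.rep = c("EPS45_2", "EPS45_2", "EPS45_2", "EPS46_1", "EPS46_1"),
  Com          = c(NA, "N", NA, "Nch", NA)
)

# First question: keep only the rows where Com is NA
toy_clean <- toy %>% filter(is.na(Com))

# Second question: one row per (variety, comment) pair, with the count in column n;
# NA is counted as its own category
com_counts <- toy %>% count(Varietes.rep, Com)
print(com_counts)
```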
Upvotes: 1