Comparing two CSV files in R with some conditions

Question

I have two csv files:

File 1:

Year,Month,Day,Stn1,Stn2,Stn3
1979,01,01,10,0,5
1979,01,02,10,1,5
1979,01,03,0,0,0
1979,01,04,5,10,30
1979,01,05,0,1,3

File 2:

Year,Month,Day
1979,01,02
1979,01,04
1979,01,05

File 1 contains daily data from 1979 to 2000, while File 2 contains a set of arbitrary (non-continuous) dates.

What I want:

[1] Get the dates common to File 1 and File 2 for which ANY of the station columns (Stn1 to Stn3) has a value greater than or equal to 20. Then, save the output to a file.

In the above example, the output file should contain the following date:

Year,Month,Day,Stn1,Stn2,Stn3
1979,01,04,5,10,30

This is because Stn3 has a value of 30 on that date.

What I have so far:

I can get the common dates with a simple bash command. Unfortunately, I don't know how to filter the common dates so that only those satisfying the condition remain. How can I do this in R?

I'd appreciate any help on this matter.

-- Lyndz

Alexis · Accepted Answer

Try this code:

library(tidyverse)

# Sample of File 1; date parts are kept as character to preserve leading zeros
dataset <- data.frame(Year = c("1979","1979","1979","1979","1979"),
                      Month = c("01","01","01","01","01"),
                      Day = c("01","02","03","04","05"),
                      Stn1 = c(10,10,0,5,0),
                      Stn2 = c(0,1,0,10,1),
                      Stn3 = c(5,5,0,30,3),
                      stringsAsFactors = FALSE)

# Build a single key column from Year, Month and Day
dataset <- dataset %>% mutate(date = paste0(Year, Month, Day))

# Sample of File 2: the dates to match against
filterdata <- data.frame(Year = c("1979","1979","1979"),
                         Month = c("01","01","01"),
                         Day = c("02","04","05"),
                         stringsAsFactors = FALSE)
filterdata <- filterdata %>% mutate(date = paste0(Year, Month, Day))

# Keep the rows of dataset whose date appears in filterdata,
# then keep only those where at least one station is >= 20
dataset %>%
  semi_join(filterdata, by = 'date') %>%
  filter(Stn1 >= 20 | Stn2 >= 20 | Stn3 >= 20) %>%
  select(-date)

The filter combines the three station conditions with the OR operator (|), so a row is kept if any station has a value of at least 20.
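If the real file has many station columns, writing one condition per column gets tedious. A possible variation, assuming a dplyr version that provides if_any() (1.0.4 or later) and that every station column name starts with "Stn", is to test all of them at once. It also joins on Year, Month and Day directly, so no helper date column is needed. This is only a sketch on the sample data from the question:

```r
library(dplyr)

# Sample data from the question; date parts stay character to keep leading zeros
dataset <- data.frame(Year  = rep("1979", 5),
                      Month = rep("01", 5),
                      Day   = c("01", "02", "03", "04", "05"),
                      Stn1  = c(10, 10, 0, 5, 0),
                      Stn2  = c(0, 1, 0, 10, 1),
                      Stn3  = c(5, 5, 0, 30, 3),
                      stringsAsFactors = FALSE)

filterdata <- data.frame(Year  = rep("1979", 3),
                         Month = rep("01", 3),
                         Day   = c("02", "04", "05"),
                         stringsAsFactors = FALSE)

result <- dataset %>%
  # keep rows whose Year/Month/Day combination also appears in filterdata
  semi_join(filterdata, by = c("Year", "Month", "Day")) %>%
  # keep rows where at least one column named Stn* is >= 20
  # (assumes all station columns share the "Stn" prefix)
  filter(if_any(starts_with("Stn"), ~ .x >= 20))

print(result)
```

With this form, adding Stn4, Stn5 and so on to the input should not require any change to the filter.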

Regards.
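The code above builds the data frames by hand, whereas the question asks to read two CSV files and save the result. A rough sketch of that part follows. The paths file1, file2 and out are hypothetical; temporary files filled with the sample data are used here only so the example runs as-is, and you would point them at your own files instead. The date columns are read as character so that "01" is not turned into 1.

```r
library(dplyr)

# Hypothetical paths: replace with the real locations of File 1, File 2 and the output
file1 <- tempfile(fileext = ".csv")
file2 <- tempfile(fileext = ".csv")
out   <- tempfile(fileext = ".csv")

# Write the sample data from the question so the sketch is self-contained
writeLines(c("Year,Month,Day,Stn1,Stn2,Stn3",
             "1979,01,01,10,0,5",
             "1979,01,02,10,1,5",
             "1979,01,03,0,0,0",
             "1979,01,04,5,10,30",
             "1979,01,05,0,1,3"), file1)
writeLines(c("Year,Month,Day",
             "1979,01,02",
             "1979,01,04",
             "1979,01,05"), file2)

# Read the date parts as character to preserve leading zeros;
# the remaining columns are type-detected as usual
date_classes <- c(Year = "character", Month = "character", Day = "character")
stations <- read.csv(file1, colClasses = date_classes)
dates    <- read.csv(file2, colClasses = date_classes)

result <- stations %>%
  # keep only the dates that also appear in File 2
  semi_join(dates, by = c("Year", "Month", "Day")) %>%
  # keep rows where any station is >= 20
  filter(Stn1 >= 20 | Stn2 >= 20 | Stn3 >= 20)

# Save without row names or quotes so the output matches the input layout
write.csv(result, out, row.names = FALSE, quote = FALSE)
```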
