Reputation: 623
I have measured gas emission from jars in a long time series. My data set consists of three columns: date, time and jar.
The jars were measured in a time series, first according to "a", then "b" and then "c", but I don't have this information in my data set. I therefore want to add a new column that says whether each jar was measured according to "a", "b" or "c".
Nothing I have tried so far has given the expected outcome. Any ideas?
The data looks like this:
df <- structure(list(date = c("2021-03-14", "2021-03-14", "2021-03-14",
"2021-03-14", "2021-03-14", "2021-03-14", "2021-03-14", "2021-03-14",
"2021-03-14", "2021-03-14", "2021-03-14", "2021-03-14", "2021-03-15",
"2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15",
"2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15",
"2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15",
"2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15",
"2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15",
"2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15",
"2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15",
"2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15",
"2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15",
"2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15",
"2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15", "2021-03-15"
), time = c("23:55:00", "23:56:00", "23:57:00", "23:58:00", "23:59:00",
"00:01:00", "00:02:00", "00:03:00", "00:04:00", "00:05:00", "00:06:00",
"00:07:00", "00:08:00", "00:09:00", "00:10:00", "00:11:00", "00:12:00",
"00:13:00", "00:16:00", "00:17:00", "00:18:00", "00:19:00", "00:20:00",
"00:21:00", "00:22:00", "00:23:00", "00:24:00", "00:25:00", "00:26:00",
"00:27:00", "00:28:00", "00:29:00", "00:30:00", "00:31:00", "00:32:00",
"00:33:00", "00:34:00", "00:35:00", "00:36:00", "00:37:00", "00:38:00",
"00:39:00", "00:40:00", "00:41:00", "00:42:00", "00:43:00", "00:44:00",
"00:46:00", "00:47:00", "00:48:00", "00:49:00", "00:50:00", "00:51:00",
"00:52:00", "00:53:00", "00:54:00", "00:55:00", "00:56:00", "00:57:00",
"00:58:00", "00:59:00", "01:00:00", "01:01:00", "01:02:00", "01:03:00",
"01:04:00", "01:05:00", "01:06:00"), jar = c(1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L,
3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L
), expected.outcome = c("a", "a", "a", "a", "a", "a", "a", "a",
"a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a",
"a", "a", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b",
"b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "c", "c",
"c", "c", "c", "c", "c", "c", "c", "c", "c", "c", "c", "c", "c",
"c", "c", "c", "c", "c", "c", "c", "c")), class = "data.frame", row.names = c(NA,
-68L))
Upvotes: 0
Views: 51
Reputation: 26
The goal seems to be to add a new column based on changes in the "jar" column. In general form, with CONDITION1 and CONDITION2 as placeholders for your own criteria:
dt <- data.table::data.table(df)[, Gas:= ifelse(CONDITION1, "a", ifelse(CONDITION2, "b", "c"))]
For example, in your data, expected.outcome moves to the next letter every time jar jumps from 3 to 1 from one row to the next. (I'm not sure that is the exact logic you're looking for, because you mentioned a time series that changes after 40+ minutes, in which case you would need to adjust the criterion.) Based on that criterion, you could loop over the data frame and build the new column row by row.
The code below reproduces the expected outcome.
addGasVector <- function(df)
{
  gases <- c("a", "b", "c")
  # initial values
  Gas <- character(nrow(df)) # will become the new column
  previousJar <- 0
  currentGas <- "a"
  # loop through every row of df to fill the new column
  for (row in seq_len(nrow(df)))
  {
    currentJar <- df[row, "jar"]
    # criterion for a change of gas; adjust as needed
    if (previousJar == 3 && currentJar == 1)
      currentGas <- gases[match(currentGas, gases) + 1] # move to the next letter
    Gas[row] <- currentGas # store the value for this row
    previousJar <- currentJar # for the next iteration
  }
  df <- cbind(df, Gas) # add the new column
  return(df)
}
View(addGasVector(df))
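The same rule (move to the next letter whenever jar goes from 3 to 1) can probably also be written without a loop. This is only a sketch under that assumption; the function name add_gas_vectorised and the toy data frame are made up for illustration and are not from the question.

```r
# Vectorised sketch of the same rule: start a new letter each time
# jar is 1 on the current row and was 3 on the previous row.
add_gas_vectorised <- function(df, gases = c("a", "b", "c")) {
  prev_jar <- c(NA, head(df$jar, -1))   # jar value of the previous row
  new_cycle <- !is.na(prev_jar) & prev_jar == 3 & df$jar == 1
  # cumsum() counts the 3 -> 1 jumps seen so far; +1 turns that into an index.
  # If there are more jumps than letters, the extra rows get NA.
  df$Gas <- gases[cumsum(new_cycle) + 1]
  df
}

# Hypothetical toy data, only to show the behaviour
toy <- data.frame(jar = c(1L, 2L, 1L, 3L, 1L, 2L, 3L, 1L))
toy_result <- add_gas_vectorised(toy)
```

On your own data, comparing add_gas_vectorised(df)$Gas with df$expected.outcome using identical() is a quick way to check whether this rule is the logic you want.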
Upvotes: 1