Reputation: 525
I have a data frame with 40 rows and 10 columns. I want to derive a new column called Prediction that equals the 'Value' column for rows whose index is less than 17. What I mean is:
ID Year Value
1 2016 114235
2 2016 114235
3 2016 114235
4 2016 114235
5 2016 114235
Then:
ID Year Value Prediction
1 2016 114235 114235
2 2016 114235 114235
3 2016 114235 114235
4 2016 114235 114235
5 2016 114235 114235
I tried the following, but every row of the new column was NA.
newdata$Prediction <- ifelse(nrow(newdata) <= 17, newdata$Value, NA)
newdata$Prediction <- lapply(newdata, function(x) ifelse(nrow(newdata) <= 17, newdata$Value, NA))
It doesn't work. How can I do that?
Upvotes: 0
Views: 999
Reputation: 2496
Instead of using
newdata$Prediction <- ifelse(nrow(newdata) <= 17, newdata$Value, NA)
you can use something like
newdata$Prediction <- ifelse(as.numeric(rownames(newdata)) <= 17, newdata$Value, NA)
The difference here is in understanding how nrow() and rownames() work.
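To make that difference concrete, here is a minimal sketch on a 5-row toy version of the sample data, with a threshold of 3 standing in for 17. The names res_scalar and res_rowwise are made up for illustration.

```r
# Toy data matching the question's sample (5 rows instead of 40)
newdata <- data.frame(ID = 1:5, Year = 2016, Value = 114235)

# nrow() returns one number for the whole data frame, so the test has
# length 1 and ifelse() returns a length-1 result
res_scalar <- ifelse(nrow(newdata) <= 3, newdata$Value, NA)
length(res_scalar)   # 1; assigning it to a column recycles that single NA

# rownames() returns one label per row, so the test is evaluated row by row
res_rowwise <- ifelse(as.numeric(rownames(newdata)) <= 3, newdata$Value, NA)
res_rowwise          # one element per row
```

Note that as.numeric(rownames(newdata)) only works while the row names are still the default "1", "2", ... labels.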
For example, with a threshold of 3, your sample input returns
  ID Year  Value Prediction
1  1 2016 114235     114235
2  2 2016 114235     114235
3  3 2016 114235     114235
4  4 2016 114235         NA
5  5 2016 114235         NA
While the methods mentioned in the comments to your question are perfectly valid, I'm still posting this because your attempt wasn't too far off.
Alternatively, you could also try using tidyverse functions:
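library(dplyr)

newdata %>%
  mutate(rn = 1:n()) %>%
  # if_else() needs a typed NA as the `false` value, not NULL;
  # use NA_integer_ instead if Value is an integer column
  mutate(Prediction = if_else(rn <= 3, Value, NA_real_)) %>%
  select(-rn)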
Upvotes: 2
Reputation: 695
I think that you can just change one small thing in your code to get what you want.
newdata$Prediction <- ifelse(newdata$ID <= 17, newdata$Value, NA)
You already have an ID column that appears to be sorted, so it acts like the row number. nrow() just returns the number of rows as a single value. In your case that number is greater than 17, so the condition is FALSE once for the whole data frame and every row gets NA.
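Using ID this way relies on the assumption that it is sorted and matches the row position. As a rough sketch of what happens when it does not, the made-up data below has the rows in descending ID order, and a threshold of 3 stands in for 17. seq_len(nrow(newdata)) is one way to test on row position instead.

```r
# Hypothetical data: same columns as the question, but sorted by ID descending
newdata <- data.frame(ID = 5:1, Year = 2016, Value = c(50, 40, 30, 20, 10))

# Test on the ID values
by_id <- ifelse(newdata$ID <= 3, newdata$Value, NA)

# Test on the row position, independent of what ID contains
by_pos <- ifelse(seq_len(nrow(newdata)) <= 3, newdata$Value, NA)

by_id    # NA NA 30 20 10  (rows whose ID is 1, 2 or 3)
by_pos   # 50 40 30 NA NA  (the first three rows)
```

If ID really is 1, 2, 3, ... in order, both versions give the same result.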
Upvotes: 2
Reputation: 9809
Do you need the lapply function?
You could just do something like this:
n <- 20
newdata <- data.frame(ID = 1:n,
                      Year = rep(2016, n),
                      Value = rep(114235, n))
newdata$Prediction <- newdata$Value
# Set rows 17 onward to NA, but only if the data has that many rows
if (nrow(newdata) >= 17) {
  newdata[17:nrow(newdata), ]$Prediction <- NA
}
newdata
This way, the data is left unchanged if there are fewer than 17 rows. Without the check, the assignment would add new rows and fill them with NA.
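As a possible alternative to the explicit check, a logical index avoids the problem altogether, because it never refers to positions past the end of the data. The helper mask_after below is hypothetical (it does not come from any package), and this is only a sketch of the idea.

```r
# Hypothetical helper: set every element after position n to NA
mask_after <- function(x, n) {
  x[seq_along(x) > n] <- NA   # the index is all FALSE when length(x) <= n
  x
}

short <- data.frame(Value = rep(114235, 5))
long  <- data.frame(Value = rep(114235, 20))

# n = 16 keeps rows 1 to 16, matching the code above (row 17 onward becomes NA)
short$Prediction <- mask_after(short$Value, 16)
long$Prediction  <- mask_after(long$Value, 16)

nrow(short)                   # still 5 rows, none set to NA
sum(is.na(long$Prediction))   # 4 (rows 17 to 20)
```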
Upvotes: 2