Reputation: 8494
For an example dataframe:
df = structure(list(country = c("AT", "AT", "AT", "BE", "BE", "BE",
"DE", "DE", "DE", "DE", "DE", "DE", "DE", "DE", "DE", "DE", "DE",
"DE", "DE", "DE"), level = c("1", "1", "1", "1", "1", "1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1"
), region = c("AT2", "AT1", "AT3", "BE2", "BE1", "BE3", "DE4",
"DE3", "DE9", "DE7", "DE1", "DEE", "DEG", "DE2", "DED", "DEB",
"DEA", "DEF", "DE6", "DE8"), N = c("348", "707", "648", "952",
"143", "584", "171", "155", "234", "176", "302", "144", "148",
"386", "257", "126", "463", "74", "44", "119"), result = c("24.43",
"26.59", "20.37", "23.53", "16.78", "25.51", "46.2", "43.23",
"41.03", "37.5", "33.44", "58.33", "47.97", "34.46", "39.69",
"31.75", "36.93", "43.24", "36.36", "43.7")), .Names = c("country",
"level", "region", "N", "result"), class = c("data.table", "data.frame"
), row.names = c(NA, -20L))
I am using the following code to create a summary dataframe, listing the max and min values by country:
variable_country <- setDT(df)[order(country), list(min_result = min(result), max_result = max(result)), by = c("country")]
I also wish to include the variable 'level' from 'df'. How would I do this in R? That is, my variable_country dataframe would have an extra column showing that these countries are at level 1. The dataframe should just gain that one column and still have three observations (one for each country). All observations for each country are at the same level.
Upvotes: 1
Views: 85
Reputation: 887891
If there is only a single 'level' for each 'country', we can create the summarised dataset by including the first observation of 'level' (level[1L]).
setDT(df)[order(country), list(min_result = min(result),
    max_result = max(result), level = level[1L]), by = country]
Having said that, another option would be to use 'level' as an additional grouping variable, i.e. by = .(country, level) in the code (as suggested by @David Arenburg).
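For illustration, the grouping-variable option could look roughly like the sketch below. It uses a small subset of the question's data (the object name dt is just a placeholder), and it assumes you want numeric rather than alphabetical min/max: 'result' is stored as character in the example, so the sketch converts it first. That conversion is my assumption about the intent, not part of the original code.

```r
library(data.table)

# Illustrative subset of the question's data; 'dt' is a placeholder name
dt <- data.table(
  country = c("AT", "AT", "AT", "BE", "BE", "BE"),
  level   = "1",  # length-1 value is recycled to every row
  region  = c("AT2", "AT1", "AT3", "BE2", "BE1", "BE3"),
  result  = c("24.43", "26.59", "20.37", "23.53", "16.78", "25.51")
)

# Assumption: numeric comparison is wanted. On a character column,
# min()/max() compare strings rather than numbers.
dt[, result := as.numeric(result)]

# Grouping by both columns carries 'level' into the summary as a key column
variable_country <- dt[order(country),
                       .(min_result = min(result), max_result = max(result)),
                       by = .(country, level)]
print(variable_country)
```

Because every country has a single level here, this should still give one row per country. If a country ever had more than one level, grouping this way would return one row per country/level pair, whereas level[1L] would silently keep only the first level.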
Upvotes: 3