Reputation: 3026
I have a table in R that looks like (below is just a sample):
| | 15 | 17 | 18 | 22 | 25 | 26 | 27 | 29 |
|-------|----|----|----|----|----|----|----|----|
| 10000 | 1 | 2 | 1 | 2 | 4 | 3 | 5 | 2 |
| 20000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 30000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 40000 | 0 | 0 | 0 | 1 | 2 | 3 | 6 | 3 |
| 50000 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| 60000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
The rows are income levels, and the columns are age levels. I am essentially creating this table to see if age is related to income via a Chi-squared test. The numbers in the table are counts of occurrences, e.g. there are 2 people aged 17 in my dataset with an income of 10000.
Both age and income level are of type "num" in R, so they are continuous.
I essentially want to combine the age columns so that I get a table counting everyone who has an income of 10k and is aged 15-25, 25-35, etc., so I end up with far fewer columns.
Note also that colnames(tbl) = "15","17", "18", not "Age" - I haven't defined an overarching name for my columns and rows.
I note that this answer does something similar, but I'm not sure how to apply it given that I don't have a name for my columns, e.g. "mpg" (in the case of the link).
Any ideas?
Upvotes: 0
Views: 319
Reputation: 524
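I made my own matrix here, but this should work for data frames as well.
mat <- matrix(sample(1:10, 8500, replace = TRUE), ncol = 85) # 100 rows, one column per age 15..99
colnames(mat) <- 15:99
# Assign each age (taken from the column names) to a 10-year bin: [15,25), [25,35), ...
levs <- cut(as.numeric(colnames(mat)), seq(15, 105, 10), right = FALSE)
# Sum the columns in each bin; drop = FALSE keeps a single-column bin as a matrix so rowSums still works
res <- sapply(as.character(unique(levs)), function(x) rowSums(mat[, levs == x, drop = FALSE]))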
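Applied to the sample table from the question, the same idea could look like the sketch below. The object names `tbl`, `ages`, `bins` and `binned`, and the breaks 15, 25, 35, are assumptions chosen for illustration; adjust the breaks to cover the full age range in the real data.

```r
# Sample table from the question: rows are incomes, columns are ages.
tbl <- matrix(
  c(1, 2, 1, 2, 4, 3, 5, 2,
    0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 1, 2, 3, 6, 3,
    0, 0, 0, 0, 0, 0, 1, 1,
    0, 0, 0, 0, 0, 0, 0, 0),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("10000", "20000", "30000", "40000", "50000", "60000"),
    c("15", "17", "18", "22", "25", "26", "27", "29")
  )
)

# The ages live only in the column names, so convert those to numbers.
ages <- as.numeric(colnames(tbl))

# Assumed bins: [15,25) and [25,35). right = FALSE puts age 25 in the second bin.
bins <- cut(ages, breaks = c(15, 25, 35), right = FALSE)

# Sum the columns that fall in each bin; drop = FALSE guards single-column bins.
binned <- sapply(levels(bins), function(b) rowSums(tbl[, bins == b, drop = FALSE]))
binned
```

No overarching name such as "Age" is needed here, because the binning works directly on `colnames(tbl)`.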
Edit: If you want the same colnames as in mat, but with each column holding the count for its category, additionally do:
res <- res[, as.character(levs)] # expands res to one category-count column per original column in mat; indexing by name avoids relying on factor codes
colnames(res) <- colnames(mat) # renames the columns to match the input matrix mat
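For the Chi-squared test mentioned in the question, a binned table can be passed to `chisq.test()` once all-zero rows and columns are removed. The sketch below is only an illustration: the counts are what the question's sample gives under assumed bins [15,25) and [25,35), and the names `binned`, `kept` and `test` are hypothetical.

```r
# Binned counts: rows are incomes, columns are age bins (illustrative values).
binned <- matrix(
  c(6, 14,
    0,  0,
    0,  0,
    1, 14,
    0,  2,
    0,  0),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("10000", "20000", "30000", "40000", "50000", "60000"),
    c("[15,25)", "[25,35)")
  )
)

# All-zero rows or columns give zero expected counts, so drop them first.
kept <- binned[rowSums(binned) > 0, colSums(binned) > 0, drop = FALSE]

# With counts this small the chi-squared approximation is unreliable, so
# simulate.p.value = TRUE asks for a Monte Carlo p-value instead.
set.seed(1) # only to make the simulated p-value reproducible
test <- chisq.test(kept, simulate.p.value = TRUE, B = 2000)
test$p.value
```

Whether a simulated p-value is appropriate depends on the real data; with larger counts the default `chisq.test(kept)` may be enough.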
Upvotes: 1