Reputation: 25
I have a CSV file that contains the following data:
category_list,Automotive & Sports,Blanks,Cleantech / Semiconductors,Entertainment,Health,Manufacturing,"News, Search and Messaging",Others,"Social, Finance, Analytics, Advertising"
,0,1,0,0,0,0,0,0,0
3D,0,0,0,0,0,1,0,0,0
3D Printing,0,0,0,0,0,1,0,0,0
3D Technology,0,0,0,0,0,1,0,0,0
Accounting,0,0,0,0,0,0,0,0,1
Active Lifestyle,0,0,0,0,1,0,0,0,0
Ad Targeting,0,0,0,0,0,0,0,0,1
Advanced Materials,0,0,0,0,0,1,0,0,0
Adventure Travel,1,0,0,0,0,0,0,0,0
I load it into a data frame named mapping:
mapping <- read.csv(file="mapping.csv", stringsAsFactors = FALSE,sep=",",check.names=FALSE)
The data looks as expected.
I am trying to add a new column whose value, for each row, is the name of the column that contains a 1 in that row. For example, for the 3D row, the new column should get the value “Manufacturing”. There can be only one "1" in each row.
When I run this command –
mapping$sector_names <- lapply(apply(mapping[2:9], 1, function(x) which(x=="1")),names)
it populates the sector_names column correctly.
The problem is that when I use apply over columns 2 through 10, it does not work; every value in sector_names is NULL:
mapping$sector_names <- lapply(apply(mapping[2:10], 1, function(x) which(x=="1")),names)
The strange thing is that when I use apply over columns 3 through 10, it works fine.
In short: apply over columns 2 through 10 does not work, but any other range (2 through 9, 3 through 10, etc.) does.
The difference seems to be that apply returns the column name along with the column number when I use columns 2 through 9, but only the column number when I use columns 2 through 10.
For example, the output of
apply(mapping[2:9], 1, function(x) which(x=="1"))
looks like this for each row:
[[2]]
Blanks
8
whereas the output of apply(mapping[2:10], 1, function(x) which(x=="1"))
looks like this for each row:
[[1]] 2
Could anyone please help?
Upvotes: 0
Views: 824
Reputation: 269586
1) If a is the result of the apply call in the question (the version over columns 2 through 10, which returns a plain vector of column positions), then just index the column names by it:
mapping$sector_names <- names(mapping)[-1][a]
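Some background on why the question's version returns NULL: apply simplifies its result to a plain vector when every row yields exactly one value, and it falls back to a list only when the per-row results differ in length. The simplified vector no longer carries the column names, so calling names on each element gives NULL. The sketch below illustrates this on a small made-up data frame; toy and its column names are hypothetical stand-ins for mapping, not the asker's data.

```r
# Hypothetical 3-row stand-in for mapping (names and values are illustrative).
toy <- data.frame(
  category_list = c("3D", "Accounting", "Adventure Travel"),
  Automotive    = c(0, 0, 1),
  Manufacturing = c(1, 0, 0),
  Social        = c(0, 1, 0),
  stringsAsFactors = FALSE
)

# Columns 2:3 leave the "Accounting" row with no 1, so the per-row results
# differ in length and apply() returns a list; each element keeps its
# column name, which is why lapply(..., names) appears to work.
partial <- apply(toy[2:3], 1, function(x) which(x == 1))
is.list(partial)

# Columns 2:4 give exactly one hit per row, so apply() simplifies to a
# plain integer vector of positions and the column names are lost.
a <- apply(toy[2:4], 1, function(x) which(x == 1))
is.list(a)

# Indexing the column names by that vector recovers the labels.
toy$sector_names <- names(toy)[-1][a]
toy$sector_names
```

This also accounts for columns 3 through 10 appearing to work in the question: the Adventure Travel row has its 1 in column 2, so leaving that column out gives one row with no match and apply returns a list again.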
2) Alternatively, define mapping1 to be the matrix which is the 0-1 part of mapping (i.e. all but the first column) and nc1 to be its number of columns. Multiplying that matrix by the vector 1, 2, 3, ... gives a vector of the column indexes of the 1's. Index the column names of mapping1 by that index vector. This involves no apply calls.
mapping1 <- as.matrix(mapping[-1])
nc1 <- ncol(mapping1)
mapping$sector_names <- colnames(mapping1)[mapping1 %*% seq_len(nc1)]
This gives:
> mapping$sector
[1] "Blanks"
[2] "Manufacturing"
[3] "Manufacturing"
[4] "Manufacturing"
[5] "Social, Finance, Analytics, Advertising"
[6] "Health"
[7] "Social, Finance, Analytics, Advertising"
[8] "Manufacturing"
[9] "Automotive & Sports"
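To see the matrix-product step in isolation, here is a minimal sketch on a small made-up data frame (toy, toy1 and the column names are hypothetical, not the asker's data). It also shows max.col as one possible alternative; that function is not part of the answer above and is included only for comparison.

```r
# Hypothetical 3-row stand-in for mapping (names and values are illustrative).
toy <- data.frame(
  category_list = c("3D", "Accounting", "Adventure Travel"),
  Automotive    = c(0, 0, 1),
  Manufacturing = c(1, 0, 0),
  Social        = c(0, 1, 0),
  stringsAsFactors = FALSE
)

# Keep only the 0-1 columns as a numeric matrix.
toy1 <- as.matrix(toy[-1])

# Each row times c(1, 2, 3) sums to the position of its single 1.
idx <- toy1 %*% seq_len(ncol(toy1))
by_product <- colnames(toy1)[idx]

# Alternative (an assumption, not from the answer): max.col() returns the
# position of each row's maximum, which is the 1 when the row has one.
by_maxcol <- colnames(toy1)[max.col(toy1, ties.method = "first")]

by_product
identical(by_product, by_maxcol)
```

Both variants rely on the guarantee stated in the question that every row contains exactly one 1. For a row of all zeros, the matrix product yields index 0, which R silently drops when indexing, so the result would come out shorter than the number of rows.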
Upvotes: 2