Reputation: 901
I have a data frame like this:
head(test)
sku array
1 AQ665ELABLKLANID-81796 0,0,0,1,1,1,2
2 AQ665ELABLKMANID-81797 2,0,0,0,1,1,0,0,1
3 AQ665ELABLKNANID-81798 0,1,2,1,1,0,4,1
4 AQ665ELABLKOANID-81799 0,1,0,1
5 AQ665ELABLKPANID-81800 1,4,4,2,3,7,2,2
6 AQ665ELABLKRANID-81802 0,1,1,0
I would like to add a column named first that contains, for each row, the first element of array:
test$first = strsplit(test$array,",")[[1]][1]
But what I get is the following:
head(test)
sku array first
1 AQ665ELABLKLANID-81796 0,0,0,1,1,1,2 0
2 AQ665ELABLKMANID-81797 2,0,0,0,1,1,0,0,1 0
3 AQ665ELABLKNANID-81798 0,1,2,1,1,0,4,1 0
4 AQ665ELABLKOANID-81799 0,1,0,1 0
5 AQ665ELABLKPANID-81800 1,4,4,2,3,7,2,2 0
6 AQ665ELABLKRANID-81802 0,1,1,0 0
I don't understand why every row gets the value from the first row's array.
Upvotes: 0
Views: 47
Reputation: 193517
I suppose some regex could also be of use here. Something along the lines of the following might come in handy:
gsub("(^[0-9]+)(,.*)", "\\1", test$array)
# [1] "0" "2" "0" "0" "1" "0"
gsub("(^.*?),(.*)", "\\1", test$array, perl=TRUE)
# [1] "0" "2" "0" "0" "1" "0"
There are some packages (like "stringi" and "stringr") that make this kind of stuff easier to do.
library(stringi)
stri_extract_first_regex(test$array, pattern="[0-9]+")
# [1] "0" "2" "0" "0" "1" "0"
This also lets you easily extract the last value with:
stri_extract_last_regex(test$array, pattern="[0-9]+")
# [1] "2" "1" "1" "1" "2" "0"
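If you would rather stay in base R, the same idea can be sketched with sub(), which replaces only the first match. The vector x below is a stand-in for test$array (three of the values from the question), and the conversion with as.integer() is my own addition, since all of the approaches above return character strings.

```r
# Stand-in for test$array: rows 1, 2 and 5 from the question
x <- c("0,0,0,1,1,1,2", "2,0,0,0,1,1,0,0,1", "1,4,4,2,3,7,2,2")

# Drop everything from the first comma onwards, leaving the first value
first_chr <- sub(",.*", "", x)

# ".*" is greedy, so this drops everything up to the last comma
last_chr <- sub(".*,", "", x)

# The results are character strings; convert if numbers are needed
first_num <- as.integer(first_chr)
```

A string with no comma at all is returned unchanged by both calls, so single-element arrays should still work.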
Upvotes: 1
Reputation: 93813
I think you actually want:
test$first <- sapply(strsplit(test$array,","),"[",1)
test
# sku array first
#1 AQ665ELABLKLANID-81796 0,0,0,1,1,1,2 0
#2 AQ665ELABLKMANID-81797 2,0,0,0,1,1,0,0,1 2
#3 AQ665ELABLKNANID-81798 0,1,2,1,1,0,4,1 0
#4 AQ665ELABLKOANID-81799 0,1,0,1 0
#5 AQ665ELABLKPANID-81800 1,4,4,2,3,7,2,2 1
#6 AQ665ELABLKRANID-81802 0,1,1,0 0
In your attempt, strsplit(test$array,",")[[1]] gives you the split-apart version of test$array[1], from which you then subset the first element, which happens to be 0. That single value is then recycled down the whole column, so all your values end up being 0.
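To make the difference visible, here is a small sketch on the first three rows of the example data. The names x, parts and df are just for illustration, and vapply() is used as a stricter stand-in for the sapply() call above (it checks that every result is a single string).

```r
# First three values of test$array from the question
x <- c("0,0,0,1,1,1,2", "2,0,0,0,1,1,0,0,1", "0,1,2,1,1,0,4,1")

# strsplit() returns a list with one character vector per row
parts <- strsplit(x, ",")

# [[1]] keeps only the first row's pieces; [1] then takes its first piece
only_row1 <- parts[[1]][1]

# Assigning that single value to a column recycles it for every row
df <- data.frame(array = x, stringsAsFactors = FALSE)
df$first <- only_row1

# Taking element 1 of every list entry gives one value per row instead
first_each <- vapply(parts, `[`, character(1), 1)
```

Here only_row1 has length one, while first_each has one entry per row of x, which is what the new column needs.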
Upvotes: 2