Reputation: 2385
I want to grep the rows where the string before the . in column V1 and in column V2 is the same within a row. For instance, in the example below, row 1 would be such a case.
I guess I need to use gsub somehow, combined with ==, something like gsub( ".*$", "", out ).
> head(out)
V1 V2 V3 V4
1 hsa-miR-99b-5p.dataSerum hsa-miR-99b-5p.dataTissue 0.261887741880618 <NA>
2 hsa-miR-99b-3p.dataTissue hsa-miR-99b-5p.dataTissue 0.979410208303266 <NA>
3 hsa-miR-99b-3p.dataTissue hsa-miR-99b-5p.dataSerum 0.266705152258623 <NA>
4 hsa-miR-99b-3p.dataSerum hsa-miR-99b-5p.dataTissue 0.227329471105902 <NA>
5 hsa-miR-99b-3p.dataSerum hsa-miR-99b-5p.dataSerum 0.944112218530823 <NA>
6 hsa-miR-99b-3p.dataSerum hsa-miR-99b-3p.dataTissue 0.20025336348038 <NA>
Upvotes: 1
Views: 38
Reputation: 887048
We can try sub. Match the pattern dot (\\.) followed by zero or more characters (.*) and replace it with '' for columns 'V1' and 'V2', then use == to get the logical index and subset the rows.
v1 <- sub('\\..*', '', out$V1)
v2 <- sub('\\..*', '', out$V2)
out[v1==v2,]
# V1 V2 V3 V4
#1 hsa-miR-99b-5p.dataSerum hsa-miR-99b-5p.dataTissue 0.2618877 <NA>
#6 hsa-miR-99b-3p.dataSerum hsa-miR-99b-3p.dataTissue 0.2002534 <NA>
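For anyone who wants to try this without the original object, here is a minimal self-contained sketch. It rebuilds the six rows shown in the question by hand (assuming the columns are plain character vectors rather than factors) and wraps the sub() call in a small helper. The names prefix, same and matched are made up for illustration and are not part of the original code.

```r
# Rebuild the six rows printed in the question (V4 is NA throughout).
out <- data.frame(
  V1 = c("hsa-miR-99b-5p.dataSerum", "hsa-miR-99b-3p.dataTissue",
         "hsa-miR-99b-3p.dataTissue", "hsa-miR-99b-3p.dataSerum",
         "hsa-miR-99b-3p.dataSerum", "hsa-miR-99b-3p.dataSerum"),
  V2 = c("hsa-miR-99b-5p.dataTissue", "hsa-miR-99b-5p.dataTissue",
         "hsa-miR-99b-5p.dataSerum", "hsa-miR-99b-5p.dataTissue",
         "hsa-miR-99b-5p.dataSerum", "hsa-miR-99b-3p.dataTissue"),
  V3 = c(0.261887741880618, 0.979410208303266, 0.266705152258623,
         0.227329471105902, 0.944112218530823, 0.20025336348038),
  V4 = NA_character_,
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop everything from the first literal dot onward.
prefix <- function(x) sub("\\..*", "", x)

# TRUE where the part before the dot agrees between V1 and V2.
same <- prefix(out$V1) == prefix(out$V2)

# Keep only the matching rows.
matched <- out[same, ]
print(matched)
```

The dot has to be escaped: an unescaped . matches any character, so a pattern like ".*$" removes the whole string instead of only the suffix.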
Upvotes: 1