Reputation: 539
I have a raw dataset that looks like this:
a619 a6641 a6672 a6741 a686 a6876 a689 a6946 a691
a6976 a40 a4019 b409 b4147 b4111 b416 b4167 b4178
b4186 b4198 b421 b4261 b4211 b4266 b4614 t4641 t4667
t4677 t4681 t4466 t4161 t4149 t4170 t4602 t4664 t461
t4691t t4764 t4767 f4792 f4948 f4988 f1086 f1168 f1184
f1189 f1207 f1222 f1691 f1429 k1468 k1467 k1162 k1149
k1619 k1666 k1669 k1767 k1719 k1772 k1776 k1782 p1827
p1872 p1914 p1921 p1914 p1992 p6 p6094 p6106 p6164
p6114 p6261 w6627 w6671 w6416 w6466 w6469 w6171 w6194
w6666 w6884 w6911 w7 w70 w7016 g7011 g7076 g7091
g7164 g7191 g7266 g7621 g7406 g7426 g7426 g7467 g7106
I put the raw data in data.txt and tried the following code to read it into a data frame:
library(data.table)
data <- fread("C:\\Desktop\\data.txt", header = FALSE)
My desired output is to pick out the elements with 'k' as the first letter:
k1468 k1467 k1162 k1149 k1619 k1666 k1669 k1767 k1719 k1772 k1776 k1782
There are no variable names corresponding to the columns. The only feature I found in this raw data is that each chunk of values starts with a different first letter. I want to extract the values whose first letter is 'k', that is, from k1468 to k1782. What syntax can achieve this in R?
Upvotes: 1
Views: 92
Reputation: 90
Since you want a vector of the required values, try converting your matrix into a vector and then using sapply as below:
d<-c();
sapply(as.vector(your_data_matrix), function(x) { if (substr(x, 1, 1) == 'k') { d <<- c(d, x); }}, USE.NAMES = FALSE);
Your required output will be stored in d.
EDIT:
For a data.table you will have to unlist it first and then do the sapply as follows:
d<-c();
sapply(as.vector(unlist(your_data_table)), function(x) { if (substr(x, 1, 1) == 'k') { d <<- c(d, x); }}, USE.NAMES = FALSE);
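As an alternative to growing d inside sapply, the same filter can probably be written as a single vectorised subset. This is only a minimal sketch: the small data frame below is a hypothetical stand-in for the table returned by fread (as.matrix should behave the same way on a data.table), and it uses just a few of the values from the question. Flattening through t(as.matrix(...)) reads the values row by row, so they keep the order they have in the file, whereas unlist goes column by column.

```r
# Hypothetical stand-in for the table read from data.txt;
# only a handful of values from the question are used.
your_data_table <- data.frame(
  V1 = c("f1429", "k1619", "p1872"),
  V2 = c("k1468", "k1666", "p1914"),
  V3 = c("k1467", "k1669", "p1921"),
  stringsAsFactors = FALSE
)

# Flatten row by row into a plain character vector.
values <- as.vector(t(as.matrix(your_data_table)))

# Keep only the entries whose first character is "k" (base R >= 3.3.0).
d <- values[startsWith(values, "k")]
print(d)
```

If the order does not matter, values <- unlist(your_data_table, use.names = FALSE) is a shorter way to flatten, and substr(values, 1, 1) == "k" can replace startsWith on older R versions.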
Upvotes: 1