Reputation: 1949
I have a huge text file that I import into Mathematica. It looks something like this:
In[9]:=import=SplitBy[Import["textfile.txt","List"],"\\t"];
Out[9]:={{ A 021 2.3 A 002 2.6},{ A 012 2.3 A 001 2.6},{ A 120 2.6 A 111 2.9},{ A 122 2.8 A 121 2.8},{ A 000 1.3 A 121 2.9},{ A 110 2.4 A 111 2.9},{ G 010 2.3 G 001 2.6},{ G 000 2.2 G 001 2.3 G 010 2.4},{ G 010 2.3 G 001 2.6},{ G 110 2.3 G 101 2.6}}
EDIT: note that all elements are separated by a \\t character.
This is a list of strings such that
In[12]:= Head@import
Head@import[[1]]
Head@import[[All, 1]]
Head@import[[1, 1]]
Out[12]= List
Out[13]= List
Out[14]= List
Out[15]= String
My big problem is converting this list into a manageable list of elements, so that I can search for the entries where G is present and not those where A is present. I tried replacing the tab characters in the strings with another separator, but I still can't treat the data the way I want, because it doesn't allow me to search for individual G elements. Ideally, what I want to get in the end is
{{G,010,2.3},{G,001,2.6},{G,000,2.2},{G,001,2.3},{G,010,2.4},{G,010,2.3},{G,001,2.6},{G,110,2.3},{G,101,2.6}}
I already know that I will have to use the Take command, the Partition command to split the sublist into sublists of 3 elements, and so on. But because I'm not even able to get the data into a list of lists, I can't do this.
Further, when importing I have to select the "List" type. If I imported as "Table", everything would already be half done, but the elements "001" would become "1".
Could you guys help me out please? All help is appreciated! Thanks
Upvotes: 0
Views: 1840
Reputation: 78316
I don't have Mathematica on this machine, so my syntax may be a bit awry.
Doesn't
niceList = Partition[Flatten[import],3]
produce a list of lists, where each list at the inner level comprises 3 strings? Then, something like
Select[niceList,#[[1]]=="G"&]
should select the sub-lists which have a "G" as the first element.
EDIT
If I understand you now, you mean that in your variable import you have a list of lists, each of the lower level lists, such as
{ A 021 2.3 A 002 2.6}
contains a single string? In other words
FullForm[ A 021 2.3 A 002 2.6]
returns
" A 021 2.3 A 002 2.6"
I would import the data, replace all the tab characters with spaces and then use StringSplit[] (at the right level) to turn each string into a list of strings. Then Flatten, Partition, etc. You might find it easiest to start by importing the entire contents of the file into a single string at first.
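The steps above could be sketched roughly as follows. This is only an illustration under assumptions: the two sample lines are made up to resemble the data in the question, and in practice they might come from something like Import["textfile.txt", "Lines"]. StringSplit with no second argument splits on any whitespace (tabs included) and drops leading whitespace, so the separate tab-to-space replacement is not strictly needed here.

```mathematica
(* Hypothetical sample lines, tab-separated like the file in the question *)
lines = {"A\t021\t2.3\tG\t002\t2.6",
         "G\t000\t2.2\tG\t001\t2.3\tG\t010\t2.4"};

(* Split every line on whitespace and merge the pieces into one flat list *)
tokens = Flatten[StringSplit /@ lines];

(* Regroup the flat list into {letter, code, value} triples *)
records = Partition[tokens, 3];

(* Keep only the triples whose first element is "G" *)
gRecords = Select[records, First[#] === "G" &]
```

All three fields stay strings in this sketch, so a code such as "001" keeps its leading zeros; the last field can be converted afterwards with ToExpression if numbers are needed.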
Upvotes: 2
Reputation: 24336
In the future it would be very helpful if you could include a sample of the actual file you are importing. Nevertheless I believe I can guess the format of the file with sufficient accuracy to recommend this:
data = ReadList["textfile.txt", {Word, Number, Number}]
If the file is in the format that I hope, it should return:
{{"A", 21, 2.3}, {"A", 2, 2.6}, {"A", 12, 2.3}, {"A", 1, 2.6}, {"A",
120, 2.6}, {"A", 111, 2.9}, {"A", 122, 2.8}, {"A", 121, 2.8}, {"A",
0, 1.3}, {"A", 121, 2.9}, {"A", 110, 2.4}, {"A", 111, 2.9}, {"G",
10, 2.3}, {"G", 1, 2.6}, {"G", 0, 2.2}, {"G", 1, 2.3}, {"G", 10,
2.4}, {"G", 10, 2.3}, {"G", 1, 2.6}, {"G", 110, 2.3}, {"G", 101,
2.6}}
From there, getting records that start with "G" can be done with any of these at your preference:
Cases[data, {"G", ___}]
Select[data, "G" === #[[1]] &]
Pick[data, First /@ data, "G"]
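Note that reading the middle field as Number turns "021" into 21, which the question wanted to avoid. One possible variation, assuming the file really consists of whitespace- or tab-separated triples, is to read that field as a Word instead so it stays a string. The sketch below uses StringToStream with made-up sample text so it runs without the file; with the real file, the first argument of ReadList would be "textfile.txt".

```mathematica
(* Made-up sample text standing in for the contents of textfile.txt *)
stream = StringToStream[
   "A\t021\t2.3\tA\t002\t2.6\nG\t010\t2.3\tG\t001\t2.6\n"];

(* Read the letter and the code as words (strings), the value as a number *)
data = ReadList[stream, {Word, Word, Number}];
Close[stream];

(* Select the records whose first element is "G" *)
Cases[data, {"G", __}]
```

With this reading pattern the codes should come back as strings such as "010" rather than integers, while the last field is still numeric.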
Upvotes: 4