Reputation: 63984
I have data that looks like this:
1.00 1.00 1.00
3.23 4.23 0.33
1.23 0.13 3.44
4.55 12.3 14.1
2.00 2.00 2.00
1.21 1.11 1.11
3.55 5.44 5.22
4.11 1.00 4.00
It comes in chunks of 4 lines. The first line of each chunk is the index and the rest are the values. A chunk always has 4 lines, but the number of columns can be more than 3.
For example:
1.00 1.00 1.00 <- 1st chunk, the index = 1
3.23 4.23 0.33 <- values
1.23 0.13 3.44 <- values
4.55 12.3 14.1 <- values
My example above contains only 2 chunks, but the real data can contain more.
What I want to do is to create a dictionary of data frames so I can process them chunk by chunk. Namely from this:
In [1]: import pandas as pd
In [2]: df = pd.read_table("http://dpaste.com/29R0BSS.txt",header=None, sep = " ")
In [3]: df
Out[3]:
0 1 2
0 1.00 1.00 1.00
1 3.23 4.23 0.33
2 1.23 0.13 3.44
3 4.55 12.30 14.10
4 2.00 2.00 2.00
5 1.21 1.11 1.11
6 3.55 5.44 5.22
7 4.11 1.00 4.00
Into a dictionary of data frames, such that I can do something like this (I did this by hand):
>> # Let's call the new data frame `nd`.
>> nd[1]
0 1 2
0 3.23 4.23 0.33
1 1.23 0.13 3.44
2 4.55 12.30 14.10
Upvotes: 1
Views: 121
Reputation: 353009
There are lots of ways to do this; I tend to use `groupby`, e.g. something like
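>>> import numpy as np
>>> # Integer-divide the row positions by 4 so rows 0-3 form group 0, rows 4-7 group 1, ...
>>> grouped = df.groupby(np.arange(len(df)) // 4)
>>> d = {v.iloc[0, 0]: v.iloc[1:].reset_index(drop=True) for _, v in grouped}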
>>> for k,v in d.items():
... print(k)
... print(v)
...
1.0
0 1 2
0 3.23 4.23 0.33
1 1.23 0.13 3.44
2 4.55 12.30 14.10
2.0
0 1 2
0 1.21 1.11 1.11
1 3.55 5.44 5.22
2 4.11 1.00 4.00
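For anyone who wants to try this without fetching the dpaste link, here is a minimal self-contained sketch of the same groupby idea. The sample data is copied from the question; the helper name `split_chunks` and its `chunk_size` parameter are my own inventions for illustration, and I'm assuming the file is whitespace-separated.

```python
import io

import numpy as np
import pandas as pd

# Sample data copied from the question, inlined so the sketch runs offline.
RAW = """\
1.00 1.00 1.00
3.23 4.23 0.33
1.23 0.13 3.44
4.55 12.3 14.1
2.00 2.00 2.00
1.21 1.11 1.11
3.55 5.44 5.22
4.11 1.00 4.00
"""


def split_chunks(df, chunk_size=4):
    """Hypothetical helper: map each chunk's index value to its value rows."""
    # Rows 0-3 get label 0, rows 4-7 get label 1, and so on.
    labels = np.arange(len(df)) // chunk_size
    return {
        # The first row of a chunk holds the index; the rest are the values.
        chunk.iloc[0, 0]: chunk.iloc[1:].reset_index(drop=True)
        for _, chunk in df.groupby(labels)
    }


# Assumption: columns are separated by one or more whitespace characters.
df = pd.read_csv(io.StringIO(RAW), sep=r"\s+", header=None)
nd = split_chunks(df)
print(nd[1])
```

Note that the keys come out as floats (1.0, 2.0), but `nd[1]` still works because Python treats 1 and 1.0 as the same dictionary key. If two chunks ever share the same index value, the later one would silently overwrite the earlier one.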
Upvotes: 5