jchan

Reputation: 111

Creating a list of sliced dataframes

I am trying to create a list of dataframes where each dataframe is 3 rows of a larger dataframe.

    dframes = [df[0:3], df[3:6],...,df[2000:2003]]

I am still fairly new to programming. Why does the following loop fail?

    x = 3
    dframes = []
    for i in range(0, len(df)):
        dframes = dframes.append(df[i:x])
        i = x
        x = x + 3

It raises this error:

    dframes = dframes.append(df[i:x])
    AttributeError: 'NoneType' object has no attribute 'append'

Upvotes: 2

Views: 65

Answers (3)

piRSquared

Reputation: 294488

Use np.split

Setup
Consider the dataframe df

df = pd.DataFrame(dict(A=range(15), B=list('abcdefghijklmno')))

Solution

dframes = np.split(df, range(3, len(df), 3))

Output

for d in dframes:
    print(d, '\n')

   A  B
0  0  a
1  1  b
2  2  c 

   A  B
3  3  d
4  4  e
5  5  f 

   A  B
6  6  g
7  7  h
8  8  i 

     A  B
9    9  j
10  10  k
11  11  l 

     A  B
12  12  m
13  13  n
14  14  o 
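
To see how the split points work, the same call can be tried on a plain NumPy array. This is only an illustrative sketch: the array arr and its length of 10 are assumptions, chosen to show what happens when the length is not a multiple of 3.

```python
import numpy as np

# Assumed example data: 10 elements, deliberately not a multiple of 3.
arr = np.arange(10)

# Split points are the positions where each new chunk starts: 3, 6, 9.
split_points = list(range(3, len(arr), 3))
chunks = [c.tolist() for c in np.split(arr, split_points)]

print(split_points)  # [3, 6, 9]
print(chunks)        # [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
```

The last chunk simply holds whatever is left over, so no padding or length check is needed.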

Upvotes: 4

jezrael

Reputation: 863266

You can use a list comprehension over groupby, grouping by a NumPy array of row positions floor-divided by 3:

import numpy as np
import pandas as pd
np.random.seed(100)
df = pd.DataFrame(np.random.randint(10, size=(10,5)), columns=list('ABCDE'))
print(df)
   A  B  C  D  E
0  8  8  3  7  7
1  0  4  2  5  2
2  2  2  1  0  8
3  4  0  9  6  2
4  4  1  5  3  4
5  4  3  7  1  1
6  7  7  0  2  9
7  9  3  2  5  8
8  1  0  7  6  2
9  0  8  2  5  1

dfs = [x for i, x in df.groupby(np.arange(len(df.index)) // 3)]
print (dfs)
[   A  B  C  D  E
0  8  8  3  7  7
1  0  4  2  5  2
2  2  2  1  0  8,    A  B  C  D  E
3  4  0  9  6  2
4  4  1  5  3  4
5  4  3  7  1  1,    A  B  C  D  E
6  7  7  0  2  9
7  9  3  2  5  8
8  1  0  7  6  2,    A  B  C  D  E
9  0  8  2  5  1]

If the index is the default monotonic one (0, 1, 2, ...), the solution can be simplified:

dfs = [x for i, x in df.groupby(df.index // 3)]
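
The difference between the two grouping keys shows up when the index is not 0, 1, 2, .... The following sketch compares them on an assumed frame whose index starts at 100 (this frame is made up for illustration and is not the one from the answer).

```python
import numpy as np
import pandas as pd

# Assumed example: 7 rows with an index starting at 100 instead of 0.
df = pd.DataFrame({'A': range(7)}, index=range(100, 107))

# Positional key: 0,0,0,1,1,1,2 regardless of the index values.
by_position = [g for _, g in df.groupby(np.arange(len(df.index)) // 3)]

# Index-based key: 33,33,34,34,34,35,35, so chunks follow the index values.
by_index = [g for _, g in df.groupby(df.index // 3)]

print([len(g) for g in by_position])  # [3, 3, 1]
print([len(g) for g in by_index])     # [2, 3, 2]
```

So the shorter form gives 3-row chunks only when the index really is the default one. The positional key works for any index.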

Upvotes: 2

amarynets

Reputation: 1815

Python raises this error because list.append modifies the list in place and returns None. After the first assignment, dframes is None, so the next iteration of your loop calls append on None.

You can use this:

dframes = [df[i:i+3] for i in range(0, len(df), 3)]
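
To make the None behaviour concrete, here is a minimal sketch. A plain list named rows is used as a hypothetical stand-in for df so the example runs without pandas. Integer slices such as rows[i:i+3] pick positions the same way df[i:i+3] picks rows.

```python
# Hypothetical stand-in for df: nine "rows" in a plain list.
rows = list(range(9))

# list.append mutates the list in place and returns None.
chunks = []
result = chunks.append(rows[0:3])
print(result)   # None
print(chunks)   # [[0, 1, 2]]

# Corrected loop: call append without reassigning, and step by 3.
fixed = []
for start in range(0, len(rows), 3):
    fixed.append(rows[start:start + 3])
print(fixed)    # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
```

Note also that assigning to i inside a for loop over range has no lasting effect, because the loop rebinds i on the next iteration. Stepping the range by 3 does that job instead.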

Upvotes: 2
