Reputation: 2319
I have created an empty DataFrame with named columns, but I did not specify an index:
columns = ['C1','C2']
emp=pd.DataFrame(columns=columns)
I want to populate the emp dataframe with the output I get from a for loop. For example:
j=0
for i in iset:
    emp[j]["C1"]=i
    emp[j]["C2"]=i*i
So, as a result, assuming iset is 2, 3, 4, I would like to have:
       C1  C2
index
1       2   4
2       3   9
3       4  16
How could I do it? Any suggestions are welcome, thanks for the help.
Upvotes: 1
Views: 5689
Reputation: 25189
If you want to fill your df row by row in a for loop, the following will do:
emp=pd.DataFrame(columns=['C1','C2'])
iset = [2,3,4]
for i,j in enumerate(iset):
    emp.loc[i] = [j, j*j]
emp
    C1    C2
0  2.0   4.0
1  3.0   9.0
2  4.0  16.0
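The question asked for an index starting at 1 and integer values, so here is a minimal sketch of the same `.loc` approach adjusted for that. The `start=1` argument, the final `astype` cast, and the index name are my own additions, not part of the answer above, and the dtype you get before the cast may differ between pandas versions.

```python
import pandas as pd

iset = [2, 3, 4]
emp = pd.DataFrame(columns=["C1", "C2"])

# enumerate(..., start=1) yields the labels 1, 2, 3 instead of 0, 1, 2
for label, value in enumerate(iset, start=1):
    # assigning to a label that does not exist yet adds a new row
    emp.loc[label] = [value, value * value]

# rows added this way may come out as float or object, so cast explicitly
emp = emp.astype("int64")
emp.index.name = "index"
print(emp)
```

This still grows the frame one row at a time, so it is only reasonable for small inputs.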
Upvotes: 1
Reputation: 109528
It is generally very inefficient to append to a DataFrame row by row, because each append returns a new copy of the DataFrame, which results in quadratic copying. You would be better off building each column as a list and then creating the DataFrame from those lists.
iset = [2, 3, 4]
c1 = []
c2 = []
for i in iset:
    c1.append(i)
    c2.append(i * i)
emp = pd.DataFrame({'C1': c1, 'C2': c2})
>>> emp
   C1  C2
0   2   4
1   3   9
2   4  16
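If the loop naturally produces one record at a time, a variation on the same idea is to collect each row as a dict and build the DataFrame once at the end. This is only a sketch; the `rows` name is mine, and passing `columns=` is just there to pin the column order.

```python
import pandas as pd

iset = [2, 3, 4]

# collect plain Python dicts inside the loop; this is cheap
rows = []
for i in iset:
    rows.append({"C1": i, "C2": i * i})

# build the DataFrame a single time, after the loop has finished
emp = pd.DataFrame(rows, columns=["C1", "C2"])
print(emp)
```

The frame is allocated once, so the copying cost does not grow with every iteration.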
Timings
%%timeit
iset = range(1000)
emp = pd.DataFrame(columns=['C1', 'C2'])
for i in iset:
    emp = emp.append({'C1': i, 'C2': i * i}, ignore_index=True)
1 loops, best of 3: 1.79 s per loop
%%timeit
iset = range(1000)
c1 = []
c2 = []
for i in iset:
    c1.append(i)
    c2.append(i * i)
emp = pd.DataFrame({'C1': c1, 'C2': c2})
1000 loops, best of 3: 779 µs per loop
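Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0, so the first timing cell will not run on current versions. Below is a rough sketch of how the comparison could be repeated outside IPython with the standard `timeit` module. I have substituted a `.loc` row assignment for the slow path, and the function names and the size `n = 200` are my own choices; the numbers will depend on your machine and pandas version.

```python
import timeit

import pandas as pd


def build_row_by_row(n):
    # slow path: grow the DataFrame one row at a time
    emp = pd.DataFrame(columns=["C1", "C2"])
    for i in range(n):
        emp.loc[i] = [i, i * i]
    return emp


def build_from_lists(n):
    # fast path: fill plain lists, then build the DataFrame once
    c1 = []
    c2 = []
    for i in range(n):
        c1.append(i)
        c2.append(i * i)
    return pd.DataFrame({"C1": c1, "C2": c2})


n = 200
slow = timeit.timeit(lambda: build_row_by_row(n), number=3)
fast = timeit.timeit(lambda: build_from_lists(n), number=3)
print(f"row by row: {slow / 3:.4f} s per run")
print(f"from lists: {fast / 3:.4f} s per run")
```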
Upvotes: 2
Reputation: 12108
You can also build the DataFrame directly from iset, without an explicit loop. Something like this:
>>> iset
[2, 3, 4]
>>> # a list comprehension works on both Python 2 and 3 (map returns an iterator on Python 3)
>>> pd.DataFrame({'C1': iset, 'C2': [x * x for x in iset]})
   C1  C2
0   2   4
1   3   9
2   4  16
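When the second column is a simple function of the first, as it is here, you can also let pandas do the arithmetic on the whole column. This is a sketch of that idea rather than anything from the answers above:

```python
import pandas as pd

iset = [2, 3, 4]

# start with just the input values
emp = pd.DataFrame({"C1": iset})

# column arithmetic is applied element-wise, so no Python loop is needed
emp["C2"] = emp["C1"] ** 2
print(emp)
```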
Upvotes: 1