Reputation: 1578
Here is a minimal example in which I end up storing numpy arrays inside the cells of a pandas.DataFrame.
TL;DR: see the screenshot of the DataFrame
This code finds the minimum of a certain function using scipy.optimize.brute, which returns the variable value at which the minimum is found, the minimum itself, and two numpy arrays: the grid points at which it evaluated the function and the function values at those points.
import itertools

import numpy as np
import pandas as pd
import scipy.optimize

# Note: brute calls func(params, *args), so the first parameter of `sin`
# receives the optimized variable and args=(r, x) fill the other two.
sin = lambda r, phi, x: r * np.sin(phi * x)

def func(r, x):
    x0, fval, grid, Jout = scipy.optimize.brute(
        sin, ranges=[(-np.pi, np.pi)], args=(r, x), Ns=10, full_output=True)
    return dict(phi_at_min=x0[0], result_min=fval, phis=grid, result_at_grid=Jout)

rs = np.linspace(-1, 1, 10)
xs = np.linspace(0, 1, 10)
vals = list(itertools.product(rs, xs))
result = [func(r, x) for r, x in vals]

# idk whether this is the best way of generating the DataFrame, but it works
df = pd.DataFrame(vals, columns=['r', 'x'])
df = pd.concat((pd.DataFrame(result), df), axis=1)
df.head()
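To make the shape of the problem concrete, here is a small sketch of what brute hands back for a single (r, x) pair. The `objective` name, the fixed values r = x = 1.0, and `finish=None` (which skips the polishing step that the code above leaves at its default) are assumptions for illustration only.

```python
import numpy as np
import scipy.optimize


def objective(params, r, x):
    # brute calls objective(params, *args); params is a 1-D array
    # holding the current value of the optimized variable (phi here).
    phi = np.ravel(params)[0]
    return r * np.sin(phi * x)


# finish=None is an assumption for this sketch: it returns the raw grid
# minimum instead of polishing it with a local optimizer.
x0, fval, grid, Jout = scipy.optimize.brute(
    objective, ranges=[(-np.pi, np.pi)], args=(1.0, 1.0), Ns=10,
    full_output=True, finish=None)

print("grid shape:", grid.shape)  # points at which the function was evaluated
print("Jout shape:", Jout.shape)  # function values at those points
print("phi at grid minimum:", x0, "value:", fval)
```

Both grid and Jout are arrays of length Ns, so every row of the DataFrame above ends up with a whole array in its phis and result_at_grid cells.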
I suspect that this is not how I am supposed to do this, and that I should maybe expand the lists somehow. How do I handle this in a correct, beautiful, and clean way?
Upvotes: 1
Views: 919
Reputation: 11905
So, even though "beautiful and clean" is subject to interpretation, I'll give you my take, which should in turn give you some ideas. I'm using a MultiIndex so that you can later easily select pairs of phi/result_at_grid for each point in the evaluation grid. I'm also using apply instead of creating two dataframes.
import itertools

import numpy as np
import pandas as pd
import scipy.optimize

# Note: brute calls func(params, *args), so the first parameter of `sin`
# receives the optimized variable and args=(r, x) fill the other two.
sin = lambda r, phi, x: r * np.sin(phi * x)

def func(row):
    """
    Accepts a row of a dataframe (a pd.Series).
    Called via df.apply(func, axis=1); returns a pd.Series
    with the initial (r, x) and the results.
    """
    r = row['r']
    x = row['x']
    x0, fval, grid, Jout = scipy.optimize.brute(
        sin, ranges=[(-np.pi, np.pi)], args=(r, x), Ns=10, full_output=True)

    # Create a multi index series for the phis
    phis = pd.Series(grid)
    phis.index = pd.MultiIndex.from_product([['Phis'], phis.index])

    # same for result at grid
    result_at_grid = pd.Series(Jout)
    result_at_grid.index = pd.MultiIndex.from_product([['result_at_grid'], result_at_grid.index])

    # concat
    s = pd.concat([phis, result_at_grid])

    # Add these two float results
    s['phi_at_min'] = x0[0]
    s['result_min'] = fval

    # add the initial r,x to reconstruct the index later
    s['r'] = r
    s['x'] = x
    return s

rs = np.linspace(-1, 1, 10)
xs = np.linspace(0, 1, 10)
vals = list(itertools.product(rs, xs))
df = pd.DataFrame(vals, columns=['r', 'x'])

# Apply func to each row (axis=1)
results = df.apply(func, axis=1)
results.set_index(['r','x'], inplace=True)
results.head().T # Transposing so we can see the output in one go...
Now you can, for example, select all values at evaluation grid point 2:
print(results.swaplevel(0,1, axis=1)[2].head()) # Showing only 5 first
                    Phis  result_at_grid
r    x
-1.0 0.000000  -1.745329        0.000000
     0.111111  -1.745329        0.193527
     0.222222  -1.745329        0.384667
     0.333333  -1.745329        0.571062
     0.444444  -1.745329        0.750415
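If swapping levels feels indirect, the same kind of selection can probably be written with DataFrame.xs. The sketch below uses a small made-up frame (`toy`, with invented values) that only mimics the column layout of results, so it runs without scipy; the scalar columns such as phi_at_min are left out.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `results`: two (r, x) rows and three grid points.
columns = pd.MultiIndex.from_product([["Phis", "result_at_grid"], [0, 1, 2]])
index = pd.MultiIndex.from_tuples([(-1.0, 0.0), (-1.0, 0.5)], names=["r", "x"])
toy = pd.DataFrame(np.arange(12.0).reshape(2, 6), index=index, columns=columns)

# Everything recorded at grid point 2, selected on the second column level.
at_point_2 = toy.xs(2, axis=1, level=1)

# All grid points of one quantity, selected on the first column level.
all_phis = toy["Phis"]

print(at_point_2)
print(all_phis)
```

Selecting on the first level gives one quantity across the whole grid, while xs on the second level gives every quantity at a single grid point, which is the phi/result_at_grid pairing the MultiIndex was set up for.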
Upvotes: 1