Reputation: 387
I have a dataframe with 6 columns (excluding the index). Two of them are inputs to a function that returns two outputs, and I'd like to add those outputs to the original dataframe as new columns.
I'm following toto_tico's answer here. I'm copying it for convenience (with slight modifications):
import pandas as pd
df = pd.DataFrame({"A": [10,20,30], "B": [20, 30, 10], "C": [10, 10, 10], "D": [1, 1, 1]})
def fab(row):
    return row['A'] * row['B'], row['A'] + row['B']
df['newcolumn'], df['newcolumn2'] = zip(*df.apply(fab, axis=1))
This code works without a problem. My code, however, doesn't. My dataframe has the following structure:
         Date  Station  Insolation  Daily Total  Temperature(avg)  Latitude
0  2011-01-01  Aksaray         1.7      72927.6         -0.025000   38.3705
1  2011-01-02  Aksaray         5.6     145874.7          2.541667   38.3705
2  2011-01-03  Aksaray         6.3     147197.8          6.666667   38.3705
3  2011-01-04  Aksaray         2.9     100350.9          5.312500   38.3705
4  2011-01-05  Aksaray         0.7      42138.7          4.639130   38.3705
The function I'm applying takes a row as input, and returns two values based on Latitude and Date. Here's that function:
import numpy as np

def h0(row):
    # Get a row from a dataframe, give back H0 and daylength
    # Leap year must be taken into account
    # row['Latitude'] and row['Date'] are relevant inputs
    # phi is taken in degrees, all angles are assumed to be degrees as well in formulas
    # numpy defaults to radians however...
    gsc = 1367
    phi = np.deg2rad(row['Latitude'])
    date = row['Date']
    year = pd.DatetimeIndex([date]).year[0]
    month = pd.DatetimeIndex([date]).month[0]
    day = pd.DatetimeIndex([date]).day[0]
    if year % 4 == 0:
        B = (day-1) * (360/366)
    else:
        B = (day-1) * (360/365)
    B = np.deg2rad(B)
    delta = (0.006918 - 0.399912*np.cos(B) + 0.070257*np.sin(B)
             - 0.006758*np.cos(2*B) + 0.000907*np.sin(2*B)
             - 0.002697*np.cos(3*B) + 0.00148*np.sin(3*B))
    ws = np.arccos(-np.tan(phi) * np.tan(delta))
    daylength = (2/15) * np.rad2deg(ws)
    if year % 4 == 0:
        dayangle = np.deg2rad(360*day/366)
    else:
        dayangle = np.deg2rad(360*day/365)
    h0 = (24*3600*gsc/np.pi) * (1 + 0.033*np.cos(dayangle)) * (np.cos(phi)*np.cos(delta)*np.sin(ws) +
                                                                ws*np.sin(phi)*np.sin(delta))
    return h0, daylength
When I use
ak['h0'], ak['N'] = zip(*ak.apply(h0, axis=1))
I get the error: Shape of passed values is (1816, 2), indices imply (1816, 6)
I'm unable to find what's wrong with my code. Can you help?
Upvotes: 0
Views: 1249
Reputation: 1873
As mentioned in my previous comment, if you'd like to create multiple NEW columns in the DataFrame based on multiple EXISTING columns, you can create new fields in the row Series WITHIN your h0 function.
Here's an overly simple example to showcase what I mean:
>>> def simple_func(row):
...     row['new_column1'] = row.lat * 1000
...     row['year'] = row.date.year
...     row['month'] = row.date.month
...     row['day'] = row.date.day
...     return row
...
>>> df
        date   lat
0 2018-01-29  1000
1 2018-01-30  5000
>>> df.date
0   2018-01-29
1   2018-01-30
Name: date, dtype: datetime64[ns]
>>> df.apply(simple_func, axis=1)
        date   lat  new_column1  year  month  day
0 2018-01-29  1000      1000000  2018      1   29
1 2018-01-30  5000      5000000  2018      1   30
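The session above does not show how df was built. Here is a self-contained version you can run as a script. The DataFrame construction is my assumption, chosen so that it matches the printed output; the function body is the same as above.

```python
import pandas as pd

# Assumed construction of the example frame (not shown in the session above)
df = pd.DataFrame({
    "date": pd.to_datetime(["2018-01-29", "2018-01-30"]),
    "lat": [1000, 5000],
})

def simple_func(row):
    # Assigning to new labels on the row Series produces new columns in the result
    row["new_column1"] = row["lat"] * 1000
    row["year"] = row["date"].year
    row["month"] = row["date"].month
    row["day"] = row["date"].day
    return row

# apply returns a new DataFrame; df itself is left unchanged
result = df.apply(simple_func, axis=1)
print(result)
```

Note that the new columns live in the returned frame, so you need to assign the result back if you want to keep them.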
In your case, inside your h0 function, set row['h0'] = h0 and row['N'] = daylength, then return row. Then, when you call the function on the DataFrame, your line changes to ak = ak.apply(h0, axis=1)
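If you would rather keep the function returning a tuple, one possible alternative is the result_type="expand" option of DataFrame.apply, assuming pandas 0.23 or later. This is only a sketch: the two-row frame stands in for ak, and two_outputs with its arithmetic is a hypothetical placeholder, not your H0 and day-length formulas.

```python
import pandas as pd

# Hypothetical stand-in for `ak`, with a datetime column and a numeric column
df = pd.DataFrame({
    "Date": pd.to_datetime(["2011-01-01", "2011-01-02"]),
    "Station": ["Aksaray", "Aksaray"],
    "Latitude": [38.3705, 38.3705],
})

def two_outputs(row):
    # Placeholder calculations; swap in the real formulas here
    return row["Latitude"] * 2, row["Date"].dayofyear

# result_type="expand" turns each returned tuple into one row of a two-column frame,
# which is then assigned to the two new column names by position
df[["h0", "N"]] = df.apply(two_outputs, axis=1, result_type="expand")
print(df)
```

With this form the function does not have to modify the row, and the original columns are left as they were.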
Upvotes: 1