Reputation: 43595
I am trying to get a 1
for the row 2020-01
, only when the row crosses the column "Jan". And the same for every month:
Feb
and row 2019-02
should also result in 1
and the rest of row 2019-02
should be 0
.Mar
and row 2019-03
should also result in 1
and the rest of row 2019-03
should be 0
.Thus, all together, once per row, there should be a single 1
and a few 0
, based on the months. This is what I tried and the result is the screenshot, without the blue edit.
columns = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov"]
for i in range(len(columns)):
df[columns[i]] = df.TIME.astype(str).str[5] + df.TIME.astype(str).str[6]
df
The blue edit is the target.
I have tried a ternary operators like this:
for i in range(len(columns)):
df[columns[i]] = 1 if (df.TIME.astype(str).str[5] + df.TIME.astype(str).str[6] == "01") else 0
and the error is:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Upvotes: 1
Views: 269
Reputation: 18647
You can use datetime.dt.stftime
with %b
formatter, get_dummies
, reindex
and join
back onto original DataFrame:
# Example setup
columns = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov"]
df = pd.DataFrame({'TIME': ['2020-01', '2019-12', '2019-11', '2019-10', '2019-09']})
df.join(pd.to_datetime(df['TIME']).dt.strftime('%b')
.str.get_dummies()
.reindex(columns=columns, fill_value=0))
[out]
TIME Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov
0 2020-01 1 0 0 0 0 0 0 0 0 0 0
1 2019-12 0 0 0 0 0 0 0 0 0 0 0
2 2019-11 0 0 0 0 0 0 0 0 0 0 1
3 2019-10 0 0 0 0 0 0 0 0 0 1 0
4 2019-09 0 0 0 0 0 0 0 0 1 0 0
I've only added this since you requested it specifically... Here is an example of how you could loop through a dataframe and columns to update values - again I'll just reiterate that this is not what I'd recommend personally and is very inefficient comapred with the above:
import datetime as dt
columns = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov"]
df = pd.DataFrame({'TIME': ['2020-01', '2019-12', '2019-11', '2019-10', '2019-09']})
for c in columns:
for i, t in df['TIME'].iteritems():
if dt.datetime.strptime(t, '%Y-%m').strftime('%b') == c:
df.loc[i, c] = 1
else:
df.loc[i, c] = 0
Upvotes: 2