Adding values to pandas dataframe with function based on other column in dataframe

Question

This sounds similar to a lot of SO questions but I haven't actually found it; if it's here, please feel free to link and I'll delete.

I have two dataframes. The first looks like this:

owned   category                            weight  mechanics_split
28156   Environmental, Medical              2.8023  [Action Point Allowance System, Co-operative P...
9269    Card Game, Civilization, Economic   4.3073  [Action Point Allowance System, Auction/Biddin...
36707   Modern Warfare, Political, Wargame  3.5293  [Area Control / Area Influence, Campaign / Bat...

The second looks like this:

    type                            amount  owned
0   Action Point Allowance System   378     0
1   Co-operative Play               302     0
2   Hand Management                 1308    0
3   Point to Point Movement         278     0
4   Set Collection                  708     0
5   Trading                         142     0

What I'm trying to do is iterate over each word in mechanics_split so that the owned value in the first dataframe is added to the owned column in the second dataframe. For example, if Dice Rolling is in the first row of games in the mechanics_split column, the owned amount for that whole row is added to games_owned['owned'], and so on, for each value in the list in mechanics_split through the whole dataframe.

So far, I've tried:

owned_dict = {}
def total_owned(x):
    for e in x:
        if e not in owned_dict:
            owned_dict[e] = 0
        if e in owned_dict:
            owned_dict[e] += games['owned'][x]
    return owned_dict

which returned:

KeyError: "None of [['Action Point Allowance System', 'Co-operative Play', 'Hand Management', 'Point to Point Movement', 'Set Collection', 'Trading', 'Variable Player Powers']] are in the [index]"

If I add another letter before e, I'm told there are too many values to unpack. I also tried skipping the dictionary and just using otherdf['owned'][e] += games['owned'][x] to no avail.

I may be fundamentally misunderstanding something about how indexes work in pandas and how to index a value to a row, so if I am, please let me know. Thanks very much for any help.

EDIT: I've solved part of the problem by changing the index of the second dataframe to the 'types' column with `otherdf.index = otherdf.types', but I'm still left with the problem of transferring the owned values from the first dataframe.

user7037221 · Accepted Answer

I agree with you that using the 'type' column as a label-based index will make things easier. With this done, you can iterate over the rows of the first dataframe, then add owned value to the appropriate row in the second dataframe using the .loc method.

for row_1 in df_1.itterrows():
  owned_value = row_1[1]['owned'] #iterrows() enumeration generator over rows      
  mechanics =  row_1[1]['mechanics_split']
  for type_string in mechanics:
    df_2.loc[type_string,('owned')] += owned_value

In addition, I suggest reading on how Pandas handles indexing to help avoid any 'gotchas' as you continue to work with Python.

Adding values to pandas dataframe with function based on other column in dataframe

Answers (1)

Related Questions