Extract data from column to new column after groupby

Question

I am trying to extract and split a column of data into two new columns. This is reasonably simple using .str[xx:yy]

However, I am getting an error when I am trying to complete the same after having run a .groupby(["xxx", "yyy", "zzz"])[["aaa"]].count() I get told that KeyError: 'Column name'

As an example, after running the .groupby command, I get the following table:

DevReg_df = DevReg_df.groupby(["Dev Loc", "Dev Model", "Dev OS"])[["Number"]].count()

+---------+-----------+--------+--------+----------+
| Dev Loc | Dev Model | Dev OS | Number |          |
+---------+-----------+--------+--------+----------+
| ar_ar   | Sam 22    | A 5    |      2 |          |
|         | Sam 4     | A 6    |      5 |          |
| az_ae   | i 1       | i 2    |      6 |          |
|         | i 34      | i 9    |     98 |          |
+---------+-----------+--------+--------+----------+

Running the following code afterwards gets me:

DevReg_df["Language"] = DevReg_df["Dev Loc"].str[:2]

KeyError: 'Dev Loc'

The outcome of the table I am expecting is:

+---------+-----------+--------+--------+----------+
| Dev Loc | Dev Model | Dev OS | Number | Language |
+---------+-----------+--------+--------+----------+
| ar_ar   | Sam 22    | A 5    |      2 | ar       |
|         | Sam 4     | A 6    |      5 | ar       |
| az_ae   | i 1       | i 2    |      6 | az       |
|         | i 34      | i 9    |     98 | az       |
+---------+-----------+--------+--------+----------+

Edit: Reason that I don't want to do the commands in the reverse order (before the groupby) is that there are more than 30m rows which the groupby reduces to less than 100k. Doing the other way causes large delays in completing the script.

Extract data from column to new column after groupby

Answers (1)

Related Questions