Reputation: 55
I have a DataFrame with the following structure, and I want to multiply the string column by the integer column, element by element.
+----------------------+------------+-------------------------+-----------+
| url                  | date       | word                    | mentioned |
|----------------------+------------+-------------------------+-----------|
| newspaperarticle.com | 2018-12-22 | [canada,house,micheal]  | [2,2,1]   |
| articleUSA.com       | 2018-12-23 | [new york,murder,angry] | [2,3,1]   |
+----------------------+------------+-------------------------+-----------+
I want each word repeated as many times as its matching number in mentioned, with the result stored in the word column:
+----------------------+------------+-------------------------------+-----------+
| url                  | date       | word                          | mentioned |
|----------------------+------------+-------------------------------+-----------|
| newspaperarticle.com | 2018-12-22 | [canada,canada,house,..]      | [2,2,1]   |
| articleUSA.com       | 2018-12-23 | [new york,new york,murder,..] | [2,3,1]   |
+----------------------+------------+-------------------------------+-----------+
What I did so far was multiply the columns with the multiply method, which didn't work. I also tried for loops, indexing the single elements and multiplying them, but I always got an error saying the string index was out of range.
Upvotes: 2
Views: 780
Reputation: 75080
You can explode the columns, use Series.repeat, then aggregate as a list on level=0:
s = [df[i].explode() for i in ['word','mentioned']]
df['word'] = s[0].repeat(s[1]).groupby(level=0).agg(list)
print(df)
url date \
0 newspaperarticle.com 2018-12-22
1 articleUSA.com 2018-12-23
word mentioned
0 [canada, canada, house, house, micheal] [2, 2, 1]
1 [new york, new york, murder, murder, murder, a... [2, 3, 1]
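For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same explode / repeat / groupby idea. The DataFrame is rebuilt from the sample in the question, and the explicit astype(int) is my own addition (not part of the snippet above) so that the repeat counts are real integers rather than object dtype. It assumes pandas 0.25 or newer (for explode) and that every row has non-empty lists of equal length in both columns.

```python
import pandas as pd

# Sample data rebuilt from the question; both columns hold real Python lists.
df = pd.DataFrame({
    "url": ["newspaperarticle.com", "articleUSA.com"],
    "date": ["2018-12-22", "2018-12-23"],
    "word": [["canada", "house", "micheal"], ["new york", "murder", "angry"]],
    "mentioned": [[2, 2, 1], [2, 3, 1]],
})

# One row per list element; the original row label is kept in the index.
words = df["word"].explode()
# Assumption: cast to int so the counts are a proper integer Series.
counts = df["mentioned"].explode().astype(int)

# Repeat each word by its count, then collect back into one list per original row.
repeated = words.repeat(counts).groupby(level=0).agg(list)
df["word"] = repeated

print(df["word"].tolist())
```

Because the grouping is done on the index level, this relies on the DataFrame having a unique index; calling reset_index(drop=True) first is one way to guarantee that.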
Note: This assumes that the word and mentioned columns are Series of lists and not string representations of lists.
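If the cells do turn out to be strings, one possible way to convert them is ast.literal_eval from the standard library. This is only a sketch: parse_list_cell is a hypothetical helper name, and it assumes the strings use valid Python literal syntax with quoted words, such as "['canada', 'house', 'micheal']". Unquoted text like [canada,house,micheal] would not parse this way and would need custom splitting.

```python
import ast

def parse_list_cell(cell):
    """Return the cell as a list, parsing it first if it is a string such as "[2, 2, 1]"."""
    if isinstance(cell, str):
        # literal_eval only evaluates Python literals, so it is safer than eval.
        return ast.literal_eval(cell)
    # Already a list (or some other non-string object): leave it unchanged.
    return cell

print(parse_list_cell("['canada', 'house', 'micheal']"))
print(parse_list_cell("[2, 2, 1]"))
```

On a DataFrame, the helper could be applied with df[col] = df[col].apply(parse_list_cell) for col in ['word', 'mentioned'] before exploding.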
Upvotes: 3