Reputation: 2150
I need to fill a data frame gradually. At each step, I have data like this:
pubid = 1
keywords = [2, 2,3]
Given that the two columns do not have the same number of values, how can I form a data frame like this:
pubid keyword
1 2
1 2
1 3
Then, when new data arrives that looks like this:
pubid = 6
keywords = [10, 11]
my data frame becomes:
pubid keyword
1 2
1 2
1 3
6 10
6 11
I tried to create a temp dataframe at the beginning and add the values like this:
data = {'pubid': 1, 'keywords':[1]}
df = pd.DataFrame(data)
pubid = 3
keyword = [2, 3]
df['pubid'] = 3
df["keywords"] = df["pubid"].apply(lambda x: i for i in keyword)
It does not work this way, and I don't know how to solve it.
Upvotes: 1
Views: 31
Reputation: 11209
Create a new DataFrame first:
import pandas as pd

pubid = 6
keywords = [10, 11]
# Repeat the pubid value (not the string 'pubid') once per keyword
dr = pd.DataFrame({'pubid': [pubid for _ in keywords], 'keyword': keywords})
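Then append it to the original DataFrame (for example with pd.concat; DataFrame.append was removed in pandas 2.0).
Each append copies the existing data, so depending on how much data you have, you may or may not experience sluggish performance.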
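If performance does become a problem, one common workaround is to collect the small frames in a list and concatenate them once at the end. A minimal sketch, assuming the incoming data can be iterated as (pubid, keywords) pairs; the names `batches` and `frames` are made up for illustration:

```python
import pandas as pd

# Hypothetical incoming data: one (pubid, keywords) pair per step.
batches = [(1, [2, 2, 3]), (6, [10, 11])]

frames = []
for pubid, keywords in batches:
    # Repeat the scalar pubid once per keyword so both columns match in length.
    frames.append(pd.DataFrame({'pubid': [pubid] * len(keywords),
                                'keyword': keywords}))

# Concatenate once at the end; ignore_index=True renumbers the rows from 0.
df = pd.concat(frames, ignore_index=True)
print(df)
```

This way the accumulated data is copied only once, in the final concat, rather than at every step.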
Upvotes: 1
Reputation: 195528
import pandas as pd

pubid = 1
keywords = [2, 2,3]
df = pd.DataFrame({'pubid': pubid, 'keywords': keywords})
print(df)
Prints:
   pubid  keywords
0      1         2
1      1         2
2      1         3
Then you can use pd.concat to add data to the existing DataFrame:
pubid = 6
keywords = [10, 11]
df = pd.concat([df, pd.DataFrame({'pubid': pubid, 'keywords': keywords})]).reset_index(drop=True)
print(df)
Prints:
   pubid  keywords
0      1         2
1      1         2
2      1         3
3      6        10
4      6        11
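If all the (pubid, keywords) pairs can be gathered before building the frame, a different technique, DataFrame.explode, might also fit. This is only a sketch under that assumption, not part of the approach above; `records` and `wide` are hypothetical names:

```python
import pandas as pd

# Hypothetical collected input: one (pubid, keywords) pair per step.
records = [(1, [2, 2, 3]), (6, [10, 11])]

# One row per pubid, with the whole keyword list stored in a single cell.
wide = pd.DataFrame(records, columns=['pubid', 'keywords'])

# explode() gives each list element its own row and repeats pubid alongside it.
df = wide.explode('keywords', ignore_index=True)

# explode() leaves the column with object dtype; cast it back to integers.
df['keywords'] = df['keywords'].astype('int64')
print(df)
```

Note that the ignore_index argument of explode needs pandas 1.1 or newer.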
Upvotes: 1