user1896796
user1896796

Reputation: 749

Adding a new row to a dataframe with correct mapping in pandas

I have a dataframe something like below-

      carrier_plan_identifier           ...            hios_issuer_identifier
1                        AUSK           ...                           99806.0
2                        AUSM           ...                           99806.0
3                        AUSN           ...                           99806.0
4                        AUSS           ...                           99806.0
5                        AUST           ...                           99806.0

I need to pick a particular column ,lets say wellthie_issuer_identifier.

I need to query the database based on this column value. My select query will look something like .

select id, wellthie_issuer_identifier from issuers where wellthie_issuer_identifier in(....)

I need to add id column back to my existing dataframe with respect to the wellthie_issuer_identifier.

I have searched a lot but not clear with how this can be done.

Upvotes: 0

Views: 719

Answers (3)

Mayank Porwal
Mayank Porwal

Reputation: 34086

Try this:

1.) pick a particular column ,lets say wellthie_issuer_identifier

t = tuple(df.wellthie_issuer_identifier) This will give you a tuple like (1,0,1,1)

2.) query the database based on this column value

You need to substitute the above tuple in your query:

query = """select id, wellthie_issuer_identifier from issuers 
where wellthie_issuer_identifier in{} """

Create a Cursor to the database and execute this query and Create a Dataframe of the result.

cur.execute(query.format(t))
df_new = pd.DataFrame(cur.fetchall())
df_new.columns = ['id','wellthie_issuer_identifier']

Now your df_new will have columns id, wellthie_issuer_identifier. You need to add this id column back to existing df.

Do this: df = pd.merge(df,df_new, on='wellthie_issuer_identifier',how='left')

It will add an id column to df which will have values if a match is found on wellthie_issuer_identifier, otherwise it will put NaN.

Let me know if this helps.

Upvotes: 1

DevTotti
DevTotti

Reputation: 86

You can add another column to a dataframe using pandas if the column is not too long, For example:

import pandas as pd

df = pd.read_csv('just.csv')
df

   id  user_id  name
0   1        1  tolu
1   2        5    jb
2   3        6   jbu
3   4        7   jab
4   5        9   jbb

#to add new column to the data above

df['new_column']=['jdb','biwe','iuwfb','ibeu','igu']#new values
df

   id  user_id  name new_column
0   1        1  tolu        jdb
1   2        5    jb       biwe
2   3        6   jbu      iuwfb
3   4        7   jab       ibeu
4   5        9   jbb        igu

#this should help if the dataset is not too much 

then you can go on querying your database

Upvotes: 1

Rahul Agarwal
Rahul Agarwal

Reputation: 4100

This will not take values for wellthie_issuer_identifier but as you told it will be all the values that are their, then below should work for you:

df1 = df.assign(id=(df['wellthie_issuer_identifier']).astype('category').cat.codes)

Upvotes: 0

Related Questions