ilanos
ilanos

Reputation: 193

write insert query to insert dynamic number of dataframe columns to cassandra table using python

I have a cassandra table with 30 columns and a dataframe which will contain any number of columns as selected by the user, or as uploaded by user as a csv file.

for ex. my cassandra table has 30 channel names like channel_1, channel_2 .. channel_30 and few extra columns like product, region etc. and my dataframe has 5 channel names like channel_1, channel_2..channel_5 and 2 extra columns : date and model name how do i write an insert query which iterates through my dataframe columns and creates the query as follows:

data = pd.DataFrame()
...
for _, row in data.iterrows():

            session.execute("""INSERT INTO cass_table (product, region , model, date, (channel names in dataframe ex.channel_1, \
                    channel_2, channel_3, channel_4, channel_5) \
                    VALUES (%s, %s, %s, %s, <no of channels: %s, %s, %s, %s, %s>)""" ,\
                    (product, region, row['model_name'], row['date'], <other remaining columns like row["channel_1"], row["channel_2"],\
                    row["channel_3"], row["channel_4"], row["channel_5"]>))

I tried this:

channel_list = ['channel_1', 'channel_2', 'channel_3', 'channel_4', 'channel_5']
session.execute("""INSERT INTO cass_table(product, region , model, date, \
                    """ + str(' ,'.join('channel_{}_cont'.format(i) for i,c in enumerate(channel_list, 1))) + """) ,\
                    VALUES (%s, %s, %s, %s"""+ str(''.join(', %s'*no_of_channels)), """)""" \
                    (product, region, row['model_name'], row[date_col] , ','.join(row["'"+channel_list+"'"])))

but this gives error: "str object is not callable"

How do I proceed?

Upvotes: 0

Views: 283

Answers (1)

0x26res
0x26res

Reputation: 13902

In practice that should work but the way you are writing your query has got a few bugs.

  • You're missing a , between the query pattern and the values
  • When generating the tuple of values you are inserting, you want to use the * operator. It will expend each element of the list as an argument to the function. see this question
query = (
    "INSERT INTO cass_table (product, region , model, date, "
    + ', '.join(channel_list) + ') '
    + " VALUES (%s, %s, %s, %s, " + ', '.join(['%s'] * len(channel_list)) + ")"
)

values = (product, region, row['model_name'], row[date_col] , *[row[c] for c in channel_list])
session.execute(query, values)

Upvotes: 1

Related Questions