Shuvayan Das

Reputation: 1048

How to combine two lists into a dictionary with manually assigned values for the keys

I have a function which takes in a dataframe, does some transformations, and returns the categorical and numeric column names as two lists.

cat_cols, num_cols = Data_Type_And_Transformation(df_data_sample, 'MEAN')

cat_cols = [
    'var1_m2_Transform',
    'var2_m2_Transform',
    'var2_m3_Transform',
    'var3_m3_Transform',
    'var5_m3_Transform',
    'var8_m3_Transform',
    'var9_m3_Transform',
]

num_cols = [
    'ttl_change_3m',
    'ttl_change_6m',
    'base_rev_3m',
    'csc_ttl_6m',
]

I am then trying to create a dictionary whose keys are the column names and whose values are the data type, NUM or CAT, as below:

from collections import OrderedDict
import pandas as pd

attribute_df_cat = pd.DataFrame()
attribute_df_num = pd.DataFrame()

attribute_df_cat['Attribute'] = cat_cols
attribute_df_cat['Type'] = 'CAT'

attribute_df_num['Attribute'] = num_cols
attribute_df_num['Type'] = 'NUM'

# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
attribute_df = pd.concat([attribute_df_cat, attribute_df_num])
attribute_df.set_index('Attribute', inplace=True)

# orient='index' maps each key to a {column: value} dict
attribute_dict = OrderedDict(attribute_df.to_dict('index'))

But this gives me a dict of the form:

Key                 Type    Size    Value
ttl_change_3m       dict    1       {'Type': 'NUM'}
ttl_change_6m       dict    1       {'Type': 'NUM'}
base_rev_3m         dict    1       {'Type': 'NUM'}
csc_ttl_6m          dict    1       {'Type': 'NUM'}
var1_m2_Transform   dict    1       {'Type': 'CAT'}
var2_m2_Transform   dict    1       {'Type': 'CAT'}
var2_m3_Transform   dict    1       {'Type': 'CAT'}
var3_m3_Transform   dict    1       {'Type': 'CAT'}
var5_m3_Transform   dict    1       {'Type': 'CAT'}
var9_m3_Transform   dict    1       {'Type': 'CAT'}
var8_m3_Transform   dict    1       {'Type': 'CAT'}

Whereas I want it in the below format:

Key                 Type    Size    Value
ttl_change_3m       str     1       NUM
ttl_change_6m       str     1       NUM
base_rev_3m         str     1       NUM
csc_ttl_6m          str     1       NUM
var1_m2_Transform   str     1       CAT
var2_m2_Transform   str     1       CAT
var2_m3_Transform   str     1       CAT
var3_m3_Transform   str     1       CAT
var5_m3_Transform   str     1       CAT
var9_m3_Transform   str     1       CAT
var8_m3_Transform   str     1       CAT

Also, I think I am taking too many steps to get to the result, and there might be a shorter or more efficient way to do this.

Can someone please help me with this?

Upvotes: 3

Views: 62

Answers (1)

Pyd

Reputation: 6159

I think you need np.where to label each column name:

import numpy as np
import pandas as pd

# One row per column name, numeric names first
df = pd.DataFrame({'Key': pd.Series(num_cols + cat_cols)})
# Label a name 'CAT' if it appears in cat_cols, otherwise 'NUM'
df['Value'] = np.where(df['Key'].isin(cat_cols), 'CAT', 'NUM')
# print(df)
#   Key                 Value
#   ttl_change_3m       NUM
#   ttl_change_6m       NUM
#   base_rev_3m         NUM
#   csc_ttl_6m          NUM
#   var1_m2_Transform   CAT
#   var2_m2_Transform   CAT
#   var2_m3_Transform   CAT
#   var3_m3_Transform   CAT
#   var5_m3_Transform   CAT
#   var8_m3_Transform   CAT
#   var9_m3_Transform   CAT
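
Since the question asks for a flat dictionary rather than a DataFrame, here is a minimal sketch that skips pandas and builds the mapping straight from the two lists. It assumes the sample lists shown in the question and reuses the question's attribute_dict name. Putting the numeric columns first is my own choice, to mirror the grouping in the desired output.

```python
from collections import OrderedDict

# Sample lists copied from the question.
cat_cols = ['var1_m2_Transform', 'var2_m2_Transform', 'var2_m3_Transform',
            'var3_m3_Transform', 'var5_m3_Transform', 'var8_m3_Transform',
            'var9_m3_Transform']
num_cols = ['ttl_change_3m', 'ttl_change_6m', 'base_rev_3m', 'csc_ttl_6m']

# Pair every column name with its type label: numeric names first,
# then categorical names. Each value is a plain string, not a nested dict.
attribute_dict = OrderedDict(
    [(col, 'NUM') for col in num_cols] + [(col, 'CAT') for col in cat_cols]
)

print(attribute_dict['ttl_change_3m'])      # NUM
print(attribute_dict['var1_m2_Transform'])  # CAT
```

If you already have the DataFrame from the np.where approach, df.set_index('Key')['Value'].to_dict() should give the same flat mapping, because calling to_dict on a single column (a Series) maps each index label directly to its value.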

Upvotes: 1
