Calculate correlation on dict type variables

Question

I have a dataframe named hyperparam_df which looks like the following:


      repo_name                                      file_name  \
0     DeepCoMP                    deepcomp/util/simulation.py   
1     DeepCoMP                     deepcomp/util/constants.py   
2     DeepCoMP                     deepcomp/util/env_setup.py   
3     DeepCoMP                           deepcomp/util/cli.py   
4          cm3                                alg/networks.py   

7154      flow             flow/envs/multiagent/ring/accel.py   
7155      flow  flow/envs/multiagent/ring/wave_attenuation.py   
7156      flow                        flow/envs/ring/accel.py   
7157      flow            flow/envs/ring/lane_change_accel.py   
7158      flow             flow/envs/ring/wave_attenuation.py   

                   hyperparam_name  
0     {'agent_str': 'multi-sep-nns', 'row': 2, 'trai...  
1     {'LOG_ROUND_DIGITS': 3, 'EPSILON': 1e-16, 'FAI...  
2                        {'id': 1, 'log_metrics': True}  
3     {'agent': 'central', 'alg': 'ppo', 'workers': ...  
4                                    {'embed_dim': 128}  

7154  {'max_accel': 1, 'max_decel': 1, 'target_veloc...  
7155  {'max_accel': 1, 'max_decel': 1, 'max_speed': ...  
7156  {'max_accel': 3, 'max_decel': 3, 'target_veloc...  
7157  {'max_accel': 3, 'max_decel': 3, 'lane_change_...  
7158  {'max_accel': 1, 'max_decel': 1, 'v0': 30, 's0...

My data consists of multiple repositories (repo_name). Each repository has multiple files (file_name), and hyperparam_name holds the hyperparameter configuration for each file.

I am trying to find the correlation between the hyperparameters, but I am unsure how to break down the hyperparam_name column into separate columns. I would also need to convert the categorical hyperparameters into numerical ones. I haven't dealt with such a scenario before, so I'm not sure how to go about this. Any suggestions or ideas on how I could do this would be appreciated!
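
For illustration, here is a minimal sketch of the three steps described above: expanding the dicts into columns, encoding the categorical values, and computing the correlation. It is one possible approach rather than a definitive one. The small frame below is a hypothetical stand-in for hyperparam_df, built from a few of the values visible in the output above. One-hot encoding via pd.get_dummies is an assumed choice, and the variable names expanded, encoded and corr are made up for the example.

```python
import pandas as pd

# Hypothetical stand-in for hyperparam_df (a few rows only, for illustration).
hyperparam_df = pd.DataFrame({
    "repo_name": ["DeepCoMP", "cm3", "flow", "flow"],
    "file_name": [
        "deepcomp/util/cli.py",
        "alg/networks.py",
        "flow/envs/ring/accel.py",
        "flow/envs/ring/wave_attenuation.py",
    ],
    "hyperparam_name": [
        {"agent": "central", "alg": "ppo"},
        {"embed_dim": 128},
        {"max_accel": 3, "max_decel": 3},
        {"max_accel": 1, "max_decel": 1},
    ],
})

# Step 1: one column per dict key; keys missing from a row become NaN.
expanded = pd.DataFrame(
    hyperparam_df["hyperparam_name"].tolist(),
    index=hyperparam_df.index,
)

# Step 2: keep numeric columns as they are and one-hot encode the rest.
# Assumption: one-hot encoding is acceptable here; note that rows where a
# categorical key is missing get 0 in every dummy column for that key.
numeric = expanded.select_dtypes(include="number")
categorical = expanded.select_dtypes(exclude="number")
encoded = pd.concat(
    [numeric, pd.get_dummies(categorical, dtype=float)],
    axis=1,
)

# Step 3: pairwise Pearson correlation; NaN pairs are skipped per column pair.
corr = encoded.corr()
print(corr)
```

Since most files define different hyperparameters, the expanded frame will be sparse, and many entries of the correlation matrix will be NaN or based on very few rows. It may make sense to restrict the analysis to keys that appear in enough files before interpreting the result.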

