ajay1720

Reputation: 11

Improving prediction accuracy in Bayesian Causal Network

I would like to determine the causes of an unexpected outcome (or anomaly) in a thermodynamic process. I have continuous data for the associated variables and am trying to use a Bayesian Network (BN) to determine the causal relationships. For this purpose, I used a Python library called CausalNex.

I have followed the tutorial section of this library to build the DAG and the BN model, and everything works fine up to the prediction step. The prediction accuracy for the minority/less-frequent classes is around 60-70% (80-90% with SMOTE/SMOTETomek and a particular random state), whereas a stable accuracy of more than 90% is expected. I have implemented the following data-preprocessing steps (a short sketch follows the list):

  1. Ensuring no missing/NaN values
  2. Discretization (the only data format the library supports)
  3. SMOTE/SMOTETomek for data balancing
  4. Various train/test size combinations
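
For reference, a minimal sketch of steps 1, 3 and 4; the DataFrame df, the target column 'anomaly', and the split sizes are placeholders:

    from sklearn.model_selection import train_test_split
    from imblearn.over_sampling import SMOTE
    from imblearn.combine import SMOTETomek

    # df is the discretised dataset; "anomaly" is a placeholder target column
    X = df.drop(columns=["anomaly"])
    y = df["anomaly"]

    # Step 1: ensure no missing/NaN values remain
    assert not df.isna().any().any(), "dataset still contains NaNs"

    # Step 3: balance the classes with SMOTE (or SMOTETomek)
    X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)
    # X_res, y_res = SMOTETomek(random_state=42).fit_resample(X, y)

    # Step 4: vary the train/test split sizes
    X_train, X_test, y_train, y_test = train_test_split(
        X_res, y_res, test_size=0.2, stratify=y_res, random_state=42
    )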

I am struggling to figure out ways to optimize the model, and I could not find any supporting material on the Internet for this.

Are there any guidelines or 'best practices' for data pre-processing techniques and dataset requirements that work particularly well for this library/BN models? Could you please suggest troubleshooting methods to identify the causes of the low accuracy/metrics? Perhaps a misspecified node-to-node causal relationship in the DAG causes the mediocre accuracy?

Any ideas/literature/other suitable library regarding this would be of great help!

Upvotes: 1

Views: 453

Answers (1)

Gabriel Azevedo

Reputation: 21

A few tips that can help:

  1. Changing/tuning the structure learning.
  • Try different thresholds. When calling from_pandas, you can experiment with different w_threshold values (and with the beta term if you are using from_pandas_lasso).

    This changes the density of the network. A denser structure means a BN with more parameters, and with more parameters your model may perform better. If the graph is too dense, though, you may not have enough data to train it and may overfit.

  • Center the data. Empirically, it seems that NOTEARS (the algorithm behind from_pandas) works best if the data is centered, so subtracting the mean of the data before running structure learning may be a good idea.

  • Ensure causality. NOTEARS does not ensure causality, so you need "experts" to judge the output and make the necessary modifications. If you see edges that don't make causal sense, you can either remove them or add them as tabu_edges and train your network again. A sketch of these adjustments follows.
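
A minimal sketch of these adjustments, assuming a pandas DataFrame df of training data; the column names and threshold values are placeholders for your thermodynamic variables:

    from causalnex.structure.notears import from_pandas

    # Centre the data first; empirically NOTEARS behaves better on
    # centred data
    df_centred = df - df.mean()

    # Prune weak edges with w_threshold and forbid edges that contradict
    # domain knowledge via tabu_edges (both settings are placeholders)
    sm = from_pandas(
        df_centred,
        w_threshold=0.8,
        tabu_edges=[("outlet_temp", "inlet_temp")],  # hypothetical forbidden edge
    )

    # Expert corrections on the learned structure
    sm.remove_edge("pressure", "ambient_temp")  # hypothetical non-causal edge
    sm.add_edge("inlet_temp", "outlet_temp")    # hypothetical known mechanism
    sm = sm.get_largest_subgraph()

After such corrections, refit the BayesianNetwork (node states and CPDs) on the modified structure before predicting again.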

  2. Experiment with discretisation. The performance can be very sensitive to how you discretise the data, and experimenting with various types of discretisation can help (a short sketch follows these options). You can use:
  • Methods available in Causalnex (uniform, for example)
  • fixed discretisations based on what thresholds make sense for your data
  • MDLP is a supervised way to discretise data. You can apply MDLP to each node, using one of its children as the "target". There are two main packages for MDLP on PyPI: mdlp and mdlp-discretization
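
A short sketch of these options; the column names, split points, and the mdlp-discretization usage are assumptions on my side:

    from causalnex.discretiser import Discretiser

    # Uniform-width buckets via CausalNex
    df["inlet_temp"] = Discretiser(
        method="uniform", num_buckets=5
    ).transform(df["inlet_temp"].values)

    # Fixed split points chosen from domain knowledge
    df["pressure"] = Discretiser(
        method="fixed", numeric_split_points=[1.0, 2.5]
    ).transform(df["pressure"].values)

    # Supervised discretisation with the mdlp-discretization package,
    # using a child node (here the anomaly label) as the "target"
    from mdlp.discretization import MDLP
    y = df["anomaly"].astype("category").cat.codes  # MDLP needs integer labels
    df["flow_rate"] = MDLP().fit_transform(
        df[["flow_rate"]].values, y.values
    ).ravel()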

Upvotes: 2
