Shamsul Masum

Reputation: 357

Generating synthetic data out of real data (For Regression Problem)

My dataset looks like the info provided in the picture. This is a regression problem where I have to predict 'LOS' (the last column). The dataset has around 2000 samples (rows). I would like to create more rows (synthetic data) from the real data to improve my model's results.

I found this quite easy for classification tasks, but I am having difficulty with the regression case.

Any help in a Python environment would be much appreciated.

Thanks in advance

Upvotes: 0

Views: 1974

Answers (1)

Parthasarathy Subburaj

Reputation: 4264

You could use SMOGN

From the documentation:

A Python implementation of Synthetic Minority Over-Sampling Technique for Regression with Gaussian Noise (SMOGN). Conducts the Synthetic Minority Over-Sampling Technique for Regression (SMOTER) with traditional interpolation, as well as with the introduction of Gaussian Noise (SMOTER-GN).

But take a look here before implementing it.
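To make the idea concrete, here is a minimal sketch of the interpolation step that SMOTER-style methods rely on, written with the standard library only. It is not the SMOGN package's code or API. The function name `smoter_sketch`, its parameters, and the toy numbers are all made up for illustration. It also simplifies things: seed rows are picked uniformly at random, whereas SMOGN focuses on rare target values, and the optional Gaussian noise is added to the features only.

```python
import math
import random


def smoter_sketch(rows, targets, n_new, k=5, noise_scale=0.0, seed=0):
    """Create n_new synthetic (features, target) pairs by interpolation.

    rows:    list of equal-length lists of numeric features
    targets: list of numeric target values (e.g. the 'LOS' column)
    Hypothetical helper for illustration, not part of any library.
    """
    rng = random.Random(seed)
    new_rows, new_targets = [], []
    for _ in range(n_new):
        # Pick a random seed sample (uniformly; SMOGN would favour rare targets).
        i = rng.randrange(len(rows))
        # Rank every other sample by Euclidean distance to the seed sample.
        others = sorted(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: math.dist(rows[i], rows[j]),
        )
        # Choose one of the k nearest neighbours.
        j = rng.choice(others[:k])
        # Interpolate at a random position t on the segment between the two.
        t = rng.random()
        new_rows.append([
            a + t * (b - a) + rng.gauss(0.0, noise_scale)
            for a, b in zip(rows[i], rows[j])
        ])
        # The target is interpolated with the same weight as the features.
        new_targets.append(targets[i] + t * (targets[j] - targets[i]))
    return new_rows, new_targets


# Toy data (made-up values), only to show the call.
toy_rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
toy_targets = [1.0, 2.0, 3.0, 4.0]
extra_rows, extra_targets = smoter_sketch(toy_rows, toy_targets, n_new=5, k=2)
print(len(extra_rows), len(extra_targets))
```

With a real dataset you would pass the numeric feature columns as `rows` and the 'LOS' column as `targets`, then append the returned rows to the training split only, so that the validation and test data stay purely real.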

Upvotes: 3
