Reputation: 357
My dataset looks like the info provided in the picture. This is a regression problem where I have to predict 'LOS' (the last column). The dataset consists of around 2000 samples (rows). I would like to generate more rows (synthetic data) from the real data to improve my model's results.
I found this quite easy for classification tasks, but I am having difficulty doing it for regression.
Any help in a Python environment would be really appreciated.
Thanks in advance.
Upvotes: 0
Views: 1974
Reputation: 4264
You could use SMOGN.
From the documentation:
A Python implementation of Synthetic Minority Over-Sampling Technique for Regression with Gaussian Noise (SMOGN). Conducts the Synthetic Minority Over-Sampling Technique for Regression (SMOTER) with traditional interpolation, as well as with the introduction of Gaussian Noise (SMOTER-GN).
But take a look here before implementing it.
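To give a feel for the interpolation idea behind SMOTER, here is a minimal pure-Python sketch. It is not the SMOGN library's code or API: the function name `oversample_rare`, the fixed `threshold` used to decide which targets count as rare, and the toy data are all assumptions made for illustration. SMOGN itself decides rarity with a relevance function and can additionally perturb samples with Gaussian noise, which this sketch leaves out.

```python
import math
import random


def oversample_rare(rows, targets, threshold, n_new, k=3, seed=0):
    """Create n_new synthetic (features, target) pairs from rare samples.

    A sample counts as rare when its target is >= threshold. That rule is
    an assumption for this sketch, not how SMOGN defines rarity.
    """
    rng = random.Random(seed)
    rare = [(r, t) for r, t in zip(rows, targets) if t >= threshold]
    if len(rare) < 2:
        raise ValueError("need at least two rare samples to interpolate")

    synthetic = []
    for _ in range(n_new):
        base_row, base_y = rng.choice(rare)
        # Rank the other rare samples by Euclidean distance to the base one
        others = [(r, t) for r, t in rare if r is not base_row]
        others.sort(key=lambda pair: math.dist(base_row, pair[0]))
        # Pick one of the k nearest neighbours at random
        nb_row, nb_y = rng.choice(others[:k])
        # Move a random fraction of the way from the base to the neighbour;
        # the target is interpolated with the same fraction
        frac = rng.random()
        new_row = [a + frac * (b - a) for a, b in zip(base_row, nb_row)]
        new_y = base_y + frac * (nb_y - base_y)
        synthetic.append((new_row, new_y))
    return synthetic


# Toy data: two numeric features per row, target plays the role of 'LOS'
rows = [[1.0, 2.0], [2.0, 3.0], [10.0, 12.0], [11.0, 13.0], [12.0, 15.0]]
targets = [1.0, 2.0, 20.0, 22.0, 25.0]
new_samples = oversample_rare(rows, targets, threshold=20.0, n_new=4)
```

Each synthetic row lies on the line segment between two real rare rows, so it only makes sense for numeric features; categorical columns would need separate handling. Synthetic rows should be added to the training split only, never to the validation or test data.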
Upvotes: 3