Reputation: 21
I'm trying to write a model (YOLOv3) to detect various musical symbols on sheet music, but all the datasets suitable for this are built only from printed sheet music. Is there a way to adapt the model to handwritten symbols? Will pre-training Darknet-53 help with this? If I train Darknet-53 to recognize both handwritten and printed symbols, what effect will that have?
YOLOv3 architecture: [diagram]
Upvotes: 1
Views: 60
Reputation: 82
I agree with the previous commenters.
You can start by converting the image to grayscale (in case the handwritten notes are drawn in blue) and trying a model trained on printed sheets on the recognition of 1) printed sheets and 2) handwritten sheets, to get a baseline.
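For the grayscale step, a minimal sketch with OpenCV (the file paths are placeholders):

```python
import cv2

# Drop colour so blue-ink handwriting and black print land on the
# same intensity scale as the printed training data.
img = cv2.imread("sheet.png")                      # placeholder path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Optional: Otsu binarisation suppresses paper texture and scanner noise.
_, binary = cv2.threshold(gray, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imwrite("sheet_gray.png", gray)
```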
If you annotate a dataset of ~30-50 sheets, you can fine-tune a detector. That said, you will likely need a much larger dataset to train a high-quality detector (given the variation across different music sheets), unless you restrict yourself to a fairly controlled setting. One possible option is to create a semi-synthetic dataset of handwritten notes by replacing each printed note with one of its handwritten images, though it might be rather hard to extend this to complex music sheets; a rough sketch follows.
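A rough sketch of that semi-synthetic idea, assuming you already have bounding boxes for the printed notes and a folder of cropped handwritten note images (all names here are illustrative, not a fixed API):

```python
import random
from PIL import Image

def synthesize_sheet(sheet, note_boxes, handwritten_crops):
    """Replace every printed note on one sheet with a random handwritten crop.

    sheet             -- PIL image of a printed music sheet
    note_boxes        -- list of (x1, y1, x2, y2) boxes from the printed annotations
    handwritten_crops -- list of PIL images of isolated handwritten notes
    """
    out = sheet.copy()
    for x1, y1, x2, y2 in note_boxes:
        crop = random.choice(handwritten_crops).resize((x2 - x1, y2 - y1))
        # Blank out the printed note, then paste the handwritten one;
        # the original boxes remain valid labels for the new image.
        out.paste((255, 255, 255), (x1, y1, x2, y2))
        out.paste(crop, (x1, y1))
    return out
```

The nice property is that the existing printed-sheet annotations carry over unchanged, since each handwritten crop is pasted into the same box it replaces.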
Training on both printed and handwritten data at once might also work.
Looking further ahead, an approach like CycleGAN (or similar unpaired image-to-image translation) could help generate handwritten examples from unpaired sets of printed and handwritten music sheets, with no additional annotation required.
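A sketch of how the CycleGAN route would be used at inference time; the generator and checkpoint name are hypothetical (trained separately with any CycleGAN implementation), and the point is that the translation preserves the layout, so the printed-sheet boxes stay valid:

```python
import torch

# Hypothetical: a printed -> handwritten generator trained elsewhere
# and exported with torch.jit.save.
G = torch.jit.load("g_printed2handwritten.pt").eval()

# Stand-in batch of printed-sheet tiles, scaled to [-1, 1] as CycleGAN expects.
printed = torch.rand(4, 3, 256, 256) * 2 - 1
with torch.no_grad():
    handwritten_style = G(printed)   # same shape and geometry, new style
```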
And for the position detection, you can start with some simple examples where the notes have little or no overlap.
The literature on optical music recognition could also help if you are not familiar with the topic, depending on how far you want to go.
Good luck!
Upvotes: 0