Reputation: 19
I have a list of sentences like the ones below:
They are some sentences I extracted from job descriptions. I want to extract information like: degree type, major, required or preferred. There are
The result should be like : { degree: Bachelor, major : Computer Science, required: True }
There are no obvious rules in these sentences. How can I achieve this goal?
Bachelor ’ s degree in Computer Science or equivalent
Pursuing B.S. or advanced degree in computer science or related technical/engineering degree .
Bachelor 's Degree in Computer Science or equivalent experience
Youre educated ( BS/MS in Computer Science or other technical degree ) .
•BS in Computer Science , Digital Media or similar technical degree with 3 + years of experience
· Bachelors degree .
Bachelor 's degree in computer science , design or related field
Ability to absorb , master and leverage emerging technologies
BA/BS degree or equivalent practical experience
Education Required : Bachelors Degree
• Bachelor 's degree in related field , OR four ( 4 ) years of experience in a directly related field .
Upvotes: 0
Views: 823
Reputation: 691
Another suggestion to do this would be:
Hope this helps.
Upvotes: 0
Reputation: 280
You are dealing with unstructured data, so I hope the following steps can get you to a decent accuracy level.
Overview of hierarchical rules:
Try to modify these rules on each iteration of your code, and keep adding new rules. This is just the basic approach; I believe that if you iterate on your methodology a few times, you will be able to extract the information.
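One way to keep such rules easy to revise is an ordered list that is checked top to bottom, where the first match wins. The sketch below is only an illustration of that idea: the patterns, the labels, and the `classify` helper are hypothetical placeholders, not a tested rule set.

```python
import re

# Hypothetical rules, checked top to bottom; the first match wins.
# Each entry is (compiled pattern, label). Extend or reorder on each iteration.
RULES = [
    (re.compile(r"\b(preferred|desired)\b", re.I), "preferred"),
    (re.compile(r"\b(required|must have)\b", re.I), "required"),
    (re.compile(r"\bor equivalent\b", re.I), "required_or_equivalent"),
]


def classify(sentence, rules=RULES, default="unknown"):
    """Return the label of the first rule whose pattern matches the sentence."""
    for pattern, label in rules:
        if pattern.search(sentence):
            return label
    return default


if __name__ == "__main__":
    for s in [
        "Education Required : Bachelors Degree",
        "BA/BS degree or equivalent practical experience",
        "· Bachelors degree .",
    ]:
        print(classify(s), "<-", s)
```

Sentences that fall through to `"unknown"` are the ones to inspect when deciding which rule to add next.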
Upvotes: 1
Reputation: 6039
You probably need to gather a list of majors and degrees (for example: http://en.wikipedia.org/wiki/List_of_tagged_degrees ) to extract the degree and major. Then, based on some general rules (or by designing a classifier), decide on "required" or "not required".
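A minimal sketch of this lookup-list idea might look like the following. The `DEGREES` patterns, the `MAJORS` list, and the `PREFERRED_CUES` are tiny hand-made assumptions for illustration; real lists would be much longer. Treating a degree as required unless a "preferred" cue appears is also just an assumed default.

```python
import re

# Hypothetical, hand-made lookup tables; real lists would be far longer.
DEGREES = {
    "Bachelor": re.compile(r"\bbachelors?\b|\bb\.?[as]\b"),
    # Require "degree" after "master" so the verb "master" is not matched.
    "Master": re.compile(r"\bmasters?(?:\s*['’]\s*s)?\s+degree\b|\bm\.?s\b"),
}
MAJORS = ["computer science", "digital media", "design"]
PREFERRED_CUES = ("preferred", "desired", "nice to have")


def extract(sentence):
    """Return degree/major/required info, or None if no degree is mentioned."""
    text = sentence.lower()
    degrees = [name for name, pat in DEGREES.items() if pat.search(text)]
    if not degrees:
        return None
    majors = [m for m in MAJORS if m in text]
    # Assumed default: required unless a "preferred"-style cue is present.
    required = not any(cue in text for cue in PREFERRED_CUES)
    return {"degree": degrees, "major": majors, "required": required}


if __name__ == "__main__":
    print(extract("Bachelor 's degree in computer science , design or related field"))
    print(extract("Ability to absorb , master and leverage emerging technologies"))
```

Plain substring matching on majors is crude (for example, "design" also matches "designer"), so the lists and cues would need tuning against real data.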
Upvotes: 0