Extracting information from text in Python

Question

I am new to text mining. I have a CSV file, and I need to go through each line, extract some information, and write it to another CSV file. The information I am looking for is defined by a dictionary of keywords. Consider the sentence below:

"the application version is 1.8.2 and the variable skt.len passes the required information. file ReadMe.txt has the specifications."

My dictionary is: ["application version", "variable", "file"]

I need to extract:

application version: 1.8.2
variable: skt.len
file: ReadMe.txt

What is the best way to extract such information from text? I have been experimenting with NLTK and StanfordCoreNLP features, but I have not been able to extract the information yet. I am thinking of using a regex to extract the application version. Any ideas?
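To make the regex idea concrete, here is a minimal sketch. It assumes the value always appears as the first whitespace-delimited token after the keyword, optionally preceded by "is", which holds for the example sentence but may not hold for other lines. The names `KEYWORDS` and `extract` are hypothetical and chosen only for illustration.

```python
import re

# Keywords taken from the dictionary in the question.
KEYWORDS = ["application version", "variable", "file"]


def extract(sentence, keywords=KEYWORDS):
    """Return {keyword: value} for each keyword found in the sentence.

    Assumption: the value is the first token after the keyword,
    optionally preceded by the word "is".
    """
    found = {}
    for kw in keywords:
        # Keyword as a whole word, optional " is", then one non-space token.
        pattern = r"\b" + re.escape(kw) + r"\b(?:\s+is)?\s+(\S+)"
        match = re.search(pattern, sentence, flags=re.IGNORECASE)
        if match:
            # Drop sentence punctuation that sticks to the end of the token.
            found[kw] = match.group(1).rstrip(".,;:")
    return found


sentence = (
    "the application version is 1.8.2 and the variable skt.len passes "
    "the required information. file ReadMe.txt has the specifications."
)
result = extract(sentence)
print(result)
```

This treats every keyword the same way. A stricter pattern per keyword (for example, digits and dots only for the version) would reduce false matches at the cost of more maintenance.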

PS: I know this may complicate the task, but the sentences in each line of the CSV file may have different structures. For example, "application version" in one line may be "app version" in another, and "file" in one line may be "filename" in another.
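One way to cope with the varying wording is to map each canonical field to a list of known aliases and build one regex per field. The sketch below is only an illustration under stated assumptions: the alias table `ALIASES` is hypothetical and would need to be filled in from the real data, each input row is assumed to hold the sentence in its first column, and the value is again assumed to be the first token after the keyword. It uses in-memory files so that it runs as is; real file handles opened with `newline=""` could be passed to `convert` instead.

```python
import csv
import io
import re

# Hypothetical alias table: canonical field -> wordings seen in the data.
ALIASES = {
    "application version": ["application version", "app version"],
    "variable": ["variable"],
    "file": ["filename", "file"],
}


def build_patterns(aliases):
    """Compile one case-insensitive regex per canonical field."""
    patterns = {}
    for field, names in aliases.items():
        # Try longer aliases first as a precaution against prefix overlap.
        ordered = sorted(names, key=len, reverse=True)
        alternatives = "|".join(re.escape(name) for name in ordered)
        patterns[field] = re.compile(
            r"\b(?:" + alternatives + r")\b(?:\s+is)?\s+(\S+)",
            re.IGNORECASE,
        )
    return patterns


PATTERNS = build_patterns(ALIASES)


def extract_row(sentence):
    """Return one output row; fields that are not found stay empty."""
    row = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(sentence)
        row[field] = match.group(1).rstrip(".,;:") if match else ""
    return row


def convert(src, dst):
    """Read sentences from the first CSV column of src, write fields to dst."""
    writer = csv.DictWriter(dst, fieldnames=list(ALIASES))
    writer.writeheader()
    for record in csv.reader(src):
        if record:  # skip blank lines
            writer.writerow(extract_row(record[0]))


# Demo with in-memory files instead of files on disk.
src = io.StringIO(
    '"the app version is 2.0.1 and the variable buf.size is used. '
    'filename notes.md has details."\n'
)
dst = io.StringIO()
convert(src, dst)
rows = list(csv.DictReader(io.StringIO(dst.getvalue())))
print(rows)
```

An alias table only covers wordings that have been seen before. If the sentences vary more freely, fuzzy matching or an NLP pipeline may be a better fit than hand-written patterns.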

Answers (1)
