Creating a table which has sentences from a paragraph each on a row with Python

Question

I have an abstract that I've split into sentences in Python. I want to write the results to two tables. The first should have these columns: abstract ID (the file number I extracted from my document), sentence ID (automatically generated), and the sentence itself, with each sentence of the abstract on its own row. I would like a table that looks like this:

abstractID  SentenceID  Sentence
a9001755    0000001     Myxococcus xanthus development is regulated by (1st sentence)
a9001755    0000002     The C signal appears to be the polypeptide product (2nd sentence)

The second table, NSFClasses, should have the columns abstractID and nsfOrg. How can I write the sentences (one per row) to the table and assign the sentence ID as shown above?

This is my code:

import glob
import re
import json

org = "NSF Org"
fileNo = "File"
AbstractString = "Abstract"
abstractFlag = False
abstractContent = []
path = 'awardsFile/awd_1990_00/*.txt'
files = glob.glob(path)
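for name in files:
    # First pass: pull the file number and the NSF organization from the header lines.
    with open(name, 'r') as fileA:
        for line in fileA:
            if fileNo in line:
                # strip() drops the trailing newline from the file number
                abstractId = line[14:].strip()
            if org in line:
                nsfOrg = line[14:].split()
    print(abstractId)
    print(nsfOrg)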
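    # Second pass: the abstract is everything after the last ':' in the file.
    with open(name, 'r') as fileA:
        content = fileA.read().split(':')
    abstract = content[-1]
    # Collapse newlines and repeated whitespace into single spaces
    # (replacing '\n' with '' would glue words together across line breaks).
    abstract = ' '.join(abstract.split())
    # Naive sentence split on '.'; drop the empty piece after the final period.
    sentences = [s.strip() for s in abstract.split('.') if s.strip()]
    print(sentences)
    key = str(len(sentences))
    print("Sentences--- ")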

Answers (1)
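
One possible approach, sketched under some assumptions: the question does not say what kind of table is meant, so this example assumes a SQLite database accessed through the standard-library sqlite3 module. It also assumes that sentence IDs restart at 0000001 for each abstract and are zero-padded to seven digits, as in the sample rows. The helper names (create_tables, split_sentences, store_abstract), the Sentences table name, the placeholder abstract text, and the 'ORG' value are all hypothetical and only there for illustration.

```python
import sqlite3


def create_tables(conn):
    """Create the two tables from the question if they do not exist yet."""
    conn.execute(
        'CREATE TABLE IF NOT EXISTS Sentences '
        '(abstractID TEXT, sentenceID TEXT, sentence TEXT)'
    )
    conn.execute(
        'CREATE TABLE IF NOT EXISTS NSFClasses (abstractID TEXT, nsfOrg TEXT)'
    )


def split_sentences(abstract):
    """Naive split on '.', as in the question; empty pieces are dropped."""
    return [s.strip() for s in abstract.split('.') if s.strip()]


def store_abstract(conn, abstract_id, nsf_org, abstract):
    """Insert one Sentences row per sentence and one NSFClasses row."""
    rows = [
        # enumerate(..., start=1) generates the sentence number;
        # '%07d' zero-pads it to seven digits, e.g. 1 -> '0000001'.
        (abstract_id, '%07d' % number, sentence)
        for number, sentence in enumerate(split_sentences(abstract), start=1)
    ]
    conn.executemany(
        'INSERT INTO Sentences (abstractID, sentenceID, sentence) '
        'VALUES (?, ?, ?)',
        rows,
    )
    conn.execute(
        'INSERT INTO NSFClasses (abstractID, nsfOrg) VALUES (?, ?)',
        (abstract_id, nsf_org),
    )
    conn.commit()
    return len(rows)


# Assumption: an in-memory database; pass a file path to persist the data.
conn = sqlite3.connect(':memory:')
create_tables(conn)

# Placeholder abstract text and organization code, not real data.
store_abstract(
    conn,
    'a9001755',
    'ORG',
    'First placeholder sentence. Second placeholder sentence.',
)

for row in conn.execute(
    'SELECT abstractID, sentenceID, sentence FROM Sentences ORDER BY sentenceID'
):
    print(row)
```

In the loop from the question, store_abstract would be called once per file with the extracted file number, the NSF organization, and the cleaned abstract string. If the IDs should instead be unique across all abstracts, an INTEGER PRIMARY KEY column in SQLite could generate them automatically, at the cost of losing the zero-padded text format.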
