Adhiraj Chattopadhyay

Reputation: 167

Problem splitting a line when there is no delimiter

I have a text file:

... Above in Table 5 , we understood the relationship between pressure and volume. It said ... and now we know ... . Table 9: represents the graph of x and y. Table 6 was all about force and it implications on objects....

I have written code to extract the lines that contain the word Table:

# pathname and filename are placeholders for the real location of the text file
with open(pathname + filename, 'r') as f:
    for line in f:
        if ' Table ' in line:
            print(line)

Now I want to print the output in a particular format:

(txt file name),(Table id),(Table content)

I do this using Python's str.split method:

x = 'Paper ID:' + filename.split('.')[0] + '|' + 'Table ID:' + line.split(':')[0] + '|' + 'Table Content:' + line.split(':')[1] + '|' 

As you can see, I can separate the table ID from the table content when a delimiter (:) follows the table ID. How do I do the same when there is no delimiter, as in these lines?

Above in Table 5 , we understood the relationship between pressure and volume. It said ... and now we know .. Or In table 7 we saw....

Could anyone please help?

Upvotes: 1

Views: 49

Answers (1)

slybloty

Reputation: 6516

You could search for the pattern Table <number> with Python's re module and split the line at that location, using either re.split(pattern, string, maxsplit=0, flags=0) or re.findall(pattern, string, flags=0).

re.split(r'Table [0-9]+', line)[-1]

gives you the text that follows the last match (the content).

re.findall(r'Table [0-9]+', line)

returns every matched "Table <number>" string, from which you can extract the ID.

Python documentation on re.split and re.findall
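As a rough sketch of how the two ideas could be combined, the example below splits on a pattern with a capturing group, so re.split returns the table numbers together with the text between them. The names TABLE_RE, extract_tables, and format_record, and the 'paper1' ID, are made up for illustration. It also assumes that matching should be case-insensitive (to catch "table 7") and that a colon or comma right after the number should be dropped, which may not suit every file.

```python
import re

# Matches "Table 5", "table 7", "Table 9:". The optional colon or comma after
# the number is consumed so it does not end up in the content (an assumption).
TABLE_RE = re.compile(r'\bTable\s+(\d+)\s*[:,]?\s*', re.IGNORECASE)

def extract_tables(line):
    """Return a list of (table_id, content) pairs found in line."""
    # With one capturing group, re.split returns
    # [text_before, id1, content1, id2, content2, ...].
    parts = TABLE_RE.split(line)
    ids = parts[1::2]
    contents = parts[2::2]
    return [(table_id, content.strip()) for table_id, content in zip(ids, contents)]

def format_record(paper_id, table_id, content):
    """Build one output record in the pipe-separated format from the question."""
    return f'Paper ID:{paper_id}|Table ID:{table_id}|Table Content:{content}|'

line = ('Above in Table 5 , we understood the relationship between pressure '
        'and volume. Table 9: represents the graph of x and y.')
for table_id, content in extract_tables(line):
    print(format_record('paper1', table_id, content))
```

Here the content of each table is taken to run until the next "Table <number>" mention or the end of the line. If a line mentions a table only in passing, that boundary may need a stricter rule.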

Upvotes: 1
