Reputation: 301
I have a .txt file from which I want to extract the text between certain locations in Python. To do this, I made a list of the indices of those locations, so I can slice between them to get the text and then append it to a different .txt file.
For instance (pseudocode):
indexList = [188, 1089, 364, 5697, 2, 5230, 2683, 2956]
with open(str(mytxtfile), 'r') as f:
    for line in f:
        # extract the text between location 2683 and location 2956
        # append it to a text variable
        # repeat this for the entire list
Upvotes: 1
Views: 430
Reputation: 1172
You can represent the start/end character positions using a list of tuples. You can read the entire contents of the file into a string variable using fileDescriptor.read(). You can then use string slicing to get the text at specific offsets, e.g. x = "abcdefg"; x[2:5] is "cde".
indexList = [(188, 1089), (364, 5697), (2, 5230), (2683, 2956)]
with open(str(mytxtfile), 'r') as f:
    contents = f.read()
textFragments = []
for start, end in indexList:
    textFragments.append(contents[start:end])
# textFragments[0] = text between positions 188 and 1089
# textFragments[1] = text between positions 364 and 5697
# and so forth
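The question starts from a flat list rather than a list of tuples. A minimal sketch of turning one into the other, assuming consecutive items in the flat list form (start, end) pairs; the helper names pair_offsets and extract_fragments are made up for illustration:

```python
def pair_offsets(flat_offsets):
    """Group a flat list [s0, e0, s1, e1, ...] into [(s0, e0), (s1, e1), ...]."""
    if len(flat_offsets) % 2 != 0:
        raise ValueError("expected an even number of offsets")
    # Pair the even-indexed items (starts) with the odd-indexed items (ends).
    return list(zip(flat_offsets[0::2], flat_offsets[1::2]))


def extract_fragments(contents, flat_offsets):
    """Return the slice of contents for each (start, end) pair."""
    return [contents[start:end] for start, end in pair_offsets(flat_offsets)]


index_list = [188, 1089, 364, 5697, 2, 5230, 2683, 2956]
pairs = pair_offsets(index_list)
```

Here pairs matches the list of tuples used above. If the pairs in your data are not guaranteed to be ordered as (start, end), you would need to sort each pair first, since a slice with start greater than end is an empty string.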
If you want all of these fragments in one string variable, you can concatenate them using join, like this:
''.join(textFragments)
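To cover the last part of the question (appending the result to a different .txt file), here is a rough sketch. The function name append_fragments, the file names, and the UTF-8 encoding are assumptions for the example, not something given in the question:

```python
import os
import tempfile


def append_fragments(source_path, target_path, ranges):
    """Slice each (start, end) range out of the source file and append the
    concatenated text to the target file. Offsets are character positions
    in the decoded text, not byte positions."""
    with open(source_path, "r", encoding="utf-8") as src:
        contents = src.read()
    combined = "".join(contents[start:end] for start, end in ranges)
    # Mode "a" appends, and creates the file if it does not exist yet.
    with open(target_path, "a", encoding="utf-8") as dst:
        dst.write(combined)
    return combined


# Small demonstration using throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "input.txt")
    target = os.path.join(tmp, "output.txt")
    with open(source, "w", encoding="utf-8") as f:
        f.write("abcdefghij")
    append_fragments(source, target, [(2, 5), (7, 9)])
    with open(target, "r", encoding="utf-8") as f:
        result = f.read()
```

Opening the target with "a" rather than "w" keeps whatever is already in that file, which seems to be what "append" in the question calls for.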
Upvotes: 1