Mine

Reputation: 861

How to join lines after a condition in a text file (Python)?

I have a text file in the format below. It contains multiple "context" parts, each with text spanning several lines, followed by a one-line topic and then several questions, each with its own id, about the context paragraph. I want to store the contexts in a list, with each context as one element. My approach was to take all the lines between a line that starts with "context" and the next line that starts with "topic". However, once I add that condition, I cannot join the lines that belong to one context into a single string. The file format and my code are below.

context : 
|

topic: 

|
question: 

answer: 

id: 
|
question: 

answer: 

id: 

|

context: 
|

topic: 

|
question: 

answer: 

id: 
.
.
.

context =  []
f = open("example.txt","r")
context_line = True
for line in f:
  if not line.strip():
    continue
  str1 = ""
  if line.startswith("context"):
    context_line = True
  elif line.startswith("topic"):
    context_line = False
  if context_line:
    # Here how can I join the lines?
    str1 += line.rstrip("\n").lstrip("\ufeff").strip("|")
  context.append(str1)

Upvotes: 0

Views: 702

Answers (1)

Gianluca Micchi

Reputation: 1653

You can keep track of all the lines in the context and join them when the topic part starts:

context = []
str1 = []  # lines of the context currently being read
context_line = True
with open("example.txt", "r") as f:  # the file is closed automatically
  for line in f:
    if not line.strip():
      continue  # skip blank lines
    if line.startswith("context"):
      context_line = True
      str1 = []  # start collecting a new context
    elif line.startswith("topic"):
      lines = ' '.join(str1)  # here you can choose how to join the lines
      context.append(lines)
      context_line = False
    if context_line:
      str1.append(line.rstrip("\n").lstrip("\ufeff").strip("|"))

On a side note, be aware that this method does not check that the input file is correctly formatted. In particular, if a context section is not immediately followed by a topic section, it will not work as intended.
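
If you want the format problem mentioned above to fail loudly instead of silently producing wrong output, here is a minimal sketch of the same idea with a few checks added. The function name collect_contexts and the joiner parameter are hypothetical, and two things differ from the loop above by assumption: the "context" label itself is dropped (only text after the colon is kept), and empty lines left over after stripping "|" are skipped. Like the original, it treats any line starting with "context" or "topic" as a section marker.

```python
def collect_contexts(lines, joiner=" "):
    """Return one joined string per context section.

    Raises ValueError when a section marker appears out of order.
    """
    contexts = []
    current = None  # list of lines for the open context, or None outside one
    for raw in lines:
        # Strip the newline and a possible byte order mark before testing prefixes.
        line = raw.rstrip("\n").lstrip("\ufeff")
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("context"):
            if current is not None:
                raise ValueError("context section not closed by a topic line")
            current = []
            # Assumption: keep only the text after the colon on the header line.
            line = line.partition(":")[2]
        elif line.startswith("topic"):
            if current is None:
                raise ValueError("topic line without a preceding context")
            contexts.append(joiner.join(current))
            current = None
            continue
        if current is not None:
            text = line.strip().strip("|").strip()
            if text:
                current.append(text)
    if current is not None:
        raise ValueError("input ended inside a context section")
    return contexts
```

Since an open file is an iterable of lines, it can be called as collect_contexts(f) inside a "with open(...)" block, or with a plain list of strings for testing.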

Upvotes: 1
