Reputation: 229
I have a file and when I open it, it prints out some paragraphs. I need to join these paragraphs together with a space to form one big body of text.
For example:
    for data in open('file.txt'):
        print data
has an output like this:
Hello my name is blah. What is your name?
Hello your name is blah. What is my name?
How can I make the output look like this?
Hello my name is blah. What is your name? Hello your name is blah. What is my name?
I've tried replacing the newlines with a space like so:
    for data in open('file.txt'):
        updatedData = data.replace('\n',' ')
but that only gets rid of the empty lines; it doesn't join the paragraphs.
I also tried joining like so:
    for data in open('file.txt'):
        joinedData = " ".join(data)
but that separates each character with a space, while not getting rid of the paragraph format either.
Upvotes: 15
Views: 80702
Reputation: 846
If anyone's doing this in pandas, where you have all lines in a particular column, you can use the following:
    import pandas as pd
    # line is the name of the column containing all lines in df
    # to_string() keeps the index and one row per line, so join the values instead
    " ".join(df.line.astype(str))
Upvotes: 0
Reputation: 1121524
You are looping over individual lines, each of which still ends in a newline, and the print statement adds another newline on top of that. The following would work:
    for data in open('file.txt'):
        print data.rstrip('\n'),
With the trailing comma, print doesn't add a newline, and the .rstrip() call removes just the trailing newline from the line.
Alternatively, you need to pass all read and stripped lines to ' '.join(), not each line itself. Strings in Python are sequences too, so the string in each line is interpreted as separate characters when passed on its own to ' '.join().
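To make the difference concrete, here is a small sketch of what ' '.join() does with a single string versus a list of stripped lines. The sample strings and variable names are made up for illustration, and the snippet uses the print() function so it runs on Python 3.

```python
# A single line as it comes out of the file, trailing newline included.
text = "Hello\n"

# A string is a sequence of characters, so join() puts the separator
# between every character, newline included.
spaced_chars = " ".join(text)

# A list of lines is a sequence of whole strings, so join() puts the
# separator between the lines instead.
lines = ["Hello my name is blah.\n", "What is your name?\n"]
joined_lines = " ".join(s.rstrip("\n") for s in lines)

print(repr(spaced_chars))
print(repr(joined_lines))
```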
The following code uses two new tricks, a context manager and a list comprehension:
    with open('file.txt') as inputfile:
        print ' '.join([line.rstrip('\n') for line in inputfile])
The with statement uses the file object as a context manager, meaning the file will be automatically closed when we are done with the block indented below the with statement. The [.. for .. in ..] syntax generates a list from the inputfile object where we turn each line into a version without a newline at the end.
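If you want to see both tricks at work, the sketch below wraps them in a function and checks that the file really is closed once the with block ends. The function name join_lines and the temporary sample file are my own additions for the demo, not part of the original code, and it is written for Python 3.

```python
import os
import tempfile


def join_lines(path):
    """Return every line of the file at `path` joined by single spaces."""
    with open(path) as inputfile:
        joined = " ".join([line.rstrip("\n") for line in inputfile])
    # The with block has ended, so the file is already closed here.
    assert inputfile.closed
    return joined


# Write a small sample file to a temporary location for the demo.
fd, sample_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as sample:
    sample.write("Hello my name is blah. What is your name?\n")
    sample.write("Hello your name is blah. What is my name?\n")

result = join_lines(sample_path)
os.remove(sample_path)
print(result)
```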
Upvotes: 7
Reputation: 250901
You could use str.join:
    with open('file.txt') as f:
        print " ".join(line.strip() for line in f)
line.strip() will remove all types of whitespace from both ends of the line. You can use line.rstrip("\n") to remove only the trailing "\n".
If file.txt contains:
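A quick way to compare the two calls is to run them on a line that has extra spaces on both sides. The sample string is invented for illustration; the snippet runs on Python 3.

```python
# A line with leading spaces, trailing spaces and a trailing newline.
line = "  Hello my name is blah.  \n"

# strip() removes the spaces and the newline from both ends.
both_ends = line.strip()

# rstrip("\n") removes only the trailing newline and keeps the spaces.
newline_only = line.rstrip("\n")

print(repr(both_ends))
print(repr(newline_only))
```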
Hello my name is blah. What is your name?
Hello your name is blah. What is my name?
Then the output would be:
Hello my name is blah. What is your name? Hello your name is blah. What is my name?
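One caveat: if the file also has blank lines between the paragraphs (an assumption on my part, the sample above doesn't show any), each blank line becomes an empty string and you end up with doubled spaces. A possible variant is to skip lines that are empty after stripping. The sketch below uses io.StringIO as a stand-in for the opened file and runs on Python 3.

```python
import io

# Stand-in for open('file.txt'); the blank line in the middle is assumed.
sample = io.StringIO(
    "Hello my name is blah. What is your name?\n"
    "\n"
    "Hello your name is blah. What is my name?\n"
)

# Keep only the lines that still have content after stripping.
joined = " ".join(line.strip() for line in sample if line.strip())
print(joined)
```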
Upvotes: 28