Reputation: 23
I'm trying to write a very simple Ruby script that opens a text file and removes the \n from the end of each line UNLESS the line starts with a non-alphabetic character OR the line itself is blank (\n).
The code below works fine, except that it skips all of the content beyond the last \n line. When I add \n\n to the end of the file, it works perfectly. Examples: A file with this text in it works great and pulls everything to one line:
Hello
there my
friend how are you?
becomes Hello there my friend how are you?
But text like this:
Hello
there
my friend
how
are you today
returns just Hello and There, and completely skips the last 3 lines. If I add 2 blank lines to the end, it will pick up everything and behave as I want it to.
Can anybody explain to me why this happens? Obviously I know I can fix this instance by appending \n\n to the end of the source file at the start, but that doesn't help me understand why the .gets isn't working as I'd expect.
Thanks in advance for any help!
source_file_name = "somefile.txt"
destination_file_name = "some_other_file.txt"
source_file = File.new(source_file_name, "r")
para = []
x = ""
while (line = source_file.gets)
  if line != "\n"
    if line[0].match(/[A-z]/) #If the first character is a letter
      x += line.chomp + " "
    else
      x += "\n" + line.chomp + " "
    end
  else
    para[para.length] = x
    x = ""
  end
end
source_file.close
fixed_file = File.open(destination_file_name, "w")
para.each do |paragraph|
  fixed_file << "#{paragraph}\n\n"
end
fixed_file.close
Upvotes: 2
Views: 804
Reputation: 416
Your problem is that you only add your string x to the para array when you encounter an empty line ("\n"). Since your second example does not end with an empty line, the final contents of x are never added to the para array.
The easy way to fix this without changing any of your existing code is to add the following lines right after the end of your while loop:
if x != ""
  para.push(x)
end
I would prefer to add the strings to my array right away rather than appending them onto x until you hit an empty line, but this should work with your solution.
Also,
para.push(x)
para << x
both read much nicer and look more straightforward than
para[para.length] = x
That one threw me off for a second, since in languages with fixed-size arrays, assigning to the index one past the last element would give you an error. I advise using one of those instead, simply because it's more readable.
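To make the fix concrete, here is a minimal sketch of the same loop with the leftover contents flushed after it ends. The method name join_paragraphs and the sample text are made up for illustration, and the input is assumed to have blank lines between paragraphs as in the question. It also uses /[A-Za-z]/ rather than /[A-z]/, since the latter range additionally matches the characters [ \ ] ^ _ and the backtick.

```ruby
# Hypothetical helper (not from the original post): collects lines into
# paragraphs, joining a line onto the previous one when it starts with a letter.
def join_paragraphs(lines)
  paragraphs = []
  current = ""
  lines.each do |line|
    if line.chomp.empty?
      # Blank line: the current paragraph is finished.
      paragraphs << current
      current = ""
    elsif line =~ /\A[A-Za-z]/
      # Starts with a letter: keep it on the same output line.
      current += line.chomp + " "
    else
      # Starts with anything else: begin a new output line first.
      current += "\n" + line.chomp + " "
    end
  end
  # The actual fix: flush what is left when the input
  # does not end with a blank line.
  paragraphs << current unless current.empty?
  paragraphs
end

# Sample input assumed for illustration; note there is no trailing blank line.
text = "Hello\n\nthere\n\nmy friend\nhow\nare you today"
p join_paragraphs(text.each_line)
```

Without the flush after the loop, the last element ("my friend how are you today ") would be lost, which is exactly the behaviour described in the question.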
Upvotes: 2
Reputation: 54984
It's easier to use a multiline regex. Maybe:
source_file.read.gsub(/(?<!\n)\n([a-z])/im, ' \\1')
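A quick way to see what that pattern does is to run it on a string instead of a file. This is only a sketch with made-up sample text: the lookbehind (?<!\n) skips a newline that directly follows another newline, so blank lines survive, and the captured letter is put back after a space.

```ruby
# Sample text assumed for illustration: one paragraph, a blank line,
# then a line that starts with a digit followed by a continuation line.
text = "Hello\nthere my\nfriend how are you?\n\n1. first\nsecond\n"

# Replace a newline with a space only when it is not preceded by another
# newline and the next line starts with a letter.
joined = text.gsub(/(?<!\n)\n([a-z])/im, ' \\1')

puts joined
```

The newline before "1. first" is left alone because the next character is not a letter, and the blank line is kept because neither of its two newlines matches the pattern.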
Upvotes: 1
Reputation: 12349
Your code reads like C code to me. The Ruby way would be something like this, which replaces all of your code above:
File.write "dest.txt", File.read("src.txt")
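As written, that line copies the file unchanged, so the transformation still has to go between the read and the write. One possible sketch, assuming the regex from the other answer does what you want, and using a temporary directory and made-up file contents so it can be run anywhere:

```ruby
require "tmpdir"

result = nil
Dir.mktmpdir do |dir|
  # Hypothetical file names inside a throwaway directory.
  src  = File.join(dir, "src.txt")
  dest = File.join(dir, "dest.txt")

  # Sample input assumed for illustration; no trailing blank line.
  File.write(src, "Hello\nthere\n\nmy friend\nhow\nare you today")

  # Read everything, join the lines, and write the result in one expression.
  File.write(dest, File.read(src).gsub(/(?<!\n)\n([a-z])/i, ' \\1'))

  result = File.read(dest)
end

puts result
```

Because the whole file is read at once, there is no loop state left over at the end, so the missing trailing blank line is not an issue here.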
Upvotes: 1