Reputation: 261
I have a question about coding in Python for a homework assignment, but I should mention first that I've never coded in Python before. The assignment is supposed to get us used to the basics, so apologies ahead of time for my lack of knowledge (and the really long post).
Our job is to modify the file randline.py (given here in its original form):
import random, sys
from optparse import OptionParser
class randline:
    def __init__(self, filename):
        f = open(filename, 'r')
        self.lines = f.readlines()
        f.close()
    def chooseline(self):
        return random.choice(self.lines)
def main():
    version_msg = "%prog 2.0"
    usage_msg = """%prog [OPTION]... FILE
Output randomly selected lines from FILE."""
    parser = OptionParser(version=version_msg,
                          usage=usage_msg)
    parser.add_option("-n", "--numlines",
                      action="store", dest="numlines", default=1,
                      help="output NUMLINES lines (default 1)")
    options, args = parser.parse_args(sys.argv[1:])
    try:
        numlines = int(options.numlines)
    except:
        parser.error("invalid NUMLINES: {0}".
                     format(options.numlines))
    if numlines < 0:
        parser.error("negative count: {0}".
                     format(numlines))
    if len(args) != 1:
        parser.error("wrong number of operands")
    input_file = args[0]
    try:
        generator = randline(input_file)
        for index in range(numlines):
            sys.stdout.write(generator.chooseline())
    except IOError as (errno, strerror):
        parser.error("I/O error({0}): {1}".
                     format(errno, strerror))
if __name__ == "__main__":
    main()
We need to make it so that the program can take multiple files instead of just one, while still treating all the files as if they were one large file. Also, if one of the files does not end in a newline, we have to append one. I tried this, but my version adds a newline to the end of each file regardless of whether it originally ended in one. Plus my syntax is wrong to begin with: I get an error every time I try to run the modified program.
I also have to add new options. I have unique working, but another option I'm supposed to add is without-replacement, which makes each output line appear at most as many times as it appeared in the input (if we don't use the -u option, the output can contain a duplicate only if the input file had that duplicate to begin with). I know my method is wrong, since sets automatically get rid of all duplicates and I only want the output lines written without replacement. But I have no idea what else I can use.
import random, sys, string
from optparse import OptionParser
version_msg = "%prog 2.0"
usage_msg = """%prog [OPTION]... FILE
Output randomly selected lines from FILE"""
parser = OptionParser(version=version_msg,
                      usage=usage_msg)
parser.add_option("-n", "--numlines",
                  action="store", dest="numlines", default=1,
                  help="output NUMLINES lines (default 1)")
parser.add_option("-u", "--unique", action="store_true",
                  dest="unique", default=False,
                  help="ignores duplicate lines in a file")
parser.add_option("-w", "--without-replacement", action="store_true",
                  dest="without", default=False,
                  help="prints lines without replacement")
options, args = parser.parse_args(sys.argv[1:])
without = bool(options.without)
unique = bool(options.unique)
try:
    numlines = int(options.numlines)
except:
    parser.error("invalid NUMLINES: {0}".
                 format(options.numlines))
def main():
    if numlines < 0:
        parser.error("negative count: {0}".
                     format(numlines))
    ##Here is one of the major changes
    input_file = args[0]
    count = 0
    while (count < len(args)-1):
        input_file = input_file + '\n' + args[count + 1]
        count = count + 1
    ##And here
    try:
        generator = randline(input_file)
        for index in range(numlines):
            if options.without:
                line = generator.chooseline()
                if line not in no_repeat:
                    sys.stdout.write(line)
                    no_repeat.add(line)
            else:
                sys.stdout.write(generator.chooseline())
    except IOError as (errno, strerror):
        parser.error("I/O error({0}): {1}".
                     format(errno, strerror))
class randline:
    def __init__(self, filename):
        if unique:
            uniquelines = set(open(filename).readlines())
            f = open(filename, 'w').writelines(set(uniquelines))
        f = open(filename, 'r')
        if without:
            countlines = len(f.readlines())
            if (countlines < numlines):
                parser.error("too few lines in input".
                             format(without))
        self.lines = f.readlines()
        f.close()
    def chooseline(self):
        return random.choice(self.lines)
if __name__ == "__main__":
    main()
To sum up, I can't get it to read multiple files properly (while still treating all the files as one long file), and the without-replacement option doesn't work correctly either.
EDIT: Ah, I realized that since I was passing file names in the argument list, the files aren't treated as strings even though they're only text files. I tried to change it, but it still doesn't work exactly:
input_file = args[0]
count = 0
content = open(args[0]).read()
while (count < len(args) - 1):
    content = content + open(args[count + 1]).read()
    count = count + 1
open(input_file, 'wb').write(content)
try:
    generator = randline(input_file)
It keeps adding an extra newline between the two files. I want the files joined line by line, but I get a blank line between where the first file ends and the second one begins.
EDIT 2.0: Oh wait, got it. Whoops. I just need help with the without-replacement option. I think I should split the input line by line and store it in a list to check against every time? Is there a more efficient way (using only the modules I've already imported; we can't use anything else)?
Upvotes: 0
Views: 470
Reputation: 425
First of all, you need to read the contents of the first input file too, so you need to open() it just like the others. To add the newline, modify the while loop to check whether each file (after you open() it) ends with a newline, and if it doesn't, append one.
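As a rough sketch of that idea (not the only way to do it): the helper below, which I'm calling read_all_lines purely for illustration, opens every file name it is given, adds a newline only to a last line that lacks one, and returns all the lines as a single list. It uses nothing beyond the built-in open().

```python
def read_all_lines(filenames):
    """Return the lines of every file, in order, as if they were one file.

    read_all_lines is a hypothetical helper name, not part of randline.py.
    """
    lines = []
    for name in filenames:
        with open(name, 'r') as f:
            file_lines = f.readlines()
        # Only the last line of a file can be missing its trailing newline.
        if file_lines and not file_lines[-1].endswith('\n'):
            file_lines[-1] += '\n'
        lines.extend(file_lines)
    return lines
```

With something like this, main() could pass args (the whole list of file names) instead of args[0], and nothing needs to be written back to disk.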
To implement the without-replacement option, you need some way to check whether you have already used a line. To split the text you've read into lines, just use
input.split('\n')
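One thing to watch for with that call: if the text ends with a newline, split('\n') leaves an empty string as the last item, while splitlines() does not. A small illustration, using a made-up string called text (note that input is also the name of a Python built-in, so a different variable name is safer):

```python
text = "first\nsecond\n"  # sample data, assumed for illustration

# split('\n') produces an empty final item because the text ends in '\n'.
by_split = text.split('\n')

# splitlines() drops the newlines and produces no empty final item.
by_splitlines = text.splitlines()

# splitlines(True) keeps the '\n' on each line, like readlines() does.
with_newlines = text.splitlines(True)
```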
Your randline constructor is getting passed the contents of the files, so what it's trying to do in the constructor doesn't make sense.
Sorry if this answer doesn't seem to make sense; I've gone through and tried to point out the different things you need to fix. I can't comment because I don't have enough reputation, so I have to post an answer. Let me know if this helps!
To do without replacement, I would make a dictionary that holds strings as keys and ints as values. Loop through all the input text, and every time you see a line, add it to the dictionary if it isn't there yet and increment its count. Then, when you call chooseline, check that the dictionary value for the randomly picked line is greater than 0. If it is, decrement that value and return the line. Otherwise, pick a new line.
Reply to Edit 2.0: You can do that. Then, every time you pick a random line, you remove it from the list. I think that would work.
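Here is a minimal sketch of the dictionary approach, assuming the lines have already been read into a list. The class name CountedLines and its attributes are made up for this example; only random.choice comes from the original program.

```python
import random

class CountedLines:
    """Hypothetical helper: a line can be chosen at most as often as it appears."""

    def __init__(self, lines):
        self.lines = list(lines)
        # Map each distinct line to how many times it may still be chosen.
        self.remaining = {}
        for line in self.lines:
            self.remaining[line] = self.remaining.get(line, 0) + 1
        self.left = len(self.lines)

    def chooseline(self):
        if self.left == 0:
            raise IndexError("no lines left to choose")
        # Keep picking until we hit a line that still has uses left.
        while True:
            line = random.choice(self.lines)
            if self.remaining[line] > 0:
                self.remaining[line] -= 1
                self.left -= 1
                return line
```

The retry loop can take many attempts once most lines are used up, so this is simple rather than fast.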
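A sketch of the remove-from-the-list version, again assuming the lines are already in a list. The function name choose_without_replacement is hypothetical; it copies the list so the caller's data is left alone, then pops a random position each time.

```python
import random

def choose_without_replacement(lines, count):
    """Pick count lines; each input position is used at most once."""
    pool = list(lines)  # work on a copy so the original list is untouched
    if count > len(pool):
        raise ValueError("too few lines in input")
    chosen = []
    for _ in range(count):
        index = random.randrange(len(pool))
        # pop() removes the line, so it cannot be picked again.
        chosen.append(pool.pop(index))
    return chosen
```

Since duplicates in the input occupy separate positions, a line that appears twice in the input can still come out twice, which matches what the option is supposed to do. If your assignment allows it, random.sample(lines, count) from the same random module should give the same kind of result in one call.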
Upvotes: 2