Reputation: 73
I need to write a program in Python that compares two parallel lists to grade a multiple choice exam. One list has the exam solution and the second list has a student's answers. The question number for each missed question is to be stored in a third list using the natural index numbers. The solution must use indexing.
I keep getting an empty list returned for the third list. All help much appreciated!
def main():
exam_solution = ['B', 'D', 'A', 'A', 'C', 'A', 'B', 'A', 'C', 'D', 'B', 'C',\
'D', 'A', 'D', 'C', 'C', 'B', 'D', 'A']
student_answers = ['B', 'D', 'B', 'A', 'C', 'A', 'A', 'A', 'C', 'D', 'B', 'C',\
'D', 'B', 'D', 'C', 'C', 'B', 'D', 'A']
questions_missed = []
for item in exam_solution:
if item not in student_answers:
questions_missed.append(item)
Upvotes: 1
Views: 245
Reputation: 12765
One more solution comes to my mind. I put in in a separate answers as it is "special"
Using numpy this task can be accomplished by:
import numpy as np
exam_solution = np.array(exam_solution)
student_answers = np.array(student_answers)
(exam_solution!=student_answers).nonzero()[0]
With numpy-arrays, elementwise comparison is possible via ==
and !=
. .nonzero()
returns the indices of the array elements that are not zero. That's it.
Timing is really interesting now. For your 19-elements lists, performances are (N=19,repetitions=100,000):
list comprehension: 0.904024521544
loop: 0.936516107421
numpy: 0.349371968612
This is already a factor of almost 3. Nice, but not amazing.
But when I increase the size of your lists by a factor of 100, I get (N=19*100=1900, repetitions=1000):
list comprehension: 0.866544042939
loop: 1.04464069977
numpy: 0.0334220694495
Now we have a factor of 26 or 31 - that is definitely a lot.
Probably, performance won't be your problem, but, nevertheless, I thought it's worth pointing out.
Upvotes: 0
Reputation: 12765
questions_missed = [i for i, (ex,st) in enumerate(zip(exam_solution, student_answers)) if ex != st]
or alternatively, if you prefer loops over list comprehensions:
questions_missed = []
for i, (ex,st) in enumerate(zip(exam_solution, student_answers)):
if ex != st:
questions_missed.append(i)
Both give [2,6,13]
Explanation:
enumerate
is a utility function that returns an iterable object which yields tuples of indices and values, it can be used to, loosely speaking, "have the current index available during an iteration".
Zip creates a list of tuples, containing corresponding elements from two or more iterable objects (in your case lists).
I'd prefer the list comprehension version.
If I add some timing code, I see that performance doesn't really differ here:
def list_comprehension_version():
questions_missed = [i for i, (ex,st) in enumerate(zip(exam_solution, student_answers)) if ex != st]
return questions_missed
def loop_version():
questions_missed = []
for i, (ex,st) in enumerate(zip(exam_solution, student_answers)):
if ex != st:
questions_missed.append(i)
return questions_missed
import timeit
print "list comprehension:", timeit.timeit("list_comprehension_version", "from __main__ import exam_solution, student_answers, list_comprehension_version", number=10000000)
print "loop:", timeit.timeit("loop_version", "from __main__ import exam_solution, student_answers, loop_version", number=10000000)
gives:
list comprehension: 0.895029446804
loop: 0.877159359719
Upvotes: 5
Reputation: 3923
A solution based on iterators
questions_missed = list(index for (index, _)
in filter(
lambda (_, (answer, solution)): answer != solution,
enumerate(zip(student_answers, exam_solution))))
For the purists, note that you should import the equivalents of zip
and filter
(izip
and ifilter
) from itertools
.
Upvotes: 0