Reputation: 129
I have a Python script which I'm trying to use to print the duplicate numbers in the file Duplicate.txt:
newList = set()
datafile = open("Duplicate.txt", "r")
for i in datafile:
    if datafile.count(i) >= 2:
        newList.add(i)
datafile.close()
print(list(newList))
I'm getting the following error. Could anyone help, please?
AttributeError: '_io.TextIOWrapper' object has no attribute 'count'
Upvotes: 1
Views: 1899
Reputation: 1025
You are looking to use the list.count() method, but you've mistakenly called it on a file object. Instead, let's read the file, split its contents into a list, and then obtain the count of each item using the list.count() method.
# read the data from the file
with open("Duplicate.txt", "r") as datafile:
    datafile_data = datafile.read()
# split the file contents by whitespace and convert to list
datafile_data = datafile_data.split()
# build a dictionary mapping words to their counts
word_to_count = {}
unique_data = set(datafile_data)
for data in unique_data:
    word_to_count[data] = datafile_data.count(data)
# populate our list of duplicates (anything that occurs at least twice)
all_duplicates = []
for x in word_to_count:
    if word_to_count[x] >= 2:
        all_duplicates.append(x)
print(all_duplicates)
Upvotes: 0
Reputation: 140276
The error in your code comes from calling count on a file handle rather than on a list.
Anyway, you don't need to count the elements; you just need to check whether each element has already been seen in the file.
I'd suggest a marker set to note down which elements have already occurred.
seen = set()
result = set()
with open("Duplicate.txt", "r") as datafile:
    for i in datafile:
        # you may turn i to a number here with: i = int(i)
        if i in seen:
            result.add(i)  # data is already in seen: duplicate
        else:
            seen.add(i)  # next time it occurs, we'll detect it
print(list(result))  # convert to list (maybe not needed, set is ok to print)
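One caveat when comparing raw lines: each line read from a file keeps its trailing newline, so a final line that lacks one would not match an earlier identical value. Here is a minimal sketch of the same seen-set idea with that handled, assuming the file holds one integer per line. The function name find_duplicates and the sample data are made up for illustration, and io.StringIO stands in for the open file.

```python
import io

def find_duplicates(lines):
    """Return the set of integers that occur more than once in `lines`."""
    seen = set()
    result = set()
    for line in lines:
        text = line.strip()      # drop the trailing newline and any spaces
        if not text:
            continue             # skip blank lines
        number = int(text)       # assumption: every non-blank line is an integer
        if number in seen:
            result.add(number)   # already seen once: it is a duplicate
        else:
            seen.add(number)     # remember it for later lines
    return result

# Hypothetical sample data; note that the last line has no "\n"
sample = io.StringIO("3\n7\n3\n9\n7\n5\n5")
duplicates = find_duplicates(sample)
print(sorted(duplicates))
```

Passing an open file object instead of the StringIO should work the same way, since both yield one line per iteration.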
Upvotes: 1
Reputation: 365975
The problem is exactly what it says: file objects don't know how to count anything. They're just iterators, not lists or strings or anything like that.
And part of the reason for that is that it would potentially be very slow to scan the whole file over and over like that.
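To see the iterator behaviour for yourself, here is a small sketch. It writes a throwaway temporary file (the contents are made up) so that it is self-contained, then checks for a count attribute and tries to read the lines twice.

```python
import os
import tempfile

# Write a small throwaway file so the example does not depend on Duplicate.txt
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("1\n2\n1\n")
    path = tmp.name

with open(path, "r") as datafile:
    has_count = hasattr(datafile, "count")  # file objects have no count method
    first_pass = list(datafile)             # consumes the iterator
    second_pass = list(datafile)            # nothing is left to read

os.remove(path)
print(has_count, first_pass, second_pass)
```

The second pass comes back empty because the file position is already at the end; you would have to seek back to the start or reopen the file to scan it again.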
If you really need to use count, you can put the lines into a list first. Lists are entirely in-memory, so it's not nearly as slow to scan them over and over, and they have a count method that does exactly what you're trying to do with it:
newList = set()
with open("Duplicate.txt", "r") as datafile:
    lines = list(datafile)
for i in lines:
    if lines.count(i) >= 2:
        newList.add(i)
However, there's a much better solution: Just keep counts as you go along, and then keep the ones that are >= 2. In fact, you can write that in two lines:
counts = collections.Counter(datafile)
newList = {line for line, count in counts.items() if count >= 2}
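Those two lines assume that collections has been imported and that datafile is an open file. A self-contained sketch of the same idea might look like the following; the function name duplicate_lines and the sample data are made up, io.StringIO stands in for the file, and the trailing newline is stripped from each line so the output is easier to read.

```python
import collections
import io

def duplicate_lines(lines):
    """Return the set of lines (without trailing newline) seen at least twice."""
    counts = collections.Counter(line.rstrip("\n") for line in lines)
    return {line for line, count in counts.items() if count >= 2}

# Hypothetical sample data standing in for the contents of the file
sample = io.StringIO("a\nb\na\nc\nb\na\n")
dups = duplicate_lines(sample)
print(sorted(dups))
```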
But if it isn't clear to you why that works, you may want to do it more explicitly:
counts = collections.Counter()
for i in datafile:
    counts[i] += 1
newList = set()
for line, count in counts.items():
    if count >= 2:
        newList.add(line)
Or, if you don't even understand the basics of Counter:
counts = {}
for i in datafile:
    if i not in counts:
        counts[i] = 1
    else:
        counts[i] += 1
Upvotes: 4
Reputation: 7753
Your immediate error occurs because you're calling datafile.count(i), and datafile is a file, which doesn't know how to count its contents.
Your question is not about how to solve the larger problem, but since I'm here:
Assuming Duplicate.txt contains numbers, one per line, I would probably read each line's contents into a list and then use a Counter to count the list's contents.
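A rough sketch of that approach, assuming every non-blank line is a valid integer. The function name duplicated_numbers and the sample lines are hypothetical; with a real file you would pass the open file object in place of the list.

```python
from collections import Counter

def duplicated_numbers(lines):
    """Return a sorted list of the integers that appear at least twice."""
    # int() ignores surrounding whitespace, so the trailing "\n" is harmless
    numbers = [int(line) for line in lines if line.strip()]
    counts = Counter(numbers)
    return sorted(n for n, c in counts.items() if c >= 2)

# Hypothetical sample lines, as they would come from iterating over the file
sample_lines = ["10\n", "20\n", "10\n", "30\n", "20\n", "10\n"]
repeated = duplicated_numbers(sample_lines)
print(repeated)
```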
Upvotes: 0