Reputation: 912
I am trying to list the colours in an image in separate cells of an Excel sheet, along with each colour's count and percentage.
I managed to transfer the data to an Excel sheet, but it all ended up combined in one cell. I searched for how to fix that, but now I am getting:
TypeError: unhashable type: 'slice'
Here is what I have tried:
import pandas as pd
from PIL import Image
from collections import Counter
import prettytable
img = Image.open("Original 2.JPG")
size = w, h = img.size
data = img.load()
colors = []
for x in range(w):
    for y in range(h):
        color = data[x, y]
        hex_color = '#'+''.join([hex(c)[2:].rjust(2, '0') for c in color])
        colors.append(hex_color)
#pt = prettytable.PrettyTable(['Color', 'Count', 'Percentage'])
total = w * h
for color, count in Counter(colors).items():
    percent = int(count/total * 100)
    if percent > 0:
        # pt.add_row([color, count, percent])
        # print(pt, total)
        final = {'colors': [colors],
                 'count': [count],
                 'percent': [percent]
                 }
        df = pd.DataFrame()
        df['colors'] = final[0::3]  # <-------------- Error returning from here
        df['count'] = final[1::3]
        df['percent'] = final[2::3]
        df.to_excel(r'C:\Users\Ahmed\Desktop\Project\export_dataframe.xlsx',
                    index=False, header=True)
Upvotes: 1
Views: 74
Reputation: 4860
The code within your for loop really doesn't make sense.
This if statement is always true, which makes it redundant:
if percent > 0:
Edit: Ignore this. The call to int can return 0, which would cause this if statement to be false.
Everything below that, including writing to an excel file, is executed for every colour. Presumably this is an indentation error.
df['colors'] = final[0::3] <--------------Error returning from here
final is a dict. You need to access it using one of its 3 keys, for example final['colors'], which would return the entire list of pixel colours, including duplicates.
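To make the difference concrete, here is a minimal sketch using a made-up stand-in for final (the values are hypothetical, not taken from the question's image). Slicing a dict passes a slice object as the key. On the Python version used in the question that raises the TypeError shown above; as far as I know, on Python 3.12 and later slice objects are hashable, so the same line raises KeyError instead. Either way, it never returns a column.

```python
# Hypothetical stand-in for the question's `final` dict (values are made up).
final = {
    'colors': ['#ffffff', '#000000'],
    'count': [3, 1],
    'percent': [75, 25],
}

# A dict looks up keys; it has no notion of positions, so a slice is
# treated as a key. Older Pythons cannot hash it (TypeError); newer
# ones can hash it but will not find it (KeyError).
try:
    final[0::3]
    slice_error = None
except (TypeError, KeyError) as exc:
    slice_error = exc

# Access by key works and returns the whole list stored under that key.
colors_column = final['colors']
count_column = final['count']

print(type(slice_error).__name__)
print(colors_column, count_column)
```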
What you want can be achieved with this code:
import pandas as pd
from PIL import Image
from collections import Counter
import prettytable
img = Image.open("Original 2.JPG")
size = w, h = img.size
data = img.load()
colors = []
for x in range(w):
    for y in range(h):
        color = data[x, y]
        hex_color = '#'+''.join([hex(c)[2:].rjust(2, '0') for c in color])
        colors.append(hex_color)
#pt = prettytable.PrettyTable(['Color', 'Count', 'Percentage'])
total = w * h
colors, counts = zip(*Counter(colors).items())
percentages = tuple(count / total for count in counts)
df = pd.DataFrame()
df['colors'] = colors
df['count'] = counts
df['percent'] = percentages
df.to_excel(r'C:\Users\Ahmed\Desktop\Project\export_dataframe.xlsx',
            index=False, header=True)
The 2 key lines are:
colors, counts = zip(*Counter(colors).items())
percentages = tuple(count / total for count in counts)
The first line creates 2 tuples with all the unique colours and their counts. A tuple is basically an immutable list. zip combined with the * unpacking operator is used to transform the key and value pairs from Counter(colors).items() into their own separate tuples.
The second line creates a tuple from a generator expression which gives us each colour's share of the total pixel count. Note that count / total is a fraction between 0 and 1; multiply it by 100 if you want an actual percentage.
colors, counts, and percentages are all aligned, so the same index refers to the same colour.
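Here is a small sketch of that zip(*...) step on a made-up list of pixel colours, so you can see the intermediate values without loading an image. The name pixels and its contents are assumptions for illustration only. I multiply by 100 here so the last tuple holds real percentages.

```python
from collections import Counter

# Made-up pixel colours standing in for the list built from the image.
pixels = ['#ffffff', '#000000', '#ffffff', '#ffffff']

# items() yields (colour, count) pairs in first-seen order.
pairs = list(Counter(pixels).items())

# zip(*pairs) regroups the pairs: all first elements, then all second elements.
colors, counts = zip(*pairs)

total = len(pixels)
percentages = tuple(count / total * 100 for count in counts)

print(pairs)
print(colors, counts, percentages)
```

One caveat with this pattern: if the input list were empty, zip would produce nothing and the two-name unpacking would fail, so guard against that if an empty input is possible.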
Upvotes: 1
Reputation: 686
I decided to use lists rather than a dictionary, since I did not see any particular advantage to the dictionary here. I also removed int() from:
percent = int(count/total * 100)
and dropped the check:
if percent > 0:
because in an image with many hues, most colours account for less than 1% of the pixels, so the truncated value is 0 and the condition would rarely, if ever, be met.
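A quick sketch of why the int() call matters, using hypothetical numbers (an image of 1000 pixels in which one colour appears 7 times). int() truncates toward zero, so any share below 1% becomes 0 and fails the percent > 0 check. If you want tidier numbers in the sheet, round() keeps the fractional part.

```python
total = 1000  # hypothetical pixel count
count = 7     # hypothetical number of pixels with one particular colour

truncated = int(count / total * 100)      # fractional part is discarded
exact = count / total * 100               # full floating-point value
rounded = round(count / total * 100, 2)   # two decimals, for display

passes_filter = truncated > 0             # the question's original condition

print(truncated, rounded, passes_filter)
```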
The complete code is as follows:
import pandas as pd
from PIL import Image
from collections import Counter
img = Image.open("Original 2.JPG")
size = w, h = img.size
data = img.load()
colors = []
for x in range(w):
    for y in range(h):
        color = data[x, y]
        hex_color = '#'+''.join([hex(c)[2:].rjust(2, '0') for c in color])
        colors.append(hex_color)
total = w * h
color_hex = []
color_count = []
color_percent = []
df = pd.DataFrame()
for color, count in Counter(colors).items():
    percent = count/total * 100  # Do not make it int. Majority of colors are < 1%, unless you want >= 1%
    color_hex.append(color)
    color_count.append(count)
    color_percent.append(percent)
df['color'] = color_hex
df['count'] = color_count
df['percent'] = color_percent
df.to_excel(r'C:\Users\Ahmed\Desktop\Project\export_dataframe.xlsx',
            index=False, header=True)
Upvotes: 2