Reputation: 69
Is there any way, in Python, to automatically detect the colors in a certain area of a PDF and either translate them to RGB or compare them to the legend to identify the color?
Upvotes: 5
Views: 8932
Reputation: 1974
Felipe's approach didn't work for me, but I came up with this:
#!/usr/bin/env python
# -*- Encoding: UTF-8 -*-
import minecart

colors = set()

with open("file.pdf", "rb") as file:
    document = minecart.Document(file)
    page = document.get_page(0)
    for shape in page.shapes:
        # Only filled shapes carry a fill color
        if shape.fill:
            colors.add(shape.fill.color.as_rgb())

for color in colors:
    print(color)
This will print a neat list of all unique RGB values on the first page of your document (you could extend it to all pages, of course).
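To cover the "compare them to the legend" part of the question, here is a minimal sketch that maps an extracted RGB tuple to the closest legend entry by Euclidean distance. The `LEGEND` names and values and the `nearest_legend_entry` helper are hypothetical, not part of minecart, and I'm assuming the RGB components are on a 0 to 1 scale; rescale them first if yours are not.

```python
import math

# Hypothetical legend: these names and RGB values are made up for illustration.
# Replace them with the colors actually used in your document's legend.
LEGEND = {
    "red": (1.0, 0.0, 0.0),
    "green": (0.0, 1.0, 0.0),
    "blue": (0.0, 0.0, 1.0),
}


def nearest_legend_entry(rgb, legend=LEGEND):
    """Return the legend name whose color is closest to rgb (Euclidean distance)."""
    def distance(name):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(rgb, legend[name])))
    return min(legend, key=distance)


print(nearest_legend_entry((0.9, 0.1, 0.05)))
```

You would call `nearest_legend_entry(color)` for each entry in the `colors` set. Plain Euclidean distance in RGB is a rough measure, so it may misjudge colors that look similar to the eye but sit far apart numerically.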
Upvotes: 4
Reputation: 3149
Depending on where you want to extract the information from, you can use minecart. It has robust support for colors and allows easy conversion to RGB. You can't input a coordinate and get the color value there, but if you are trying to get color information from a shape, you could do something like the following:
import minecart

doc = minecart.Document(open("my-doc.pdf", "rb"))
page = doc.get_page(0)

# Bounding box in PDF points (72 points per inch)
BOX = (.5 * 72,  # left bounding box edge
       9 * 72,   # bottom bounding box edge
       1 * 72,   # right bounding box edge
       10 * 72)  # top bounding box edge

for shape in page.shapes:
    # Skip unfilled shapes, which have no fill color to read
    if shape.fill and shape.check_in_bbox(BOX):
        r, g, b = shape.fill.color.as_rgb()
        # do stuff with r, g, b
[Disclaimer: I'm the author of minecart]
Upvotes: 2