Reputation: 11
I am running DLP inspection jobs on Google Cloud Storage, and I was wondering whether there is a way to get the full inspection results instead of just the summary, the same way as when inspecting external files. Here is a code snippet showing how I get my inspection results when scanning external and local files:
# Print out the results.
results = []
if response.result.findings:
    for finding in response.result.findings:
        finding_dict = {
            "quote": finding.quote if "quote" in finding else None,
            "info_type": finding.info_type.name,
            "likelihood": finding.likelihood.name,
            "location_start": finding.location.byte_range.start,
            "location_end": finding.location.byte_range.end
        }
        results.append(finding_dict)
else:
    print("No findings.")
The output looks like this:
{
    "quote": "gitlab.com",
    "info_type": "DOMAIN_NAME",
    "likelihood": "LIKELY",
    "location_start": 3015,
    "location_end": 3025
},
{
    "quote": "www.makeareadme.com",
    "info_type": "DOMAIN_NAME",
    "likelihood": "LIKELY",
    "location_start": 3107,
    "location_end": 3126
}
But when I scan Google Cloud Storage items with a DLP job and fetch the results through get_dlp_job in a Pub/Sub callback, like this:
def callback(message):
    try:
        if message.attributes["DlpJobName"] == operation.name:
            # This is the message we're looking for, so acknowledge it.
            message.ack()
            # Now that the job is done, fetch the results and print them.
            job = dlp_client.get_dlp_job(request={"name": operation.name})
            if job.inspect_details.result.info_type_stats:
                for finding in job.inspect_details.result.info_type_stats:
                    print(
                        "Info type: {}; Count: {}".format(
                            finding.info_type.name, finding.count
                        )
                    )
            else:
                print("No findings.")
            # Signal to the main thread that we can exit.
            job_done.set()
        else:
            # This is not the message we're looking for.
            message.drop()
    except Exception as e:
        # Because this is executing in a thread, an exception won't be
        # noted unless we print it manually.
        print(e)
        raise
The results are in this summary format:
Info type: LOCATION; Count: 18
Info type: DATE; Count: 12
Info type: LAST_NAME; Count: 4
Info type: DOMAIN_NAME; Count: 170
Info type: URL; Count: 20
Info type: FIRST_NAME; Count: 7
Is there a way to get the detailed inspection results when scanning files on Google Cloud Storage, so that I get the quote, info_type, likelihood, etc. for each finding rather than a summary? I have tried a couple of methods and read through most of the docs, but I am not finding anything that helps. I am running the inspection job in a Windows environment with the DLP Python client API. I would appreciate anyone's help with this ;)
Upvotes: 1
Views: 660
Reputation: 126
Yes, you can do this. Because the detailed inspection results can be sensitive, they are not kept in the job details/summary, but you can configure a job "action" that writes the detailed results to a BigQuery table that you own and control. That gives you access to the details of every finding (file or table path, column name, byte offset, optional quote, etc.).
The API details for that are here: https://cloud.google.com/dlp/docs/reference/rest/v2/Action#SaveFindings
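As a rough sketch of what that could look like with the Python client: the job config below adds a save_findings action pointing at a BigQuery table, alongside the optional pub_sub action the question already relies on. The helper names (build_inspect_job, submit_job) and all project, dataset, table, bucket and topic values are placeholders I made up for illustration; the nested field names follow the Action#SaveFindings reference linked above.

```python
def build_inspect_job(bucket_url, project_id, dataset_id, table_id,
                      info_types, topic_id=None):
    """Build an inspect_job dict that saves detailed findings to BigQuery.

    All argument values are caller-supplied placeholders (assumptions).
    """
    actions = [
        {
            # Write every individual finding to a BigQuery table you control.
            "save_findings": {
                "output_config": {
                    "table": {
                        "project_id": project_id,
                        "dataset_id": dataset_id,
                        "table_id": table_id,
                    }
                }
            }
        }
    ]
    if topic_id:
        # Optional: keep the Pub/Sub notification used to detect completion.
        actions.append(
            {"pub_sub": {"topic": f"projects/{project_id}/topics/{topic_id}"}}
        )
    return {
        "inspect_config": {
            "info_types": [{"name": name} for name in info_types],
            # Needed if you want the matched text stored with each finding.
            "include_quote": True,
        },
        "storage_config": {
            "cloud_storage_options": {"file_set": {"url": bucket_url}}
        },
        "actions": actions,
    }


def submit_job(project_id, inspect_job):
    """Submit the job; requires google-cloud-dlp and valid credentials."""
    import google.cloud.dlp_v2  # imported lazily so the builder works without it

    client = google.cloud.dlp_v2.DlpServiceClient()
    parent = f"projects/{project_id}/locations/global"
    return client.create_dlp_job(
        request={"parent": parent, "inspect_job": inspect_job}
    )
```

Once the job finishes, the summary from get_dlp_job stays the same, but the per-finding rows should be in the BigQuery table you named.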
Below are some more docs on how to query the detailed findings:
Also more details on DLP Job Actions: https://cloud.google.com/dlp/docs/concepts-actions
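To get back the same fields as in the question (quote, info_type, likelihood, byte offsets), a query along these lines might work. This is only a sketch: it assumes the findings table mirrors the Finding message, so the column paths (info_type.name, location.byte_range.start, and so on) should be checked against the schema of your own table. The function names and the default limit are placeholders of mine.

```python
def build_findings_query(project_id, dataset_id, table_id, limit=100):
    """Return a SQL string selecting per-finding details.

    Column paths are an assumption based on the Finding message layout.
    """
    table = f"`{project_id}.{dataset_id}.{table_id}`"
    return (
        "SELECT quote,\n"
        "       info_type.name AS info_type,\n"
        "       likelihood,\n"
        "       location.byte_range.start AS location_start,\n"
        # `end` is backtick-quoted because END is a reserved keyword.
        "       location.byte_range.`end` AS location_end\n"
        f"FROM {table}\n"
        f"LIMIT {int(limit)}"
    )


def fetch_findings(project_id, dataset_id, table_id, limit=100):
    """Run the query; requires google-cloud-bigquery and valid credentials."""
    from google.cloud import bigquery  # imported lazily

    client = bigquery.Client(project=project_id)
    sql = build_findings_query(project_id, dataset_id, table_id, limit)
    # Convert each result row into a plain dict, like finding_dict above.
    return [dict(row.items()) for row in client.query(sql).result()]
```

Each returned dict then has the same keys as the finding_dict built for local files in the question.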
Upvotes: 2