Paul Adamson

Reputation: 2021

How to detect duplicate artifacts in Artifactory

I know that Artifactory uses checksum-based storage and keeps only one copy of an artifact, even if I upload multiple identical files under different names.

I have many projects with unversioned but probably identical jars, so I would like to know whether there is any way of getting Artifactory to tell me which artifacts are referenced under multiple IDs.

Upvotes: 3

Views: 2870

Answers (3)

Robert Bratton

Reputation: 435

Here's SQL to run against Artifactory's PostgreSQL database. I haven't tried it with any other database.

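-- List every node whose SHA-1 is shared with at least one other node
select sha1_actual, node_name, node_path, repo, *
from nodes
where sha1_actual in
(
    -- checksums that occur more than once among non-folder nodes
    -- (assumption: node_type 0 marks a folder)
    select sha1_actual
    from nodes
    where node_type != 0
    group by sha1_actual
    having count(1) > 1
)
order by sha1_actual;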

Upvotes: 1

Angelos Karageorgiou

Reputation: 27

#!/bin/bash

#
# Search Artifactory by name and count how often each file name occurs
#

term=$1

if [ -z "$term" ]
then
    echo "Usage: $0 <search item>"
    exit 1
fi

# URL-encode spaces and wrap the term in wildcards for the search API
query="*${term// /%20}*"

USER=$(whoami)
PASS=${PASS:-somepass}
CREDS="${USER}:${PASS}"
ARTIFACTORY=https://artifactory.somesite.com/artifactory

curl -s -u "${CREDS}" -o search.txt "${ARTIFACTORY}/api/search/artifact?name=${query}"

echo "List of all uris is in search.txt"
echo "All instances of $term follow"
echo "---------------------------------"
# Count each base name (POMs excluded), most frequent first
grep -F -- "$term" search.txt | grep -v pom | awk '{ print $3 }' | xargs -I {} basename {} | sort | uniq -c | sort -rn

Upvotes: 0

noamt

Reputation: 7825

Artifactory has no built-in feature that provides this information, but it is quite easy to get with a small script that uses Artifactory's REST API.

You can, for example, write a tree walker (using the Folder Info resource) that maps checksums to files (a file's checksum can be obtained from the File Info resource).
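
The tree-walker idea might look roughly like this in Python. It is only a sketch under a few assumptions: that Folder Info and File Info are both served from `/api/storage/<repo>/<path>`, that folder responses carry a `children` list while file responses carry `checksums.sha1`, and that the server allows anonymous reads. The base URL and the names `http_get_json`, `walk`, and `find_duplicates` are placeholders made up for illustration.

```python
import json
import urllib.request
from collections import defaultdict

# Assumed base URL; replace with your own server.
ARTIFACTORY = "https://artifactory.somesite.com/artifactory"


def http_get_json(url):
    """Fetch a URL and decode its JSON body (anonymous read access assumed)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def walk(repo, path="", get_json=http_get_json):
    """Yield (sha1, "repo/path") for every file below path in repo."""
    info = get_json(f"{ARTIFACTORY}/api/storage/{repo}{path}")
    if "children" not in info:
        # Treated as a File Info response: no children, checksums present.
        yield info["checksums"]["sha1"], f"{repo}{path}"
        return
    # Folder Info response: recurse into each child ("uri" starts with "/").
    for child in info["children"]:
        yield from walk(repo, path + child["uri"], get_json)


def find_duplicates(repos, get_json=http_get_json):
    """Map each SHA-1 found under more than one path to all of those paths."""
    by_sha1 = defaultdict(list)
    for repo in repos:
        for sha1, location in walk(repo, "", get_json):
            by_sha1[sha1].append(location)
    return {sha1: paths for sha1, paths in by_sha1.items() if len(paths) > 1}
```

Calling `find_duplicates` with a list of repository keys returns a dictionary from checksum to the paths that share it. The walker issues one request per folder and per file, so it can be slow on large repositories.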

Or, if you use the Pro version of Artifactory, you can retrieve a list of all artifacts within a repository using the File List resource.
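
For the File List route, a rough Python sketch: one request per repository returns every file, and the grouping by checksum happens on the client. The query string `?list&deep=1&listFolders=0` and the `files`, `uri`, and `sha1` field names are assumptions about the File List response that should be checked against your Artifactory version; `fetch_file_list` and `duplicates_from_file_list` are names invented here.

```python
import json
import urllib.request
from collections import defaultdict


def duplicates_from_file_list(listing):
    """Group a decoded File List response by SHA-1; keep checksums seen more than once."""
    by_sha1 = defaultdict(list)
    for entry in listing.get("files", []):
        sha1 = entry.get("sha1")
        if sha1:  # entries without a checksum (e.g. folders) are skipped
            by_sha1[sha1].append(entry["uri"])
    return {sha1: uris for sha1, uris in by_sha1.items() if len(uris) > 1}


def fetch_file_list(base_url, repo):
    """Request the deep file listing of one repository (anonymous read access assumed)."""
    url = f"{base_url}/api/storage/{repo}?list&deep=1&listFolders=0"
    with urllib.request.urlopen(url) as response:
        return json.load(response)
```

Passing the result of `fetch_file_list` to `duplicates_from_file_list` gives the duplicate groups for a single repository; to find duplicates across repositories, the listings would need to be merged before grouping.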

Upvotes: 4
