stachyra

Reputation: 4603

How can I use Docker Registry HTTP API V2 to obtain a list of all repositories in Docker Hub?

An external organization that I work with has given me access to a private (auth token protected) docker registry, and eventually I would like to be able to query this registry, using docker's HTTP API V2, in order to obtain a list of all the repositories and/or images available in the registry.

But before I do that, I'd first like to get some basic practice with constructing these types of API queries on a public registry such as Docker Hub. So I've gone ahead and registered myself with a username and password on Docker Hub, and also consulted the API V2 documentation, which states that one may request an API version check as:

GET /v2/

or request a list of repositories as:

GET /v2/_catalog

Using curl, together with the username and password that I used in order to register my Docker Hub account, I attempt to construct a GET request at the command line:

stachyra> curl -u stachyra:<my_password> -X GET https://index.docker.io/v2/
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":null}]}
stachyra> curl -u stachyra:<my_password> -X GET https://index.docker.io/v2/_catalog
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":[{"Type":"registry","Class":"","Name":"catalog","Action":"*"}]}]}

where of course, in place of <my_password>, I substituted my actual account password.

The response that I had been expecting from this query was a giant json message, listing thousands of repository names, but instead it appears that the API is rejecting my Docker Hub credentials.

Question 1: Do I even have the correct URL (index.docker.io) for the docker hub registry? (I made this assumption in the first place based upon the status information returned by the command line tool docker info, so I have good reason to think it's correct.)

Question 2: Assuming I have the correct URL for the registry service itself, why does my query return an "UNAUTHORIZED" error code? My account credentials work just fine when I attempt to login via the web at hub.docker.com, so what's the difference between the two cases?

Upvotes: 24

Views: 40633

Answers (6)

Daniel

Reputation: 3

Refer to the Docker Hub API reference.
You can create a token by posting your username and password to /v2/users/login, or use a personal access token (PAT) if you already have one.
Then access the repository URL with that token.

For example: curl --header "Authorization: Bearer {token}" https://hub.docker.com/v2/repositories/library

Change the "page" and "page_size" parameters to get more results, or follow the "next" key in the returned JSON.
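The paging described above can be sketched in Python roughly like this. It is a minimal sketch, not an official client: `fetch_json` and `iter_repositories` are names I made up, and I am assuming each page is a JSON object with a "results" list and a "next" URL that is null on the last page.

```python
import json
import urllib.request


def fetch_json(url, token=None):
    """GET a URL and decode the JSON body; the token is optional for public data."""
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_repositories(namespace, token=None, page_size=100, fetch=fetch_json):
    """Yield every repository record in a namespace, following the "next" links."""
    url = f"https://hub.docker.com/v2/repositories/{namespace}/?page_size={page_size}"
    while url:
        page = fetch(url, token)
        yield from page.get("results", [])
        url = page.get("next")  # assumed to be None (JSON null) on the last page
```

Something like `for repo in iter_repositories("library"): print(repo["name"])` would then walk all pages. The `fetch` parameter is only there so the loop can be tried out against canned pages without touching the network.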

Upvotes: 0

beeeliu

Reputation: 109

Here's Python code to do the same thing. It can access both your organization's and your own private repos.

Side note: I have some other code that can access manifests, but only for private/public user repos, not organization-level repos. Does anyone know why that is?

import requests

docker_username = ""
docker_password = ""
docker_organization = ""


auth_url = "https://hub.docker.com/v2/users/login/"
auth_data = {
    "username": docker_username,
    "password": docker_password
}
auth_response = requests.post(auth_url, json=auth_data)
auth_response.raise_for_status()
docker_hub_token = auth_response.json()["token"]

repositories_list = f"https://hub.docker.com/v2/repositories/{docker_username}/?page_size=100"
# repositories_list = f"https://hub.docker.com/v2/repositories/{docker_organization}/?page_size=100"
repos_headers = {
    "Authorization": f"JWT {docker_hub_token}"
}
repos_response = requests.get(repositories_list, headers=repos_headers)
repos_response.raise_for_status()  # fail on an HTTP error instead of a KeyError below
repository_list = repos_response.json()["results"]
for repo in repository_list:
    namespace = repo["namespace"]
    repo_name = repo["name"]
    combined_name = f"{namespace}/{repo_name}"
    print(combined_name)

Upvotes: 0

starfry

Reputation: 9983

Here is an example program to read repositories from a registry. I used it as a learning aid with Docker Hub.

#!/bin/bash

set -e

# set username and password
UNAME="username"
UPASS="password"

# get token to be able to talk to Docker Hub
TOKEN=$(curl -s -H "Content-Type: application/json" -X POST -d '{"username": "'${UNAME}'", "password": "'${UPASS}'"}' https://hub.docker.com/v2/users/login/ | jq -r .token)

# get list of repos for that user account
REPO_LIST=$(curl -s -H "Authorization: JWT ${TOKEN}" \
  "https://hub.docker.com/v2/repositories/${UNAME}/?page_size=10000" | jq -r '.results|.[]|.name')

# build a list of all images & tags
for i in ${REPO_LIST}
do
  # get tags for repo
  IMAGE_TAGS=$(curl -s -H "Authorization: JWT ${TOKEN}" \
    "https://hub.docker.com/v2/repositories/${UNAME}/${i}/tags/?page_size=10000" | jq -r '.results|.[]|.name')

  # build a list of images from tags
  for j in ${IMAGE_TAGS}
  do
    # add each tag to list
    FULL_IMAGE_LIST="${FULL_IMAGE_LIST} ${UNAME}/${i}:${j}"
  done
done

# output list of all docker images
for i in ${FULL_IMAGE_LIST}
do
  echo ${i}
done

(This comes from an article on the Docker site that describes how to use the API.)

In essence...

  • get a token
  • pass the token as a header Authorization: JWT <token> with any API calls you make
  • the API call you want to use to list repositories is https://hub.docker.com/v2/repositories/<username>/

Upvotes: 11

Omer Sen

Reputation: 59

I have modified https://stackoverflow.com/a/60549026/7281491 so that I can list the Docker Hub images of any other user or organization:

#!/bin/bash

set -e

# User to search for
UNAME=${1}


# Put your own Docker Hub token here.
# You can use the pass command or the 1Password CLI to store the PAT.
TOKEN=dckr_pat_XXXXXXXXXXXXXXXXXXXXXXXx


# get list of namespaces accessible by user (not in use right now)
#NAMESPACES=$(curl -s -H "Authorization: JWT ${TOKEN}" https://hub.docker.com/v2/repositories/namespaces/ | jq -r '.namespaces|.[]')

# get list of repos for that user account
REPO_LIST=$(curl -s -H "Authorization: JWT ${TOKEN}" https://hub.docker.com/v2/repositories/${UNAME}/?page_size=10000 | jq -r '.results|.[]|.name')

# build a list of all images & tags
for i in ${REPO_LIST}
do
  # get tags for repo
  IMAGE_TAGS=$(curl -s -H "Authorization: JWT ${TOKEN}" https://hub.docker.com/v2/repositories/${UNAME}/${i}/tags/?page_size=10000 | jq -r '.results|.[]|.name')

  # build a list of images from tags
  for j in ${IMAGE_TAGS}
  do
    # add each tag to list
    FULL_IMAGE_LIST="${FULL_IMAGE_LIST} ${UNAME}/${i}:${j}"
  done
done

# output list of all docker images
for i in ${FULL_IMAGE_LIST}
do
  echo ${i}
done

Sample output:

gitlab/gitlab-ce:latest
gitlab/gitlab-ce:nightly
gitlab/gitlab-ce:15.5.9-ce.0
gitlab/gitlab-ce:15.6.6-ce.0
gitlab/gitlab-ce:rc
gitlab/gitlab-ce:15.7.5-ce.0
gitlab/gitlab-ce:15.7.3-ce.0
gitlab/gitlab-ce:15.5.7-ce.0
gitlab/gitlab-ce:15.6.4-ce.0
gitlab/gitlab-ce:15.7.2-ce.0
gitlab/gitlab-ce:15.7.1-ce.0
gitlab/gitlab-ce:15.7.0-ce.0
gitlab/gitlab-ce:15.6.3-ce.0
gitlab/gitlab-ce:15.5.6-ce.0
gitlab/gitlab-ce:15.6.2-ce.0
gitlab/gitlab-ce:15.4.6-ce.0
gitlab/gitlab-ce:15.5.5-ce.0
.....

Upvotes: 0

Kanak Singhal

Reputation: 3312

Do I even have the correct URL

  • "Docker" is a protocol; "DockerHub" is a product that implements the Docker protocol but is not limited to it. The Docker APIs are also implemented by other providers, such as:
    • GitLab (registry.gitlab.com)
    • GitHub CR (ghcr.io)
    • GCP GCR (gcr.io)
    • AWS ECR (public.ecr.aws & <account_id>.dkr.ecr.<region>.amazonaws.com)
    • Azure ACR (<registry_name>.azurecr.io)
  • index.docker.io hosts DockerHub's implementation of the Docker protocol.
  • hub.docker.com hosts the richer, DockerHub-specific APIs.
  • NOTE: DockerHub implements the generic Docker HTTP API V2, but it doesn't implement the _catalog API from the generic API set.

why does my query return an "UNAUTHORIZED" error code?

To use the Docker V2 API, a JWT auth token needs to be generated from https://auth.docker.io/token for each call, and that token has to be sent as a Bearer token in the DockerHub calls at index.docker.io.

When we hit a DockerHub API such as https://index.docker.io/v2/library/alpine/tags/list without a token, it returns 401 along with information about the missing pre-flight auth call. Look for the www-authenticate header in the failed response.

e.g.: www-authenticate: Bearer realm="https://auth.docker.io/token",service="registry.docker.io",scope="repository:library/alpine:pull",error="invalid_token"

This means we need to explicitly call the following API to obtain the auth token:

https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/alpine:pull
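To make the step from the www-authenticate header to the token URL concrete, here is a small Python sketch. The helper names `parse_bearer_challenge` and `token_url` are mine, and the parsing assumes the simple key="value" layout shown in the example header above; a real challenge header could be more varied.

```python
import re
from urllib.parse import urlencode


def parse_bearer_challenge(header):
    """Split a 'Bearer key="value",...' challenge into a dict of its parameters."""
    scheme, _, params = header.partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError(f"unsupported auth scheme: {scheme!r}")
    return dict(re.findall(r'(\w+)="([^"]*)"', params))


def token_url(challenge):
    """Build the token request URL from the realm, service and scope parameters."""
    query = urlencode(
        {"service": challenge["service"], "scope": challenge["scope"]},
        safe=":/",  # keep the scope readable, as in the URL above
    )
    return f"{challenge['realm']}?{query}"
```

Feeding it the example header from the failed request should reproduce the token URL shown above.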

The https://auth.docker.io/token endpoint works without any auth for public repos. To access a private repo, we need to add HTTP basic auth to the request:

https://<username>:<password>@auth.docker.io/token?service=registry.docker.io&scope=repository:<repo>:pull
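Put together, the two-step flow (get a token, then call the registry) might look like the following in Python. Treat it as a rough sketch under assumptions: `build_token_request` and `list_tags` are hypothetical helper names, and I am assuming the token response carries a "token" field and the tags/list response carries a "tags" list.

```python
import base64
import json
import urllib.request
from urllib.parse import urlencode

AUTH_URL = "https://auth.docker.io/token"
REGISTRY_URL = "https://index.docker.io"


def build_token_request(repo, username=None, password=None):
    """Prepare the token request; Basic auth is added only when credentials are given."""
    query = urlencode(
        {"service": "registry.docker.io", "scope": f"repository:{repo}:pull"},
        safe=":/",
    )
    request = urllib.request.Request(f"{AUTH_URL}?{query}")
    if username and password:
        raw = f"{username}:{password}".encode()
        request.add_header("Authorization", "Basic " + base64.b64encode(raw).decode())
    return request


def list_tags(repo, username=None, password=None):
    """Fetch a pull token for the repository, then call the V2 tags/list endpoint."""
    with urllib.request.urlopen(build_token_request(repo, username, password)) as resp:
        token = json.load(resp)["token"]  # assumed field name in the token response
    request = urllib.request.Request(f"{REGISTRY_URL}/v2/{repo}/tags/list")
    request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)["tags"]  # assumed field name in the tags response
```

For a public repo, `list_tags("library/alpine")` needs no credentials; for a private one, pass your username and password (or an access token in place of the password).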

NOTE: auth.docker.io will generate a token even if the request is not valid (invalid credentials, scope, or anything else). To validate the token, we can parse the JWT (e.g. with jwt.io) and check the access field in the payload; it should contain the requested scope.
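The same check can be done locally instead of pasting the token into a website. A minimal Python sketch, assuming the payload's access field is a list of entries with "type", "name" and "actions" keys; `decode_jwt_payload` and `granted_actions` are names I picked, and the signature is deliberately not verified here.

```python
import base64
import json


def decode_jwt_payload(token):
    """Return the payload (claims) of a JWT as a dict, WITHOUT verifying the signature."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def granted_actions(token, repo):
    """List the actions the token grants on a repository; empty means no access."""
    for entry in decode_jwt_payload(token).get("access", []):
        if entry.get("type") == "repository" and entry.get("name") == repo:
            return entry.get("actions", [])
    return []
```

If `granted_actions(token, "library/alpine")` comes back empty, the token was issued but does not actually grant the requested scope.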

Upvotes: 24

hudac

Reputation: 2798

This site says we cannot :(

Dockerhub hosts a mix of public and private repositories, but does not expose a catalog endpoint to programmatically list them.

Upvotes: 3
