Reputation: 1621
Are there any recommended methods for integrating git with Colab?
For example, is it possible to work off code from Google source repositories or the like?
Neither Google Drive nor Cloud Storage can be used for git functionality.
So I was wondering if there is still a way to do it?
Upvotes: 152
Views: 204342
Reputation: 41
As of 2024, a simple solution is the Colab menu interface, which lets you pull a Colab file from GitHub and save it back to your GitHub repository.
Connect your Colab notebook to GitHub:
Upvotes: -1
Reputation: 1
I've recently written a script that automates the steps for cloning a private repo: https://github.com/tsunrise/colab-github/
You can run the following in Colab:
!wget -q https://raw.githubusercontent.com/tsunrise/colab-github/main/colab_github.py
import colab_github
colab_github.github_auth(persistent_key=True)
And then clone your repo using SSH method:
!git clone git@github.com:<your_username>/<your_private_repo>.git
Upvotes: 0
Reputation: 171
You can mostly follow this link: https://qiita.com/Rowing0914/items/51a770925653c7c528f9
As a summary of the above link, follow these steps:
1- Connect your Google Colab runtime to your Google Drive using these commands:
from google.colab import drive
drive.mount('/content/drive')
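This requires an authentication process; complete whatever it asks for.
2- Set the current directory to the path where you want to clone the Git project.
In my example:
path_clone = "drive/My Drive/projects"
%cd "$path_clone"
Use %cd here rather than !cd: a ! command runs in a temporary subshell, so its directory change does not persist. The $ prefix makes IPython substitute the value of the Python variable.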
3- Clone the Git project:
!git clone <Git project URL address>
You now have the cloned Git project in the projects folder in your Google Drive (which is also connected to your Google Colab runtime machine).
4- Go to your Google Drive (in a browser, for example), open the "projects" folder, and open the .ipynb file that you want to use in Google Colab.
5- You now have a Google Colab runtime running the .ipynb file you wanted, which is also connected to your Google Drive, and all the cloned Git files are reachable from the Colab runtime.
Note:
1- Check that your Colab runtime is connected to Google Drive. If it is not, repeat step #1 above.
2- Double-check with the "pwd" and "cd" commands that the current directory is the cloned Git project in Google Drive (step #2 above).
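Because the clone lives on Drive, it is still there after the runtime restarts, and running git clone again into the same folder fails because the destination already exists. A minimal sketch of one way to handle that, assuming you are happy to pull instead when the folder is already a clone (the helper names clone_or_pull_command and sync_repo are made up for this example):

```python
import os
import subprocess


def clone_or_pull_command(repo_url, parent_dir):
    """Return the git command to run: clone on first use, pull afterwards."""
    # Derive the folder name git would create, e.g. ".../courses.git" -> "courses".
    name = repo_url.rstrip("/").split("/")[-1]
    if name.endswith(".git"):
        name = name[: -len(".git")]
    target = os.path.join(parent_dir, name)
    # A ".git" subfolder means an earlier session already cloned the project.
    if os.path.isdir(os.path.join(target, ".git")):
        return ["git", "-C", target, "pull"]
    return ["git", "clone", repo_url, target]


def sync_repo(repo_url, parent_dir):
    """Run the chosen command; raises CalledProcessError if git fails."""
    subprocess.run(clone_or_pull_command(repo_url, parent_dir), check=True)
```

In a cell you would call something like sync_repo("<Git project URL address>", "/content/drive/My Drive/projects") after mounting Drive.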
Upvotes: 10
Reputation: 1598
If you want to clone a private repository, the quickest way is to create a personal access token and select only the privileges that your application needs. The clone command for GitHub would then look like:
!git clone https://<personal_access_token>@github.com/username/repository.git
Upvotes: 133
Reputation: 808
Update, September 2021: for security reasons, GitHub no longer accepts account passwords for Git operations. Use a Personal Access Token instead: go to github.com -> Settings -> Developer Settings -> Personal Access Token and generate a token for the required purpose. Use it in place of your password for every task in this tutorial!
For more details, you can also see my article on Medium: https://medium.com/geekculture/using-git-github-on-google-colaboratory-7ef3b76fe61b
None of the other answers gives a straightforward, direct answer like this one; it is probably the answer you are looking for.
It works on Colab for both public and private repositories. Do not change or skip any step, and replace all {vars}.
TL;DR Complete Process:
!git clone https://{your_username}:{your_password}@github.com/{destination_repo_username}/{destination_repo_projectname}.git
%cd /content/{destination_repo_projectname}
!git config --global user.name "{your_username}"
!git config --global user.email "{your_email_id}"
!git config --global user.password "{your_password}"
Make Your Changes and then run :
!git add .
!git commit -m "{Message}"
!git push
!git clone https://{your_username}:{your_password}@github.com/{destination_repo_username}/{destination_repo_projectname}.git
Change the directory to {destination_repo_projectname}, the folder that git clone created, using the %cd line magic command for Jupyter notebooks.
%cd /content/{destination_repo_projectname}
Sanity Check to see if everything works perfectly!
!git pull
If no changes were made to the remote git repo after cloning, the following should be the displayed output :
Already up to date.
Similarly check the status of the staged/unstaged changes.
!git status
It should display this, with the default branch selected :
On branch main
Your branch is up to date with 'origin/main'.
nothing to commit, working tree clean
Check the previous commits you have made on the repo :
!git log -n 4
Outputs Git Commit IDs with Logs :
commit 18ccf27c8b2d92b560e6eeab2629ba0c6ea422a5 (HEAD -> main, origin/main, origin/HEAD)
Author: Farhan Hai Khan <[email protected]>
Date: Mon May 31 00:12:14 2021 +0530
Create README.md
commit bd6ee6d4347eca0e3676e88824c8e1118cfbff6b
Author: khanfarhan10 <[email protected]>
Date: Sun May 30 18:40:16 2021 +0000
Add Zip COVID
commit 8a3a12863a866c9d388cbc041a26d49aedfa4245
Author: khanfarhan10 <[email protected]>
Date: Sun May 30 18:03:46 2021 +0000
Add COVID Data
commit 6a16dc7584ba0d800eede70a217d534a24614cad
Author: khanfarhan10 <[email protected]>
Date: Sun May 30 16:04:20 2021 +0000
Removed sample_data using colab (testing)
Make changes from the local repo directory.
These might include additions, deletions, and edits.
from google.colab import drive
drive.mount('/content/gdrive')
import shutil
# For a folder:
shutil.copytree(src_folder,des_folder)
# For a file:
shutil.copy(src_file,des_file)
# Create a ZipFile
shutil.make_archive(archive_name, 'zip', directory_to_zip)
Tell Git Who You Are?
!git config --global user.name "{your_username}"
!git config --global user.email "{your_email_id}"
!git config --global user.password "{your_password}"
Check if the remote url is set and configured correctly :
!git remote -v
If configured properly it should output the following :
origin https://{your_username}:{your_password}@github.com/{destination_repo_username}/{destination_repo_projectname}.git (fetch)
origin https://{your_username}:{your_password}@github.com/{destination_repo_username}/{destination_repo_projectname}.git (push)
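As the output above shows, the remote URL keeps the credentials in plain text inside the repo's configuration. If you would rather not leave them there once you are done, here is a small sketch (the function name strip_credentials is made up for this example) that removes the user:password@ part so the cleaned URL can be passed to git remote set-url:

```python
from urllib.parse import urlsplit, urlunsplit


def strip_credentials(url):
    """Return url without the user:password@ part of its network location."""
    parts = urlsplit(url)
    # hostname and port exclude any userinfo that was embedded in the URL.
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
```

In a cell, something like clean = strip_credentials(url) followed by !git remote set-url origin {clean} would apply it. Later pushes then need the credentials supplied again.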
You know what to do.
!git add .
!git commit -m "{Message}"
!git push
Enjoy!
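One thing to keep in mind with ! commands is that a failing git step does not stop the notebook, so a failed push can go unnoticed when running all cells. A minimal sketch of the same add/commit/push sequence through subprocess, which raises on the first failure (the function names publish_commands and publish are made up for this example):

```python
import subprocess


def publish_commands(message):
    """The add/commit/push sequence as argument lists (no shell quoting needed)."""
    return [
        ["git", "add", "."],
        ["git", "commit", "-m", message],
        ["git", "push"],
    ]


def publish(message, repo_dir="."):
    """Run each step inside repo_dir; check=True raises on the first failure."""
    for cmd in publish_commands(message):
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

Note that git commit exits with a non-zero status when there is nothing to commit, so publish raises in that case as well.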
Upvotes: 13
Reputation: 1044
Another solution, based on the answer from @Marafon Thiago:
ATTENTION: If the password contains special characters, replace each one with its percent-encoding.
E.g. for passwd = '@123'
you should type: passwd = '%40123'
from getpass import getpass
user = getpass('BitBucket user')
password = getpass('BitBucket password')
!git init
!git clone https://{user}:{password}@bitbucket.org/aqtechengenharia/aqtlibpy.git
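Instead of encoding the special characters by hand, the standard library can do it. This is only a sketch (the function name encode_credential is made up for this example). Passing safe="" matters because quote leaves "/" unescaped by default:

```python
from urllib.parse import quote, unquote


def encode_credential(value):
    """Percent-encode a username or password for use inside a URL."""
    # safe="" also encodes "/", which quote() would otherwise leave as-is.
    return quote(value, safe="")


print(encode_credential("@123"))  # prints %40123
# unquote reverses the encoding, which is handy for checking the result.
print(unquote(encode_credential("@123")))  # prints @123
```

With this, password = encode_credential(getpass('BitBucket password')) could replace the manual step above.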
Upvotes: 0
Reputation: 2644
I finally pulled myself together and wrote a python package for this.
pip install clmutils # colab-misc-utils
Create a dotenv or .env in /content/drive/MyDrive (if google drive is mounted to drive) or /content/drive/.env with
# for git
user_email = "your-email"
user_name = "your-github-name"
gh_key = "-----BEGIN EC PRIVATE KEY-----
...............................................................9
your github private key........................................J
..................................==
-----END EC PRIVATE KEY-----
"
In a Colab cell
from clmutils import setup_git, Settings
config = Settings()
setup_git(
user_name=config.user_name,
user_email=config.user_email,
priv_key=config.gh_key
)
You are then all set to do all the git clone, amend code, git push stuff as if you were on your own lovely computer at home or at work.
clmutils also has a function called setup_ssh_tunnel to set up a reverse SSH tunnel to Colab. It also reads various keys, the username, and the hostname from the .env file. It's a bit involved, but if you know how to set up a reverse SSH tunnel to Colab manually, you will have no problem figuring out what they are used for. Details are available on the GitHub repo (google clmutils pypi).
Upvotes: 3
Reputation: 2644
Three steps to use git to sync Colab with GitHub or GitLab.
Generate a private-public key pair. Copy the private key to the system clipboard for use in step 2. Paste the public key into GitHub or GitLab as appropriate.
On Linux, ssh-keygen can be used to generate the key pair in ~/.ssh. The resulting private key is in the file id_rsa; the public key is in the file id_rsa.pub.
In Colab, execute
key = \
'''
paste the private key here
(your id_rsa or id_ecdsa file in the .ssh directory), e.g.
-----BEGIN EC PRIVATE KEY-----
M..............................................................9
...............................................................J
..................................==
-----END EC PRIVATE KEY-----
'''
! mkdir -p /root/.ssh
with open(r'/root/.ssh/id_rsa', 'w', encoding='utf8') as fh:
    fh.write(key)
! chmod 600 /root/.ssh/id_rsa
! ssh-keyscan github.com >> /root/.ssh/known_hosts
# test setup
! ssh -T git@github.com
# if you see something like "Hi ffreemt! You've successfully
# authenticated, but GitHub does not provide shell access."
# you are all set. You can tweak .ssh/config for multiple github accounts
Use git to pull/push as usual.
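The key-writing part of step 2 can also be done in pure Python. The sketch below (the function name install_private_key is made up for this example) creates the file with owner-only permissions from the start and normalises the pasted text to end with exactly one newline, on the assumption that stray blank lines from pasting are not wanted in the key file:

```python
import os


def install_private_key(key_text, ssh_dir="/root/.ssh", filename="id_rsa"):
    """Write key_text to ssh_dir/filename with owner-only read/write access."""
    os.makedirs(ssh_dir, mode=0o700, exist_ok=True)
    path = os.path.join(ssh_dir, filename)
    # Drop surrounding blank lines from pasting and keep one trailing newline.
    body = key_text.strip() + "\n"
    # Create the file with mode 0600 instead of loosening and chmod-ing later.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf8") as fh:
        fh.write(body)
    # Also covers a pre-existing file that had looser permissions.
    os.chmod(path, 0o600)
    return path
```

Calling install_private_key(key) would stand in for the mkdir, open/write and chmod lines; the ssh-keyscan and ssh -T lines stay the same.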
The same idea can be used for rsync (or ssh) between Colab and HostA with minor changes:
Generate a private-public key pair. Copy the private key to the system clipboard for use in step 2. Paste the public key into authorized_keys in .ssh on HostA.
In Colab, execute
key = \
'''
paste the private key here
'''
! mkdir -p /root/.ssh
with open(r'/root/.ssh/id_rsa', 'w', encoding='utf8') as fh:
    fh.write(key)
! chmod 600 /root/.ssh/id_rsa
# ssh-keyscan HostA >> /root/.ssh/known_hosts does not seem to work with an IP address,
# so skip strict host key checking on this connection instead
! ssh -oStrictHostKeyChecking=no root@HostA hostname
Upvotes: 9
Reputation: 378
I tried some of the methods here and they all worked well, but I found it difficult to handle all the git commands and related commands (for example, version control with DVC) within notebook cells. So I turned to this nice solution, Kora. It is a terminal emulator that runs within Colab and is about as easy to use as a terminal on a local machine. The notebook stays alive, and we can still edit files and cells as usual. Since the console is temporary, no information is exposed. GitHub login and other commands can be run as usual.
Kora: https://pypi.org/project/kora/
Usage:
!pip install kora
from kora import console
console.start()
Upvotes: 4
Reputation: 1816
Cloning a private repo to google colab :
Generate a token:
Settings -> Developer settings -> Personal access tokens -> Generate new token
Copy the token and clone the repo (replace username and token accordingly)
!git clone https://username:token@github.com/username/repo_name.git
Upvotes: 12
Reputation: 6853
A very simple way to clone your private GitHub repo in Google Colab is shown below.
import os
import urllib.parse  # a bare "import urllib" does not guarantee that urllib.parse is loaded
from getpass import getpass
user = input('User name: ')
password = getpass('Password: ')  # the input is not echoed
password = urllib.parse.quote(password, safe='')  # percent-encode the password for use in a URL
repo_name = input('Repo name: ')
cmd_string = 'git clone https://{0}:{1}@github.com/{0}/{2}.git'.format(user, password, repo_name)
os.system(cmd_string)
cmd_string, password = "", ""  # remove the password from the variables
Upvotes: 50
Reputation: 15837
The solution https://stackoverflow.com/a/53094151/3924118 did not work for me because the expression {user} was not being converted to the actual username (I was getting a 400 Bad Request), so I changed that solution slightly to the following one.
from getpass import getpass
import os
os.environ['USER'] = input('Enter the username of your Github account: ')
os.environ['PASSWORD'] = getpass('Enter the password of your Github account: ')
os.environ['REPOSITORY'] = input('Enter the name of the Github repository: ')
os.environ['GITHUB_AUTH'] = os.environ['USER'] + ':' + os.environ['PASSWORD']
!rm -rf $REPOSITORY # To remove the previous clone of the Github repository
!git clone https://$GITHUB_AUTH@github.com/$USER/$REPOSITORY.git
os.environ['USER'] = os.environ['PASSWORD'] = os.environ['REPOSITORY'] = os.environ['GITHUB_AUTH'] = ""
If you are able to clone your-repo, you should not see any password in the output of this command. If you get an error, the password could be displayed in the output, so make sure you do not share your notebook whenever this command fails.
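To lower the risk of a password ending up in a saved cell output, one option is to capture git's output and mask credentials before printing it. This is only a sketch: the names redact and run_redacted are made up for this example, and the pattern covers http(s) URLs of the form scheme://user:password@host only:

```python
import re
import subprocess

# Matches the "user:password@" (or "token@") part right after the scheme.
_CREDENTIALS = re.compile(r"(https?://)[^/\s@]+@")


def redact(text):
    """Replace the credentials part of any http(s) URL in text with ***@."""
    return _CREDENTIALS.sub(r"\1***@", text)


def run_redacted(cmd):
    """Run cmd, print its combined output with credentials masked, return the exit code."""
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    print(redact(result.stdout))
    return result.returncode
```

For example, run_redacted(["git", "clone", url]) could replace the !git clone line, with url built from the same environment variables.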
Upvotes: 4
Reputation: 1167
This works if you want to share your repo and colab. Also works if you have multiple repos. Just throw it in a cell.
import ipywidgets as widgets
from IPython.display import display
import subprocess

class credentials_input():
    def __init__(self, repo_name):
        self.repo_name = repo_name
        self.username = widgets.Text(description='Username', value='')
        self.pwd = widgets.Password(description='Password', placeholder='password here')
        self.username.on_submit(self.handle_submit_username)
        self.pwd.on_submit(self.handle_submit_pwd)
        display(self.username)

    def handle_submit_username(self, text):
        # Show the password field once the username has been submitted
        display(self.pwd)
        return

    def handle_submit_pwd(self, text):
        cmd = f'git clone https://{self.username.value}:{self.pwd.value}@{self.repo_name}'
        # Capture stderr too, since git writes its progress and errors there
        process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        output, error = process.communicate()
        print(output, error)
        # Clear the widgets so the credentials are not kept in the notebook
        self.username.value, self.pwd.value = '', ''

get_creds = credentials_input('github.com/username/reponame.git')
get_creds
Upvotes: 0
Reputation: 337
In order to protect your account username and password, you can use getpass and concatenate them in the shell command:
from getpass import getpass
import os
user = getpass('BitBucket user')
password = getpass('BitBucket password')
os.environ['BITBUCKET_AUTH'] = user + ':' + password
!git clone https://$BITBUCKET_AUTH@bitbucket.org/{user}/repository.git
Upvotes: 18
Reputation: 3109
Mount the drive using:
from google.colab import drive
drive.mount('/content/drive/')
Then:
%cd /content/drive/
Clone the repo into your Drive:
!git clone <github repo url>
Access other files from the repo (for example, helper.py is another file in the repo):
import types
helper = types.ModuleType('helper')  # replaces imp.new_module; the imp module was removed in Python 3.12
exec(open("drive/path/to/helper.py").read(), helper.__dict__)
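An alternative to exec is to let the import machinery load the file, which also sets attributes such as __file__ on the module. A minimal sketch, assuming the file is an ordinary .py module (the function name load_module_from_path is made up for this example):

```python
import importlib.util


def load_module_from_path(name, path):
    """Import the Python file at path as a module called name."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {name!r} from {path!r}")
    module = importlib.util.module_from_spec(spec)
    # Executes the file's top-level code inside the new module object.
    spec.loader.exec_module(module)
    return module
```

Usage would be helper = load_module_from_path('helper', 'drive/path/to/helper.py'), with the same placeholder path as above.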
Upvotes: 0
Reputation: 311
You can use the SSH protocol to connect your private repository with Colab.
Generate an SSH key pair on your local machine; don't forget to keep the passphrase empty (check this tutorial).
Upload it to Colab:
from google.colab import files
uploaded = files.upload()
Move the SSH key pair to /root/.ssh and connect to git:
! rm -rf /root/.ssh/*
! mkdir -p /root/.ssh
! tar -xvzf ssh.tar.gz
! cp ssh/* /root/.ssh && rm -rf ssh && rm -rf ssh.tar.gz
! chmod 700 /root/.ssh
! ssh-keyscan gitlab.com >> /root/.ssh/known_hosts
! chmod 644 /root/.ssh/known_hosts
! git config --global user.email "email"
! git config --global user.name "username"
! ssh git@gitlab.com
Authorize the key for your private repository; see Per-repository deploy keys.
Use ! git clone git@gitlab.com:{account}/{projectName}.git
Note: to push, you have to grant write access to the public SSH key that you authenticate to the git server with.
Upvotes: 19
Reputation: 38579
git is installed on the machine, and you can use ! to invoke shell commands.
For example, to clone a git repository:
!git clone https://github.com/fastai/courses.git
Here's a complete example that clones a repository and loads an Excel file stored therein. https://colab.research.google.com/notebook#fileId=1v-yZk-W4YXOxLTLi7bekDw2ZWZXWW216
Upvotes: 70