Reputation: 109
I am trying to extract issue data from a GitHub repo using github3.py. The following is part of my code for extracting the issues.
The main code uses these imports:
from github3 import login
from mysql.connector import IntegrityError
import config as cfg
import project_list
from github3.exceptions import NotFoundError
from github3.exceptions import GitHubException
import datetime
from database import Database
import sys
import re
import time
Then the main code is:
DEBUG = False

def process(url, start):
    re_pattern = re.compile(u'[^\u0000-\uD7FF\uE000-\uFFFF]', re.UNICODE)
    splitted = url.split("/")
    org_name = splitted[3]
    repo_name = splitted[4]
    while True:
        try:
            gh = login(token = cfg.TOKEN)
            repo = gh.repository(org_name, repo_name)
            print("{} =====================".format(repo))
            if start is None:
                i = 1
            else:
                i = int(start)
            if start is None:
                j = 1
            else:
                j = int(start)
            Database.connect()
            while True:
                try:
                    issue = repo.issue(i)
                    issue_id = issue.id
                    issue_number = issue.number
                    status_issue = str(issue.state)
                    close_author = str(issue.closed_by)
                    com_count = issue.comments_count
                    title = re_pattern.sub(u'\uFFFD', issue.title)
                    created_at = issue.created_at
                    closed_at = issue.closed_at
                    now = datetime.datetime.now()
                    reporter = str(issue.user)
                    body_text = issue.body_text
                    body_html = issue.body_html
                    if body_text is None:
                        body_text = ""
                    if body_html is None:
                        body_html = ""
                    body_text = re_pattern.sub(u'\uFFFD', body_text)
                    body_html = re_pattern.sub(u'\uFFFD', body_html)
                    Database.insert_issues(issue_id, issue_number, repo_name, status_issue, close_author, com_count, title, reporter, created_at, closed_at, now, body_text, body_html)
                    print("{} inserted.".format(issue_id))
                    if DEBUG == True:
                        break;
                except NotFoundError as e:
                    print("Exception @ {}: {}".format(i, str(e)))
                except IntegrityError as e:
                    print("Data was there @ {}".format(str(e)))
                i += 1
                j += 1
        except GitHubException as e:
            print("Exception: {}".format(str(e)))
            time.sleep(1000)
            i -= 1
            j -= 1

if __name__ == "__main__":
    if len(sys.argv) == 1:
        sys.exit("Please specify project name: python issue-github3.py <project name>")
    if len(sys.argv) == 2:
        start = None
        print("Start from the beginning")
    else:
        start = sys.argv[2]
    project = sys.argv[1]
    url = project_list.get_project(project)
    process(url, start)
With the above code, everything works and I can extract issues from a repo on GitHub.
Problem: the message "Exception: 410 Issues are disabled for this repo" appears after 100 issues have been extracted successfully.
How can I solve this problem?
As shown in the main code, I handle the 404 exception (issue not found) with the import from github3.exceptions import NotFoundError
and the code below:
except NotFoundError as e:
    print("Exception @ {}: {}".format(i, str(e)))
Given the main code, which import and code should I use to handle the 410 exception?
Upvotes: 0
Views: 193
Reputation: 109
I found an easy workaround, but it doesn't solve the problem completely.
As I mentioned before, the exception occurs after 100 successfully extracted issues (i.e., issue number 101 is the problem), and as @LhasaDad said above, there is no issue number 101 in the repo (I checked it manually). So we just need to replace None with 102 in start = None, then run the code again.
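The manual restart has to be repeated for every issue number that answers with 410. As far as can be read from the question's code, the 410 ends up in the outer except GitHubException handler, which sleeps and then starts over from start, so the i -= 1 there has no lasting effect. A more general approach would be to treat 410 like 404 and skip that number inside the inner loop. The following is only a minimal sketch of that idea: fetch_or_skip, FakeResponseError and fake_fetch are hypothetical names used for illustration, and it assumes that the raised exception exposes the HTTP status as a code attribute (check this against the github3.py version in use before relying on it).

```python
class FakeResponseError(Exception):
    """Hypothetical stand-in for an HTTP error that carries its status code."""

    def __init__(self, code, msg):
        super().__init__("{} {}".format(code, msg))
        self.code = code


def fetch_or_skip(fetch, number, skip_codes=(404, 410)):
    """Return fetch(number), or None if the error's status code is skippable."""
    try:
        return fetch(number)
    except Exception as exc:
        # Assumption: the exception exposes the HTTP status as `code`.
        if getattr(exc, "code", None) in skip_codes:
            print("Skipping {}: {}".format(number, exc))
            return None
        # Anything else (rate limits, server errors, ...) is not swallowed.
        raise


def fake_fetch(number):
    # Simulates a repo in which issue number 101 answers with 410.
    if number == 101:
        raise FakeResponseError(410, "Issues are disabled for this repo")
    return "issue-{}".format(number)


results = [fetch_or_skip(fake_fetch, n) for n in range(100, 103)]
print(results)
```

In the code from the question, repo.issue would take the place of fake_fetch, e.g. issue = fetch_or_skip(repo.issue, i), and the loop would increment i and continue whenever the result is None. Errors with other status codes are re-raised on purpose, so the existing sleep-and-retry handling still sees them.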
Upvotes: 0