Finding annotations data in GFF format for NCBI nucleotides using Entrez

Question

I am working with bacterial sequences from the NCBI Nucleotide database. If I have an accession, e.g. NC_002663, and I need its annotations in GFF format, how can I easily get them using Entrez (preferably through Biopython)?

If I go to the NCBI entry, I see a link to the assembly. Is there an easy way to access it programmatically? The ESummary service doesn't return such links:

from Bio import Entrez
handle = Entrez.esummary(db='nucleotide', id='NC_002663')
record = Entrez.read(handle)
handle.close()

[DictElement({'Item': [], 'Id': '15601865', 'Caption': 'NC_002663', 'Title': 'Pasteurella multocida subsp. multocida str. Pm70, complete genome', 'Extra': 'gi|15601865|ref|NC_002663.1|[15601865]', 'Gi': IntegerElement(15601865, attributes={}), 'CreateDate': '2001/09/10', 'UpdateDate': '2018/01/11', 'Flags': IntegerElement(800, attributes={}), 'TaxId': IntegerElement(272843, attributes={}), 'Length': IntegerElement(2257487, attributes={}), 'Status': 'live', 'ReplacedBy': '', 'Comment': '  ', 'AccessionVersion': 'NC_002663.1'}, attributes={})]

I could search the Assembly database using the "Title", but there is probably a better way that needs fewer API calls. Thanks!

Answers (1)
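
One possible approach, as a rough sketch rather than a definitive recipe: use ELink to go from the nucleotide record to the Assembly database, then read the assembly's FTP directory from its ESummary and build the GFF file name from it. That is two API calls. The helper names `gff_url_from_ftp_path` and `find_gff_url` are made up for this example. It assumes the assembly summary exposes `FtpPath_RefSeq` / `FtpPath_GenBank` fields, and that the GFF file follows the usual `<directory name>_genomic.gff.gz` naming in NCBI's genomes FTP layout. Check both against a real record before relying on them.

```python
import posixpath


def gff_url_from_ftp_path(ftp_path):
    """Build the URL of the GFF file inside an assembly's FTP directory.

    Assumes NCBI's usual layout, where the directory name is repeated as
    the file prefix: <dir>/<dir>_genomic.gff.gz
    """
    ftp_path = ftp_path.rstrip("/")
    prefix = posixpath.basename(ftp_path)
    return f"{ftp_path}/{prefix}_genomic.gff.gz"


def find_gff_url(nucleotide_id, email):
    """Follow nucleotide -> assembly links and return a GFF URL, or None."""
    # Imported here so the URL helper above can be used without Biopython.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks callers to identify themselves

    # Call 1: ELink from the nucleotide record to the Assembly database.
    handle = Entrez.elink(dbfrom="nuccore", db="assembly", id=nucleotide_id)
    links = Entrez.read(handle)
    handle.close()
    link_sets = links[0]["LinkSetDb"]
    if not link_sets:
        return None  # no linked assembly found
    assembly_uid = link_sets[0]["Link"][0]["Id"]

    # Call 2: ESummary on the assembly. validate=False is used on the
    # assumption that strict DTD validation may reject some summary fields.
    handle = Entrez.esummary(db="assembly", id=assembly_uid)
    summary = Entrez.read(handle, validate=False)
    handle.close()
    doc = summary["DocumentSummarySet"]["DocumentSummary"][0]

    # Prefer the RefSeq directory (NC_ accessions are RefSeq), else GenBank.
    ftp_path = doc.get("FtpPath_RefSeq") or doc.get("FtpPath_GenBank")
    return gff_url_from_ftp_path(ftp_path) if ftp_path else None
```

Calling `find_gff_url('15601865', 'you@example.org')` with the numeric Id from the ESummary output above (and your own email address) should return a URL that you can download and decompress with the standard library. Passing the accession instead of the numeric Id may also work, but that is not verified here.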