Reputation:
I want to extract specific information from data.xml. The root element, 'CaplockSet', contains more than 100 'Caplock' elements, and I need only the author information from each of them. Any help is highly appreciated.
<?xml version="1.0"?>
<CaplockSet>
<Caplock>
<MedlineCitation Status="clonelisher" Owner="NLM">
<PMID Version="1">32045906</PMID>
<DateRevised>
<Year>2020</Year>
<Month>02</Month>
<Day>11</Day>
</DateRevised>
<Article cloneModel="Print-Electronic">
<Journal>
<ISSN IssnType="Electronic">1423-0135</ISSN>
<JournalIssue CitedMedium="Internet">
<cloneDate>
<Year>2020</Year>
<Month>Feb</Month>
<Day>11</Day>
</cloneDate>
</JournalIssue>
<Title>Journal of vascular research</Title>
<ISOAbbreviation>J. Vasc. Res.</ISOAbbreviation>
</Journal>
<ArticleTitle>miR-96-5p Regulates Proliferation, Migration, and Apoptosis of Vascular Smooth Muscle Cell Induced by Angiotensin II via Targeting NFAT5.</ArticleTitle>
<Pagination>
<MedlinePgn>1-11</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1159/000505457</ELocationID>
<Abstract>
<AbstractText Label="BACKGROUND" NlmCategory="BACKGROUND">Aberrant proliferation, migration, and apoptosis of vascular smooth muscle cells (VSMCs) are major pathological phenomenon in hypertension. MicroRNAs (miRNAs/miRs) serve crucial roles in the progression of hypertension. We aimed to determine the role of miR-96-5p in the proliferation, migration, and apoptosis of VSMCs and its underlying mechanisms.</AbstractText>
<AbstractText Label="METHODS" NlmCategory="METHODS">Angiotensin II (Ang II) was employed to treat VSMCs, and the expression of miR-96-5p was detected by RT-qPCR. Then, miR-96-5p mimic was transfected into VSMCs. Cell Counting Kit-8 assay, flow cytometry, transwell assay, and wound healing assay were applied to measure proliferation, cell cycle, and migration of VSMCs. The expression of proteins associated with proliferation, migration, and apoptosis was assessed. A luciferase reporter assay was applied to confirm the target binding between miR-96-5p and nuclear factors of activated T-cells 5 (NFAT5). Subsequently, siRNA was used to silence NFAT5, and cell proliferation, migration, and apoptosis were assessed.</AbstractText>
<AbstractText Label="RESULTS" NlmCategory="RESULTS">The results revealed that the expression of miR-96-5p was downregulated in Ang II-induced VSMCs. MiR-96-5p overexpression inhibited cell proliferation and migration but promoted cell apoptosis, enhanced the percentages of cells in the G1 and G2 phases, and reduced those in the S phase, accompanied by changes in the expression associated proteins. NFAT5 was confirmed as a direct target of miR-96-5p. NFAT5 silencing had the same results with miR-96-5p overexpression on VSMC proliferation, migration, and apoptosis, whereas miR-96-5p inhibitor reversed these effects.</AbstractText>
<AbstractText Label="CONCLUSIONS" NlmCategory="CONCLUSIONS">Our findings concluded that miR-96-5p could regulate proliferation, migration, and apoptosis of VSMCs induced by Ang II via targeting NFAT5.</AbstractText>
<CopyrightInformation>© 2020 S. Karger AG, Basel.</CopyrightInformation>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Tian</LastName>
<ForeName>Long</ForeName>
<Initials>L</Initials>
<AffiliationInfo>
<Affiliation>Department of Cardiology, Jiangdu People's Hospital, Yangzhou, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Cai</LastName>
<ForeName>Dinghua</ForeName>
<Initials>D</Initials>
<AffiliationInfo>
<Affiliation>Department of Cardiology, Jiangdu People's Hospital, Yangzhou, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Zhuang</LastName>
<ForeName>Derong</ForeName>
<Initials>D</Initials>
<AffiliationInfo>
<Affiliation>Department of Cardiology, Jiangdu People's Hospital, Yangzhou, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Wang</LastName>
<ForeName>Wenyuan</ForeName>
<Initials>W</Initials>
<AffiliationInfo>
<Affiliation>Department of Cardiology, Jiangdu People's Hospital, Yangzhou, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Wang</LastName>
<ForeName>Xuan</ForeName>
<Initials>X</Initials>
<AffiliationInfo>
<Affiliation>Department of Cardiology, Jiangdu People's Hospital, Yangzhou, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Bian</LastName>
<ForeName>Xiaoli</ForeName>
<Initials>X</Initials>
<AffiliationInfo>
<Affiliation>Department of Cardiology, Jiangdu People's Hospital, Yangzhou, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Xu</LastName>
<ForeName>Rui</ForeName>
<Initials>R</Initials>
<AffiliationInfo>
<Affiliation>Department of Nephrology, Jiangdu People's Hospital, Yangzhou, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Wu</LastName>
<ForeName>Guanji</ForeName>
<Initials>G</Initials>
<AffiliationInfo>
<Affiliation>Department of Cardiology, Xi'an Central Hospital of Xi'an Jiaotong University, Xi'an, China, [email protected].</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<clonelicationTypeList>
<clonelicationType UI="D016428">Journal Article</clonelicationType>
</clonelicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2020</Year>
<Month>02</Month>
<Day>11</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>Switzerland</Country>
<MedlineTA>J Vasc Res</MedlineTA>
<NlmUniqueID>9206092</NlmUniqueID>
<ISSNLinking>1018-1172</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<KeywordList Owner="NOTNLM">
<Keyword MajorTopicYN="N">Migration</Keyword>
<Keyword MajorTopicYN="N">NFAT5</Keyword>
<Keyword MajorTopicYN="N">Proliferation</Keyword>
<Keyword MajorTopicYN="N">Vascular smooth muscle cell</Keyword>
<Keyword MajorTopicYN="N">miR-96-5p</Keyword>
</KeywordList>
</MedlineCitation>
<CardData>
<History>
<CardcloneDate cloneStatus="received">
<Year>2019</Year>
<Month>09</Month>
<Day>16</Day>
</CardcloneDate>
<CardcloneDate cloneStatus="accepted">
<Year>2019</Year>
<Month>12</Month>
<Day>16</Day>
</CardcloneDate>
<CardcloneDate cloneStatus="entrez">
<Year>2020</Year>
<Month>2</Month>
<Day>12</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</CardcloneDate>
<CardcloneDate cloneStatus="Card">
<Year>2020</Year>
<Month>2</Month>
<Day>12</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</CardcloneDate>
<CardcloneDate cloneStatus="medline">
<Year>2020</Year>
<Month>2</Month>
<Day>12</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</CardcloneDate>
</History>
<clonelicationStatus>aheadofprint</clonelicationStatus>
<ArticleIdList>
<ArticleId IdType="Card">32045906</ArticleId>
<ArticleId IdType="pii">000505457</ArticleId>
<ArticleId IdType="doi">10.1159/000505457</ArticleId>
</ArticleIdList>
</CardData>
</Caplock>
</CaplockSet>
I tried several approaches in Python but keep running into errors. One of my attempts is shown below:
import xml.etree.ElementTree as ET
mytree = ET.parse('data.xml')
myroot = mytree.getroot()
for x in myroot.findall('Author'):
    lastname = x.find('LastName').text
    forename = x.find('ForeName').text
    affiliation = x.find('AffiliationInfo/Affiliation').text
    print(lastname, forename, affiliation)
Error
Traceback (most recent call last):
  File "c:/Users/jeeva/Desktop/data/program.py", line 3, in <module>
    mytree = ET.parse('data/data.xml')
  File "C:\Users\jeeva\AppData\Local\Programs\Python\Python38-32\lib\xml\etree\ElementTree.py", line 1202, in parse
    tree.parse(source, parser)
  File "C:\Users\jeeva\AppData\Local\Programs\Python\Python38-32\lib\xml\etree\ElementTree.py", line 595, in parse
    self._root = parser._parse_whole(source)
xml.etree.ElementTree.ParseError: syntax error: line 2, column 21
Upvotes: 2
Views: 84
Reputation: 23815
One-liner:
import xml.etree.ElementTree as ET
xml = '''<?xml version="1.0"?>
<CaplockSet>
<Caplock>
<MedlineCitation Status="clonelisher" Owner="NLM">
<PMID Version="1">32045906</PMID>
<DateRevised>
<Year>2020</Year>
<Month>02</Month>
<Day>11</Day>
</DateRevised>
<Article cloneModel="Print-Electronic">
<Journal>
<ISSN IssnType="Electronic">1423-0135</ISSN>
<JournalIssue CitedMedium="Internet">
<cloneDate>
<Year>2020</Year>
<Month>Feb</Month>
<Day>11</Day>
</cloneDate>
</JournalIssue>
<Title>Journal of vascular research</Title>
<ISOAbbreviation>J. Vasc. Res.</ISOAbbreviation>
</Journal>
<ArticleTitle>miR-96-5p Regulates Proliferation, Migration, and Apoptosis of Vascular Smooth Muscle Cell Induced by Angiotensin II via Targeting NFAT5.</ArticleTitle>
<Pagination>
<MedlinePgn>1-11</MedlinePgn>
</Pagination>
<ELocationID EIdType="doi" ValidYN="Y">10.1159/000505457</ELocationID>
<Abstract>
<AbstractText Label="BACKGROUND" NlmCategory="BACKGROUND">Aberrant proliferation, migration, and apoptosis of vascular smooth muscle cells (VSMCs) are major pathological phenomenon in hypertension. MicroRNAs (miRNAs/miRs) serve crucial roles in the progression of hypertension. We aimed to determine the role of miR-96-5p in the proliferation, migration, and apoptosis of VSMCs and its underlying mechanisms.</AbstractText>
<AbstractText Label="METHODS" NlmCategory="METHODS">Angiotensin II (Ang II) was employed to treat VSMCs, and the expression of miR-96-5p was detected by RT-qPCR. Then, miR-96-5p mimic was transfected into VSMCs. Cell Counting Kit-8 assay, flow cytometry, transwell assay, and wound healing assay were applied to measure proliferation, cell cycle, and migration of VSMCs. The expression of proteins associated with proliferation, migration, and apoptosis was assessed. A luciferase reporter assay was applied to confirm the target binding between miR-96-5p and nuclear factors of activated T-cells 5 (NFAT5). Subsequently, siRNA was used to silence NFAT5, and cell proliferation, migration, and apoptosis were assessed.</AbstractText>
<AbstractText Label="RESULTS" NlmCategory="RESULTS">The results revealed that the expression of miR-96-5p was downregulated in Ang II-induced VSMCs. MiR-96-5p overexpression inhibited cell proliferation and migration but promoted cell apoptosis, enhanced the percentages of cells in the G1 and G2 phases, and reduced those in the S phase, accompanied by changes in the expression associated proteins. NFAT5 was confirmed as a direct target of miR-96-5p. NFAT5 silencing had the same results with miR-96-5p overexpression on VSMC proliferation, migration, and apoptosis, whereas miR-96-5p inhibitor reversed these effects.</AbstractText>
<AbstractText Label="CONCLUSIONS" NlmCategory="CONCLUSIONS">Our findings concluded that miR-96-5p could regulate proliferation, migration, and apoptosis of VSMCs induced by Ang II via targeting NFAT5.</AbstractText>
<CopyrightInformation>© 2020 S. Karger AG, Basel.</CopyrightInformation>
</Abstract>
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Tian</LastName>
<ForeName>Long</ForeName>
<Initials>L</Initials>
<AffiliationInfo>
<Affiliation>Department of Cardiology, Jiangdu People's Hospital, Yangzhou, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Cai</LastName>
<ForeName>Dinghua</ForeName>
<Initials>D</Initials>
<AffiliationInfo>
<Affiliation>Department of Cardiology, Jiangdu People's Hospital, Yangzhou, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Zhuang</LastName>
<ForeName>Derong</ForeName>
<Initials>D</Initials>
<AffiliationInfo>
<Affiliation>Department of Cardiology, Jiangdu People's Hospital, Yangzhou, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Wang</LastName>
<ForeName>Wenyuan</ForeName>
<Initials>W</Initials>
<AffiliationInfo>
<Affiliation>Department of Cardiology, Jiangdu People's Hospital, Yangzhou, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Wang</LastName>
<ForeName>Xuan</ForeName>
<Initials>X</Initials>
<AffiliationInfo>
<Affiliation>Department of Cardiology, Jiangdu People's Hospital, Yangzhou, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Bian</LastName>
<ForeName>Xiaoli</ForeName>
<Initials>X</Initials>
<AffiliationInfo>
<Affiliation>Department of Cardiology, Jiangdu People's Hospital, Yangzhou, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Xu</LastName>
<ForeName>Rui</ForeName>
<Initials>R</Initials>
<AffiliationInfo>
<Affiliation>Department of Nephrology, Jiangdu People's Hospital, Yangzhou, China.</Affiliation>
</AffiliationInfo>
</Author>
<Author ValidYN="Y">
<LastName>Wu</LastName>
<ForeName>Guanji</ForeName>
<Initials>G</Initials>
<AffiliationInfo>
<Affiliation>Department of Cardiology, Xi'an Central Hospital of Xi'an Jiaotong University, Xi'an, China, [email protected].</Affiliation>
</AffiliationInfo>
</Author>
</AuthorList>
<Language>eng</Language>
<clonelicationTypeList>
<clonelicationType UI="D016428">Journal Article</clonelicationType>
</clonelicationTypeList>
<ArticleDate DateType="Electronic">
<Year>2020</Year>
<Month>02</Month>
<Day>11</Day>
</ArticleDate>
</Article>
<MedlineJournalInfo>
<Country>Switzerland</Country>
<MedlineTA>J Vasc Res</MedlineTA>
<NlmUniqueID>9206092</NlmUniqueID>
<ISSNLinking>1018-1172</ISSNLinking>
</MedlineJournalInfo>
<CitationSubset>IM</CitationSubset>
<KeywordList Owner="NOTNLM">
<Keyword MajorTopicYN="N">Migration</Keyword>
<Keyword MajorTopicYN="N">NFAT5</Keyword>
<Keyword MajorTopicYN="N">Proliferation</Keyword>
<Keyword MajorTopicYN="N">Vascular smooth muscle cell</Keyword>
<Keyword MajorTopicYN="N">miR-96-5p</Keyword>
</KeywordList>
</MedlineCitation>
<CardData>
<History>
<CardcloneDate cloneStatus="received">
<Year>2019</Year>
<Month>09</Month>
<Day>16</Day>
</CardcloneDate>
<CardcloneDate cloneStatus="accepted">
<Year>2019</Year>
<Month>12</Month>
<Day>16</Day>
</CardcloneDate>
<CardcloneDate cloneStatus="entrez">
<Year>2020</Year>
<Month>2</Month>
<Day>12</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</CardcloneDate>
<CardcloneDate cloneStatus="Card">
<Year>2020</Year>
<Month>2</Month>
<Day>12</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</CardcloneDate>
<CardcloneDate cloneStatus="medline">
<Year>2020</Year>
<Month>2</Month>
<Day>12</Day>
<Hour>6</Hour>
<Minute>0</Minute>
</CardcloneDate>
</History>
<clonelicationStatus>aheadofprint</clonelicationStatus>
<ArticleIdList>
<ArticleId IdType="Card">32045906</ArticleId>
<ArticleId IdType="pii">000505457</ArticleId>
<ArticleId IdType="doi">10.1159/000505457</ArticleId>
</ArticleIdList>
</CardData>
</Caplock>
</CaplockSet>'''
root = ET.fromstring(xml)
data = [{'Affiliation':a.find('AffiliationInfo/Affiliation').text,'ForeName': a.find('ForeName').text,'LastName': a.find('LastName').text} for a in root.findall('.//Author')]
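One caveat with the comprehension above: `a.find(...)` returns None when an element is absent, so `.text` raises AttributeError for any Author without an AffiliationInfo or ForeName. With more than 100 records that may well happen. A minimal sketch of a more tolerant variant using `findtext` with a default follows; the trimmed `SAMPLE` string and the `extract_authors` name are made up for illustration and are not part of the original data.

```python
import xml.etree.ElementTree as ET

# Hypothetical trimmed-down sample: the second author has no AffiliationInfo.
SAMPLE = """<CaplockSet>
  <Caplock>
    <MedlineCitation>
      <Article>
        <AuthorList>
          <Author>
            <LastName>Tian</LastName>
            <ForeName>Long</ForeName>
            <AffiliationInfo>
              <Affiliation>Department of Cardiology</Affiliation>
            </AffiliationInfo>
          </Author>
          <Author>
            <LastName>Cai</LastName>
            <ForeName>Dinghua</ForeName>
          </Author>
        </AuthorList>
      </Article>
    </MedlineCitation>
  </Caplock>
</CaplockSet>"""


def extract_authors(root):
    """Return one dict per Author; missing fields become empty strings."""
    return [
        {
            "LastName": a.findtext("LastName", default=""),
            "ForeName": a.findtext("ForeName", default=""),
            "Affiliation": a.findtext("AffiliationInfo/Affiliation", default=""),
        }
        # './/Author' searches the whole tree, not just direct children.
        for a in root.findall(".//Author")
    ]


authors = extract_authors(ET.fromstring(SAMPLE))
for author in authors:
    print(author["LastName"], author["ForeName"], author["Affiliation"])
```

For the real file, `extract_authors(ET.parse('data.xml').getroot())` should behave the same way, provided the file parses.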
Upvotes: 0
Reputation: 303
This might work:
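import xml.etree.ElementTree as ET

def find_rec(node):
    # Walk every element in the tree and pick out the Author elements.
    for item in node.iter():
        if item.tag == "Author":
            # Map each descendant tag (LastName, ForeName, Affiliation, ...) to its text.
            author_values = {}
            for i in item.iter():
                author_values[i.tag] = i.text
            yield author_values

auth = find_rec(ET.parse('./data.xml').getroot())
for x in auth:
    print(x["LastName"], x["ForeName"], x["Affiliation"])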
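If `ET.parse` still raises the ParseError from the question, the file on disk is probably not well-formed at the reported position, and no change to the search logic will help until that is fixed. A rough sketch for locating the problem is below: it relies on the `position` attribute of `ParseError`, which holds the line and column reported by the parser. The `describe_parse_error` helper and the `broken` string are hypothetical, chosen only to illustrate the idea.

```python
import xml.etree.ElementTree as ET


def describe_parse_error(text):
    """Return None if text parses, else a message pointing at the failure."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        # (line, column) as reported by the parser; line counts from 1.
        line, column = err.position
        lines = text.splitlines()
        offending = lines[line - 1] if 0 < line <= len(lines) else ""
        return f"line {line}, column {column}: {offending!r}"
    return None


# Hypothetical broken input: stray prose where markup is expected.
broken = '<?xml version="1.0"?>\nI want to collect <CaplockSet>'
print(describe_parse_error(broken))

# Well-formed input yields None.
print(describe_parse_error("<CaplockSet/>"))
```

To check the actual file, read it first (for example `open('data.xml', encoding='utf-8').read()`, assuming UTF-8) and pass the text to the helper; the printed line shows what the parser was looking at when it gave up.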
Upvotes: 2