Reputation: 597
I have an XML file like this i need to insert this data to PostgreSQL DB.Below is the sample XML and the code which i use ,but i'm not getting any output,Can someone please guide on how to effectively fetch these XML values.
<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:g="http://base.google.com/ns/1.0" version="2.0" encoding="utf-8">
<config>
<g:system>Magento</g:system>
<g:extension>Magmodules_Googleshopping</g:extension>
<g:extension_version>1.6.8</g:extension_version>
<g:store>emb</g:store>
<g:url>https://www.xxxxx.com/</g:url>
<g:products>1320</g:products>
<g:generated>2020-06-11 11:18:32</g:generated>
<g:processing_time>17.5007</g:processing_time>
</config>
<channel>
<item>
<g:id>20</g:id>
<g:title>product 1</g:title>
<g:description>description about product 1</g:description>
<g:gtin>42662</g:gtin>
<g:brand>company</g:brand>
<g:mpn>0014</g:mpn>
<g:link>link.html</g:link>
<g:image_link>link/c/a/cat_21_16.jpg</g:image_link>
<g:availability>in stock</g:availability>
<g:condition>new</g:condition>
<g:price>9</g:price>
<g:shipping>
<g:country>UAE</g:country>
<g:service>DHL</g:service>
<g:price>2.90</g:price>
</g:shipping>
</item>
<item>
.
.
.
</item>
Below is the script which i use, Python : 3.5 Postgres version 11
# import modules
import sys
import psycopg2
import datetime
now = datetime.datetime.now()
# current data and time
dt = now.strftime("%Y%m%dT%H%M%S")
# xml tree access
#from xml.etree import ElementTree
import xml.etree.ElementTree as ET
# incremental variable
x = 0
with open('/Users/admin/documents/shopping.xml', 'rt',encoding="utf8") as f:
#tree = ElementTree.parse(f)
tree = ET.parse(f)
# connection to postgreSQL database
try:
conn=psycopg2.connect(host='localhost', database='postgres',
user='postgres', password='postgres',port='5432')
except:
print ("Hey I am unable to connect to the database.")
cur = conn.cursor()
# access the xml tree element nodes
try:
for node in tree.findall('.//item'):
src = node.find('id')
tgt = node.find('mpn')
print(node)
except:
print ("Oops I can't insert record into database table!")
conn.commit()
conn.close()
The current output i'm getting is like,
None
None
None
Expected Output,
id title description gtin ......
20 product 1 g:description xxxx .....
Upvotes: 0
Views: 404
Reputation: 142641
Strange is that you can't find item
. It seems you use wrong file and it doesn't have item
.
Using your XML
data as string and ET.fromstring()
I have no problem to get item
.
Maybe check print( f.read() )
to see what you really read from file.
Problem is only id
, tgt
which use namespace
- g:
- and it need something more then only g:id
, g:tgt
tree = ET.fromstring(xml)
ns = {'g': "http://base.google.com/ns/1.0"}
for node in tree.findall('.//item'):
src = node.find('g:id', ns)
tgt = node.find('g:mpn', ns)
print('Node:', node)
print('src:', src.text)
print('tgt:', tgt.text)
or use directly as '{http://base.google.com/ns/1.0}id'
'{http://base.google.com/ns/1.0}mpn'
tree = ET.fromstring(xml)
for node in tree.findall('.//item'):
src = node.find('{http://base.google.com/ns/1.0}id')
tgt = node.find('{http://base.google.com/ns/1.0}mpn')
print('Node:', node)
print('src:', src.text)
print('tgt:', tgt.text)
Minimal working code:
import xml.etree.ElementTree as ET
xml = '''<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:g="http://base.google.com/ns/1.0" version="2.0" encoding="utf-8">
<config>
<g:system>Magento</g:system>
<g:extension>Magmodules_Googleshopping</g:extension>
<g:extension_version>1.6.8</g:extension_version>
<g:store>emb</g:store>
<g:url>https://www.xxxxx.com/</g:url>
<g:products>1320</g:products>
<g:generated>2020-06-11 11:18:32</g:generated>
<g:processing_time>17.5007</g:processing_time>
</config>
<channel>
<item>
<g:id>20</g:id>
<g:title>product 1</g:title>
<g:description>description about product 1</g:description>
<g:gtin>42662</g:gtin>
<g:brand>company</g:brand>
<g:mpn>0014</g:mpn>
<g:link>link.html</g:link>
<g:image_link>link/c/a/cat_21_16.jpg</g:image_link>
<g:availability>in stock</g:availability>
<g:condition>new</g:condition>
<g:price>9</g:price>
<g:shipping>
<g:country>UAE</g:country>
<g:service>DHL</g:service>
<g:price>2.90</g:price>
</g:shipping>
</item>
</channel>
</rss>
'''
tree = ET.fromstring(xml)
ns = {'g': "http://base.google.com/ns/1.0"}
for node in tree.findall('.//item'):
src = node.find('g:id', ns)
tgt = node.find('g:mpn', ns)
print('Node:', node)
print('src:', src.text)
print('tgt:', tgt.text)
Result:
Node: <Element 'item' at 0x7f74ba45b710>
src: 20
tgt: 0014
BTW: It works even when I use io.StringIO
to simulate file
f = io.StringIO(xml)
tree = ET.parse(f)
Minimal working code:
import xml.etree.ElementTree as ET
import io
xml = '''<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:g="http://base.google.com/ns/1.0" version="2.0" encoding="utf-8">
<config>
<g:system>Magento</g:system>
<g:extension>Magmodules_Googleshopping</g:extension>
<g:extension_version>1.6.8</g:extension_version>
<g:store>emb</g:store>
<g:url>https://www.xxxxx.com/</g:url>
<g:products>1320</g:products>
<g:generated>2020-06-11 11:18:32</g:generated>
<g:processing_time>17.5007</g:processing_time>
</config>
<channel>
<item>
<g:id>20</g:id>
<g:title>product 1</g:title>
<g:description>description about product 1</g:description>
<g:gtin>42662</g:gtin>
<g:brand>company</g:brand>
<g:mpn>0014</g:mpn>
<g:link>link.html</g:link>
<g:image_link>link/c/a/cat_21_16.jpg</g:image_link>
<g:availability>in stock</g:availability>
<g:condition>new</g:condition>
<g:price>9</g:price>
<g:shipping>
<g:country>UAE</g:country>
<g:service>DHL</g:service>
<g:price>2.90</g:price>
</g:shipping>
</item>
</channel>
</rss>
'''
f = io.StringIO(xml)
tree = ET.parse(f)
ns = {'g': "http://base.google.com/ns/1.0"}
for node in tree.findall('.//item'):
src = node.find('{http://base.google.com/ns/1.0}id')
tgt = node.find('{http://base.google.com/ns/1.0}mpn')
print('Node:', node)
print('src:', src.text)
print('mpn:', tgt.text)
Upvotes: 1