Reputation: 57
My json data looks like this:
[
{
"fields": {
"bkdate": null,
"business_credit_card_total_balances": null,
"business_credit_card_total_limits": null,
"business_total_monthly_debt_payments": null,
"business_total_mortgage_monthly_payments": null,
"created_at": "2016-08-04T00:29:03.067Z",
"detail_results": null,
"error_reason": "no reasons",
"fico_v2": "695",
"fico_v3": null,
"loanapp_id": 194,
"personal_credit_card_total_balances": null,
"personal_credit_card_total_limits": null,
"personal_total_monthly_payments": null,
"report_type": "CreditProfile",
"result": true,
"role": "applicant",
"total_mortgage_monthly_payments": null,
"username": "cho",
"version": "CF Microloan",
"xml_data": "<?xml version=\"1.0\" standalone=\"no\"?><NetConnectResponse xmlns=\"http://www.experian.com/NetConnectResponse\"><CompletionCode>0000</CompletionCode>"
},
"model": "common.prequalresult",
"pk": 1
}
]
I Want to convert it into SQL table. I tried this:
v = pd.DataFrame(data['fields'])
t = pd.io.json.json_normalize(data['fields'], ['model'], ['pk'], meta_prefix='parent_')
v.to_sql('fields', engine)
t.to_sql('fields', engine)
But it does not work. Can someone work it and output the SQL table?
My errors looks like this:
TypeError Traceback (most recent call last)
<ipython-input-86-a186308b321b> in <module>()
1
----> 2 v = pd.DataFrame(data['fields'])
3 t = pd.io.json.json_normalize(data['fields'], ['model'], ['pk'], meta_prefix='parent_')
4
5 v.to_sql('fields', engine)
TypeError: list indices must be integers or slices, not str
I want to create two tables. One include 'fields', 'model' and 'pk'.And another table include all the values in 'fields'.
The schema looks like this: enter image description here
Upvotes: 1
Views: 2278
Reputation: 637
Considering your data which you want to store in Mysql DB. You can use a One-to-One relationship for storing in DB. Instead of using pandas DataFrame it can be handled properly using SQLAlchemy
ORM. The following code gives a better understanding to solve this issue. Right now I have considered only few keys from fields
key.
from sqlalchemy import Column, Integer, String, Text, DateTime, Float, Boolean, ForeignKey
from sqlalchemy.orm import relationship, sessionmaker
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import create_engine
total_data = [
{
"fields": {
"bkdate": None,
"business_credit_card_total_balances": None,
"created_at": "2016-08-04T00:29:03.067Z",
},
"model": "common.prequalresult",
"pk": 1
},
{
"fields": {
"bkdate": "2016-08-04T00:29:03.067Z",
"business_credit_card_total_balances": 23,
"created_at": "2016-08-04T00:29:03.067Z",
},
"model": "common.prequalresult",
"pk": 2
},
{
"fields": {
"bkdate": "asdfas",
"business_credit_card_total_balances": 1111,
"created_at": "2016-08-04T00:29:03.067Z",
},
"model": "common.prequalresult",
"pk": 3
}
]
engine = create_engine('mysql://user:password@localhost:5432/my_data', echo=False)
Base = declarative_base()
class Article(Base):
__tablename__ = 'article'
id = Column(Integer, primary_key=True, autoincrement=True)
pk = Column(Integer, primary_key=False)
model = Column(String(100), nullable=True)
child = relationship('Comment', backref='article', uselist=False)
class Comment(Base):
__tablename__ = 'comment'
id = Column(Integer, primary_key=True, autoincrement=True)
bkdate = Column(String(100), nullable=True)
business_credit_card_total_balances = Column(Integer, nullable=True)
created_at = Column(String(100), nullable=True)
article_id = Column(Integer, ForeignKey('article.id'))
x = Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
session = Session()
for temp_data in total_data:
pk = temp_data['pk']
model = temp_data['model']
bkdate = temp_data['fields']['bkdate']
business_credit_card_total_balances = temp_data['fields']['business_credit_card_total_balances']
created_at = temp_data['fields']['created_at']
parent1 = Article(pk=temp_data['pk'], model=model)
child = Comment(bkdate=bkdate,
business_credit_card_total_balances=business_credit_card_total_balances,
created_at=created_at,
article=parent1)
session.add(parent1)
session.add(child)
session.commit()
As field pk given in each dict can be duplicated, so I have created id as a primary key.
Output:
mysql> select * from article;
+----+------+----------------------+
| id | pk | model |
+----+------+----------------------+
| 1 | 1 | common.prequalresult |
| 2 | 2 | common.prequalresult |
| 3 | 3 | common.prequalresult |
+----+------+----------------------+
mysql> select * from comment;
+----+--------------------------+-------------------------------------+--------------------------+------------+
| id | bkdate | business_credit_card_total_balances | created_at | article_id |
+----+--------------------------+-------------------------------------+--------------------------+------------+
| 1 | NULL | NULL | 2016-08-04T00:29:03.067Z | 1 |
| 2 | 2016-08-04T00:29:03.067Z | 23 | 2016-08-04T00:29:03.067Z | 2 |
| 3 | asdfas | 1111 | 2016-08-04T00:29:03.067Z | 3 |
+----+--------------------------+-------------------------------------+--------------------------+------------+
Upvotes: 1