spt hsb

Reputation: 95

Can't convert JSON from Kafka to a pandas DataFrame

I have the following code to consume data from Kafka:

 from kafka import KafkaConsumer

 bootstrap_servers = ['localhost:9092']
 topicName = 'testapp5'
 consumer = KafkaConsumer(topicName, group_id='group1', bootstrap_servers=bootstrap_servers)
 for msg in consumer:
     print("Topic Name=%s,Message=%s" % (msg.topic, msg.value))

Then I parse each message with:

  message = json.loads(msg.value)    

The parsed message looks like this:

 {'request_id': 'f84c55fd-c730-49ba-83b2-47b04643b706',
'data': {'age': 24,
'workclass': 'Self-emp-not-inc',
'fnlwgt': 188274,
'education': 'Bachelors',
'marital_status': 'Never-married',
'occupation': 'Sales',
'relationship': 'Not-in-family',
'race': 'White',
'gender': 'Male',
'capital_gain': 0,
'capital_loss': 0,
'hours_per_week': 50,
'native_country': 'United-States',
'income_bracket': '<=50K.'}} 

Then I try to convert the data to a pandas DataFrame with:

  row = pd.DataFrame(message, index=[0])

The output:

[screenshot of the output, not preserved]

What should I do so that the JSON from Kafka can be loaded into a pandas DataFrame? Thanks in advance.

Upvotes: 0

Views: 608

Answers (1)

Rob Raymond

Reputation: 31166

The simplest approach is to use pd.json_normalize, which flattens the nested dict into columns. If you only want the contents of data, you can pass js["data"] to pd.DataFrame.

js = {'request_id': 'f84c55fd-c730-49ba-83b2-47b04643b706',
'data': {'age': 24,
'workclass': 'Self-emp-not-inc',
'fnlwgt': 188274,
'education': 'Bachelors',
'marital_status': 'Never-married',
'occupation': 'Sales',
'relationship': 'Not-in-family',
'race': 'White',
'gender': 'Male',
'capital_gain': 0,
'capital_loss': 0,
'hours_per_week': 50,
'native_country': 'United-States',
'income_bracket': '<=50K.'}} 
# simplest: nested keys become columns such as data.age
pd.json_normalize(js)
# if request_id is not needed
pd.DataFrame(js["data"], index=[0])
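
To tie this back to the consumer loop in the question, here is a minimal sketch of collecting several messages into one DataFrame. The helper name messages_to_frame and the sample payloads are made up for illustration; the byte strings stand in for msg.value, so the snippet runs without a broker.

```python
import json

import pandas as pd


def messages_to_frame(raw_values):
    """Decode JSON payloads (bytes or str) and flatten them into one DataFrame."""
    # json.loads accepts bytes directly, which is what msg.value holds
    # when no value_deserializer is configured on the consumer.
    records = [json.loads(value) for value in raw_values]
    # json_normalize flattens nested dicts: {"data": {"age": 24}} -> column "data.age"
    return pd.json_normalize(records)


# Hypothetical stand-ins for msg.value from the consumer loop.
sample = [
    b'{"request_id": "a1", "data": {"age": 24, "occupation": "Sales"}}',
    b'{"request_id": "b2", "data": {"age": 31, "occupation": "Tech-support"}}',
]

df = messages_to_frame(sample)
print(df)
```

With kafka-python, the same helper could probably be fed with messages_to_frame(msg.value for msg in consumer), assuming the consumer was created with consumer_timeout_ms set so that iteration eventually stops; otherwise the loop blocks waiting for new messages and the DataFrame is never returned.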

Upvotes: 1
