Reputation: 1628
So i have the following sample json, contained into a series of mine
s = pd.Series(['{"city":"Uberlândia","bot-origin":null,"campaign-source":"carrinho-abandonado-ecommerce-sms","lastState":"productAvailabilityCepInDatabaseEqualTrue","main-installation-date":null,"userid":"[email protected]","full-name":null,"alternative-installation-date":null,"chosen-product":"Internet","bank":null,"postalcode":"38405328","due-date":null,"cpf":"01548226041","origin-link":"","payment":null,"state":"MG","api-orders-hash-id":null,"email":null,"api-orders-error":null,"plan-name":null,"userphone":"34 9342-8011","plan-offer":null,"completed-address":"38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG","type-of-person":"CPF","onboarding-simplified":null,"type-of-product":"Residencial","main-installation-period-day":null,"plan-value":null,"alternative-installation-period-day":null,"data-change":"false"}',
'{"city":"Uberlândia","bot-origin":null,"campaign-source":"carrinho-abandonado-ecommerce-sms","lastState":"productAvailabilityCepInDatabaseEqualTrue","main-installation-date":null,"userid":"[email protected]","full-name":null,"alternative-installation-date":null,"chosen-product":"Internet","bank":null,"postalcode":"38405328","due-date":null,"cpf":"01548226041","origin-link":"","payment":null,"state":"MG","api-orders-hash-id":null,"email":null,"api-orders-error":null,"plan-name":null,"userphone":"34 9342-8011","plan-offer":null,"completed-address":"38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG","type-of-person":"CPF","onboarding-simplified":null,"type-of-product":"Residencial","main-installation-period-day":null,"plan-value":null,"alternative-installation-period-day":null,"data-change":"false"}',
'{"city":"Uberlândia","bot-origin":null,"campaign-source":"carrinho-abandonado-ecommerce-sms","lastState":"productAvailabilityAddressConfirmation","main-installation-date":null,"userid":"[email protected]","full-name":null,"alternative-installation-date":null,"chosen-product":"Internet","bank":null,"postalcode":"38405328","due-date":null,"cpf":"01548226041","origin-link":"","payment":null,"state":"MG","api-orders-hash-id":null,"email":null,"api-orders-error":null,"plan-name":null,"userphone":"34 9342-8011","plan-offer":null,"completed-address":"38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG","type-of-person":"CPF","onboarding-simplified":null,"type-of-product":"Residencial","main-installation-period-day":null,"plan-value":null,"alternative-installation-period-day":null,"data-change":"false"}'])
I dont actually know a way of exploding this type of json, into columns, would appreciate some help? I tried json_normalize loads and so on but i get empty results.
Wanted result would be something like this:
df = pd.DataFrame({'city':['Uberlandia','Uberlandia','Uberlandia'],'bot-origin':[null,null,null]}) # There are more columns but you get the jist.
Since there were a lot of answers would appreciate if someone showed me the most time efficient way, have lots of rows.
Upvotes: 0
Views: 28
Reputation: 5433
The following should work
import json
s = pd.Series(['{"city":"Uberlândia","bot-origin":null,"campaign-source":"carrinho-abandonado-ecommerce-sms","lastState":"productAvailabilityCepInDatabaseEqualTrue","main-installation-date":null,"userid":"[email protected]","full-name":null,"alternative-installation-date":null,"chosen-product":"Internet","bank":null,"postalcode":"38405328","due-date":null,"cpf":"01548226041","origin-link":"","payment":null,"state":"MG","api-orders-hash-id":null,"email":null,"api-orders-error":null,"plan-name":null,"userphone":"34 9342-8011","plan-offer":null,"completed-address":"38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG","type-of-person":"CPF","onboarding-simplified":null,"type-of-product":"Residencial","main-installation-period-day":null,"plan-value":null,"alternative-installation-period-day":null,"data-change":"false"}',
'{"city":"Uberlândia","bot-origin":null,"campaign-source":"carrinho-abandonado-ecommerce-sms","lastState":"productAvailabilityCepInDatabaseEqualTrue","main-installation-date":null,"userid":"[email protected]","full-name":null,"alternative-installation-date":null,"chosen-product":"Internet","bank":null,"postalcode":"38405328","due-date":null,"cpf":"01548226041","origin-link":"","payment":null,"state":"MG","api-orders-hash-id":null,"email":null,"api-orders-error":null,"plan-name":null,"userphone":"34 9342-8011","plan-offer":null,"completed-address":"38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG","type-of-person":"CPF","onboarding-simplified":null,"type-of-product":"Residencial","main-installation-period-day":null,"plan-value":null,"alternative-installation-period-day":null,"data-change":"false"}',
'{"city":"Uberlândia","bot-origin":null,"campaign-source":"carrinho-abandonado-ecommerce-sms","lastState":"productAvailabilityAddressConfirmation","main-installation-date":null,"userid":"[email protected]","full-name":null,"alternative-installation-date":null,"chosen-product":"Internet","bank":null,"postalcode":"38405328","due-date":null,"cpf":"01548226041","origin-link":"","payment":null,"state":"MG","api-orders-hash-id":null,"email":null,"api-orders-error":null,"plan-name":null,"userphone":"34 9342-8011","plan-offer":null,"completed-address":"38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG","type-of-person":"CPF","onboarding-simplified":null,"type-of-product":"Residencial","main-installation-period-day":null,"plan-value":null,"alternative-installation-period-day":null,"data-change":"false"}'])
df = s.apply(lambda row: pd.Series(json.loads(row)))
Output:
city | bot-origin | campaign-source | lastState | main-installation-date | userid | full-name | alternative-installation-date | chosen-product | bank | postalcode | due-date | cpf | origin-link | payment | state | api-orders-hash-id | api-orders-error | plan-name | userphone | plan-offer | completed-address | type-of-person | onboarding-simplified | type-of-product | main-installation-period-day | plan-value | alternative-installation-period-day | data-change | ||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Uberlândia | carrinho-abandonado-ecommerce-sms | productAvailabilityCepInDatabaseEqualTrue | [email protected] | Internet | 38405328 | 01548226041 | MG | 34 9342-8011 | 38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG | CPF | Residencial | false | |||||||||||||||||
1 | Uberlândia | carrinho-abandonado-ecommerce-sms | productAvailabilityCepInDatabaseEqualTrue | [email protected] | Internet | 38405328 | 01548226041 | MG | 34 9342-8011 | 38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG | CPF | Residencial | false | |||||||||||||||||
2 | Uberlândia | carrinho-abandonado-ecommerce-sms | productAvailabilityAddressConfirmation | [email protected] | Internet | 38405328 | 01548226041 | MG | 34 9342-8011 | 38405328 - R IGUACU, 1289 - UMUARAMA - null - Uberlândia - MG | CPF | Residencial | false |
The null
values are converted to None
. If you want to convert them to NaN you can use .fillna(np.nan)
:
import json
import numpy as np
df = (
s.apply(lambda row: pd.Series(json.loads(row)))
.fillna(np.nan)
)
Upvotes: 0
Reputation: 23099
Looks like you have string
formatted json objects, first we need to convert them then pass them into a json_normalize
method.
import json
df1 = pd.json_normalize(s.map(json.loads))
city bot-origin campaign-source lastState ... main-installation-period-day plan-value alternative-installation-period-day data-change
0 Uberlândia None carrinho-abandonado-ecommerce-sms productAvailabilityCepInDatabaseEqualTrue ... None None None false
1 Uberlândia None carrinho-abandonado-ecommerce-sms productAvailabilityCepInDatabaseEqualTrue ... None None None false
2 Uberlândia None carrinho-abandonado-ecommerce-sms productAvailabilityAddressConfirmation ... None None None false
Upvotes: 0
Reputation: 7923
Try this:
df = pd.DataFrame.from_records(s.map(json.loads))
print(df)
only printed a part of the whole df for presentation.
city bot-origin campaign-source lastState main-installation-date
0 Uberlândia None carrinho-abandonado-ecommerce-sms productAvailabilityCepInDatabaseEqualTrue None
1 Uberlândia None carrinho-abandonado-ecommerce-sms productAvailabilityCepInDatabaseEqualTrue None
2 Uberlândia None carrinho-abandonado-ecommerce-sms productAvailabilityAddressConfirmation None
Upvotes: 1