aabujamra

Reputation: 4636

Python // Pandas - Get json from API and turn into dataframe

I'm using this API to get company data: https://github.com/vkruoso/receita-tools

Here you can see how a record comes back (it looks like a JSON structure): https://www.receitaws.com.br/v1/cnpj/27865757000102

I'm able to download it using the following:

cadastro = os.system("curl -X GET https://www.receitaws.com.br/v1/cnpj/27865757000102")

If I run type(cadastro), it shows class 'int'. I want to turn the response into a dataframe. How can I do that?

Upvotes: 2

Views: 3311

Answers (1)

kaidokuuppa

Reputation: 642

os.system returns the command's exit code, not its output. Use subprocess instead; see "Assign output of os.system to a variable and prevent it from being displayed on the screen".
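
To make the difference concrete, here is a minimal sketch that needs no network access. The child process and its JSON payload are hypothetical stand-ins for the curl call, and the field names are made up for illustration.

```python
import json
import os
import subprocess
import sys

# os.system runs the command, lets it print to the terminal,
# and hands back only an integer exit status.
status = os.system("echo hello")
print(type(status), status)

# subprocess.run with stdout=PIPE captures what the command printed.
# The child below is a hypothetical stand-in for curl: it just prints JSON.
child_code = "import json; print(json.dumps({'nome': 'EXEMPLO', 'uf': 'SP'}))"
proc = subprocess.run([sys.executable, "-c", child_code], stdout=subprocess.PIPE)

# stdout is bytes here, so decode before parsing.
record = json.loads(proc.stdout.decode("utf-8"))
print(record)
```

The captured text is what you would pass on to pandas; the integer from os.system carries none of the data.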

If you are using Python 3.5+, you should use subprocess.run() (note that the encoding argument used here requires Python 3.6+):

import subprocess
import json
import pandas as pd

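# Capture curl's output as text instead of letting it print to the terminal
proc = subprocess.run(
    ["curl", "-X", "GET",
     "https://www.receitaws.com.br/v1/cnpj/27865757000102"],
    stdout=subprocess.PIPE, encoding='utf-8')

cadastro = proc.stdout
# Wrap the parsed dict in a list so it becomes a single row
df = pd.DataFrame([json.loads(cadastro)])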

Otherwise, on older versions, use subprocess.Popen():

import subprocess
import json
import pandas as pd

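proc = subprocess.Popen(
    ["curl", "-X", "GET",
     "https://www.receitaws.com.br/v1/cnpj/27865757000102"],
    stdout=subprocess.PIPE)

# communicate() returns (stdout, stderr); stdout is bytes here
cadastro, err = proc.communicate()
# Decode explicitly: json.loads only accepts bytes from Python 3.6 on
df = pd.DataFrame([json.loads(cadastro.decode('utf-8'))])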

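Or you can use the Requests library, which avoids shelling out to curl:

import requests
import pandas as pd

response = requests.get("https://www.receitaws.com.br/v1/cnpj/27865757000102")
# Fail early on an HTTP error instead of parsing an error page
response.raise_for_status()
# response.json() decodes and parses the body; response.encoding can be None,
# which would make response.content.decode(response.encoding) fail
data = response.json()
df = pd.DataFrame([data])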
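
If the parsed JSON contains nested lists, wrapping the dict in a list keeps each of them in a single cell. As a rough sketch, pd.json_normalize (available at the top level since pandas 1.0) can expand one nested list into rows. The payload below is hypothetical: its field names are assumptions for illustration, not the API's actual schema.

```python
import pandas as pd

# Hypothetical payload; the keys are made up for illustration
# and are not taken from the real API response.
data = {
    "cnpj": "00000000000000",
    "nome": "EXEMPLO LTDA",
    "socios": [
        {"nome": "Ana", "qual": "Administrador"},
        {"nome": "Bruno", "qual": "Socio"},
    ],
}

# One row, one column per top-level key; the nested list stays in one cell.
df = pd.DataFrame([data])
print(df.shape)

# One row per nested item, carrying the top-level "cnpj" along as a column.
socios = pd.json_normalize(data, record_path="socios", meta=["cnpj"])
print(socios)
```

Which form is preferable depends on whether you want one row per company or one row per nested item.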

Upvotes: 5
