KLZY

Reputation: 83

looping through json python is very slow

Can someone help me understand what I'm doing wrong in the following code:

        def matchTrigTohost(gtriggerids,gettriggers):
            mylist = []
            for eachid in gettriggers:
                gtriggerids['params']['triggerids'] = str(eachid)
                hgetjsonObject = updateitem(gtriggerids,processor)
                hgetjsonObject = json.dumps(hgetjsonObject)
                hgetjsonObject = json.loads(hgetjsonObject)
                hgetjsonObject = eval(hgetjsonObject)
                hostid = hgetjsonObject["result"][0]["hostid"]
                hname = hgetjsonObject["result"][0]["name"]
                endval = hostid + "--" + hname
                mylist.append(endval)
            return(hgetjsonObject)

The variable gettriggers contains a lot of ids (~3500):


[ "26821", "26822", "26810", ..... ]

I'm looping through the ids in the variable and assigning them to a json object.


gtriggerids = {
        "jsonrpc": "2.0",
        "method": "host.get",
        "params": {
                "output": ["hostid", "name"],
                "triggerids": "26821"
        },
        "auth": mytoken,
        "id": 2
}

When I run the code against the above json variable, it is very slow. It is taking several minutes to check each ID. I'm sure I'm doing many things wrong here or at least not in the pythonic way. Can anyone help me speed this up? I'm very new to python.

NOTE:

The dump() , load(), eval() were used to convert the str produced to json.

Upvotes: 0

Views: 1834

Answers (1)

Chris

Reputation: 7278

You asked for help knowing what you're doing wrong. Happy to oblige :-)

  1. At the lowest level (why your function is running slowly), you're running many unnecessary operations. Specifically, you're converting data between formats (Python dictionaries and JSON strings) and back again, which accomplishes nothing except wasting CPU cycles.

You mentioned this is the only way you could get the data in the format you needed. That brings me to the second thing you're doing wrong.

  2. You're throwing code at the wall instead of understanding what's happening.

I'm quite sure (and several of your commenters appear to agree) that your code is not the only way to arrange your data into a usable structure. What you should do instead is:

  • Understand as much as you can about the data you're being given. I suspect the output of updateitem() should be your first target of learning.
  • Understand the right/typical way to interact with that data. Your data doesn't have to be a dictionary before you can use it. Maybe it's not the best approach.
  • Understand what regularities and irregularities the data may have. Part of your problem may not be with types or dictionaries, but with an unpredictable/dirty data source.
  • Armed with all this new knowledge, manipulate your data as simply as you can.

I can pretty much guarantee the result will run faster.
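
As a minimal sketch of that last step: assuming updateitem() hands back either JSON text or an already-parsed dictionary (the question doesn't show it), the loop could parse each response at most once and skip the dumps()/loads()/eval() chain. Here fake_updateitem and its sample host data are hypothetical stand-ins so the example runs on its own:

```python
import json


def fake_updateitem(request, processor=None):
    """Hypothetical stand-in for updateitem(); assumed to return JSON text."""
    trigger_id = request["params"]["triggerids"]
    reply = {
        "jsonrpc": "2.0",
        "result": [{"hostid": "10084", "name": "host-for-" + trigger_id}],
        "id": request["id"],
    }
    return json.dumps(reply)


def match_trig_to_host(request, trigger_ids, call=fake_updateitem):
    """Return one "hostid--name" string per trigger ID."""
    pairs = []
    for trigger_id in trigger_ids:
        request["params"]["triggerids"] = str(trigger_id)
        response = call(request)
        # Parse only if the response is still text, and only once.
        if isinstance(response, str):
            response = json.loads(response)
        host = response["result"][0]
        pairs.append(host["hostid"] + "--" + host["name"])
    return pairs


request = {
    "jsonrpc": "2.0",
    "method": "host.get",
    "params": {"output": ["hostid", "name"], "triggerids": ""},
    "id": 2,
}
pairs = match_trig_to_host(request, ["26821", "26822"])
print(pairs)
```

Note that this returns the accumulated list, whereas the function in the question builds mylist but returns only the last response. Also, if each updateitem() call is a network request, roughly 3500 sequential round trips will likely dominate the run time no matter how the parsing is done.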


More detail! Some things you wrote suggest misconceptions:

I'm looping through the ids in the variable and assigning them to a json object.

No, you can't assign to a JSON object. In Python, JSON data is always a string. You probably mean that you're assigning to a Python dictionary, which (sometimes!) can be converted to a JSON object, represented as a string. Make sure you have all those concepts clear before you move forward.
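
To make the distinction concrete, here is a small sketch using the request from the question (with the auth field left out, since mytoken isn't shown). It assigns to a dictionary, serializes it to a JSON string, and shows one case where a dictionary can't be converted:

```python
import json

request = {
    "jsonrpc": "2.0",
    "method": "host.get",
    "params": {"output": ["hostid", "name"], "triggerids": "26821"},
    "id": 2,
}

# This is assignment into a Python dictionary, not into JSON.
request["params"]["triggerids"] = "26822"

# Serializing produces a str; that string is the JSON.
as_json = json.dumps(request)
print(type(request).__name__, type(as_json).__name__)

# Not every dictionary can be serialized: a set has no JSON equivalent.
try:
    json.dumps({"triggerids": {"26821", "26822"}})
    serializable = True
except TypeError:
    serializable = False
print(serializable)
```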

The dump() , load(), eval() were used to convert the str produced to json.

Again, calling dumps() on a string gets you nothing useful: dumps() converts a Python object to a JSON string, and loads() converts a JSON string back to a Python object. Run this code in a REPL, go step by step, and inspect or play with each output to understand what it is.
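
For instance, here is a step-by-step sketch of what happens when the starting value is already a string. The raw text is made-up sample data, on the assumption that updateitem() returns JSON text:

```python
import json

# Hypothetical response text; the real output of updateitem() isn't shown.
raw = '{"result": [{"hostid": "10084", "name": "web01"}]}'

step1 = json.dumps(raw)    # wraps the text in quotes and escapes it: still a str
step2 = json.loads(step1)  # undoes step1 exactly: the original text again
print(step2 == raw)

# A single loads() on the original text is all that's needed.
parsed = json.loads(raw)
print(type(parsed).__name__, parsed["result"][0]["hostid"])
```

If the text turns out not to be valid JSON (for example, a Python repr with single quotes), loads() will raise an error, which is itself a useful clue about what the data source is producing.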

Upvotes: 4
