Reputation:
I need to suck data from stdin and create a object.
The incoming data is between 5 and 10 lines long. Each line has a process number and either an IP address or a hash. For example:
pid=123 ip=192.168.0.1 - some data
pid=123 hash=ABCDEF0123 - more data
hash=ABCDEF123 - More data
ip=192.168.0.1 - even more data
I need to put this data into a class like:
class MyData():
pid = None
hash = None
ip = None
lines = []
I need to be able to look up the object by IP, HASH, or PID.
The tough part is that there are multiple streams of data intermixed coming from stdin. (There could be hundreds or thousands of processes writing data at the same time.)
I have regular expressions pulling out the PID, IP, and HASH that I need, but how can I access the object by any of those values?
My thought was to do something like this:
myarray = {}
for each line in sys.stdin.readlines():
if pid and ip: #If we can get a PID out of the line
myarray[pid] = MyData().pid = pid #Create a new MyData object, assign the PID, and stick it in myarray accessible by PID.
myarray[pid].ip = ip #Add the IP address to the new object
myarray[pid].lines.append(data) #Append the data
myarray[ip] = myarray[pid] #Take the object by PID and create a key from the IP.
<snip>do something similar for pid and hash, hash and ip, etc...</snip>
This gives my an array with two keys (a PID and an IP) and they both point to the same object. But on the next iteration of the loop, if I find (for example) an IP and HASH and do:
myarray[hash] = myarray[ip]
The following is False:
myarray[hash] == myarray[ip]
Hopefully that was clear. I hate to admit that waaay back in the VB days, I remember being able handle objects byref instead of byval. Is there something similar in Python? Or am I just approaching this wrong?
Upvotes: 1
Views: 1413
Reputation: 881635
Make two separate dicts (and don't call them arrays!), byip
and byhash
-- why do you need to smoosh everything together and risk conflicts?!
BTW, you't possibly can have the following two lines back to back:
myarray[hash] = myarray[ip]
assert not(myarray[hash] == myarray[ip])
To make the assert
hold you must be doing something else in between (perturbing the misnamed myarray
).
BTW squared, assignments in Python are always by reference to an object -- if you want a copy you must explicitly ask for one.
Upvotes: 1
Reputation: 798626
Python only has references.
Create the object once, and add it to all relevant keys at once.
class MyData(object):
def __init__(self, pid, ip, hash):
self.pid = pid
...
for line in sys.stdin:
pid, ip, hash = process(line)
obj = MyData(pid=pid, ip=ip, hash=hash)
if pid:
mydict[pid] = obj
if ip:
mydict[ip] = obj
if hash:
mydict[hash] = obj
Upvotes: 2