Reputation: 489
I have a number of variables in Python that I want to use to generate a unique ID, such that the same combination of values always produces the same ID.
I have used .encode('hex','strict') to produce an ID, which seems to work; however, the output value is very long. Is there a way to produce a shorter ID from these variables?
myname = 'Midavalo'
mydate = '5 July 2017'
mytime = '8:19am'
codec = 'hex'
print "{}{}{}".format(myname, mydate, mytime).encode(codec,'strict')
This outputs
4d69646176616c6f35204a756c792032303137383a3139616d
I realise that with hex the output length probably depends on the length of the three variables, so I'm wondering if there is another codec that will produce shorter values without excluding any of the variables?
So far I have tested base64, bz2, hex, quopri, uu, and zip from 7.8.4. Python Specific Encodings, but I'm unsure how to get any of these to produce shorter values without removing variables.
Is there another codec I could use, or a way to shorten the values from any of them without removing the uniqueness, or even a completely different way to produce what I require?
All I am trying to do is produce an ID so I can identify those rows when loading them into a database. If the same value already exists, no new row will be created in the database. There is no security requirement, just a unique ID. The values are generated elsewhere and passed into Python, so I can't just use a database-issued ID for them.
Upvotes: 4
Views: 8324
Reputation: 1
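If you don't specifically need a hash and just want a short value based on two or more strings, the function below concatenates the first character of every word in each string. Note that different strings whose words share the same initials will produce the same value.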
# Prints HKRC1LB for string_1 and string_2
# Concatenate the first character of every word in all strings to get an ID
def get_uniq_val(*args):
    result = ""
    for text in args:
        for word in text.split():
            result += word[0]
    return result

def main():
    string_1 = "Howard Kid Recreation Centre"
    string_2 = "150 Lantern Blvd"
    uid = get_uniq_val(string_1, string_2)
    print(uid)

if __name__ == "__main__":
    main()
Upvotes: 0
Reputation: 852
You could use some hashing algorithm from the hashlib package: https://docs.python.org/3/library/hashlib.html or for python 2: https://docs.python.org/2.7/library/hashlib.html
import hashlib
s = "some string"
# hashlib works on bytes, so the string must be encoded first
digest = hashlib.sha1(s.encode("utf-8")).hexdigest()
This hash will always be the same for the same string. Your choice of algorithm depends on the number of characters you want and the acceptable risk of collision (two different strings yielding the same hash).
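As a rough sketch of how this could apply to the three variables in the question, the example below joins the fields with a separator, hashes the result, and keeps only the first few hex characters. It assumes Python 3. The function name make_row_id, the length parameter, and the choice of the "\x1f" separator are illustrative assumptions, not part of the original answer.

```python
import hashlib

def make_row_id(*fields, length=12):
    """Return a short, deterministic hex ID for the given field values.

    Hypothetical helper: the name and the default length are assumptions.
    """
    # Join with a separator unlikely to appear in the data, so that
    # ("ab", "c") and ("a", "bc") are hashed as different inputs.
    joined = "\x1f".join(str(f) for f in fields)
    digest = hashlib.sha1(joined.encode("utf-8")).hexdigest()
    # Truncating shortens the ID but raises the risk of collision.
    return digest[:length]

row_id = make_row_id("Midavalo", "5 July 2017", "8:19am")
print(row_id)
```

A full SHA-1 hex digest is 40 characters regardless of input length. Keeping fewer characters makes the ID shorter at the cost of a higher chance that two different rows share an ID, so the length is worth choosing with the expected number of rows in mind.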
Upvotes: 10