Dave

Reputation: 7400

Converting a character or integer to an MD5 hash using a Python script

I used SQL to convert a social security number to MD5 hash. I am wondering if there is a module or function in python/pandas that can do the same thing.

My SQL script is:

CREATE OR REPLACE FUNCTION MD5HASH(STR IN VARCHAR2) RETURN VARCHAR2 IS
  V_CHECKSUM VARCHAR2(32);

BEGIN
  V_CHECKSUM := LOWER(RAWTOHEX(UTL_RAW.CAST_TO_RAW(SYS.DBMS_OBFUSCATION_TOOLKIT.MD5(INPUT_STRING => STR))));
  RETURN V_CHECKSUM;
EXCEPTION
  WHEN NO_DATA_FOUND THEN
    NULL;
  WHEN OTHERS THEN
    RAISE;
END MD5HASH;

SELECT HRPRO.MD5HASH('555555555') FROM DUAL

thanks.

I apologize, now that I read back over my initial question it is quite confusing.

I have a data frame that contains the following headings:

df[['ssno','regions','occ_ser','ethnicity','veteran','age','age_category']][:10]

Here ssno is personal information that I would like to convert to an MD5 hash and then store in a new column of the data frame.

thanks... sorry for the confusion.

Right now I have to send my file to Oracle and then convert the ssn to hash and then export back out so that I can continue working with it in Pandas. I want to eliminate this step.

Upvotes: 1

Views: 1081

Answers (2)

user123

Reputation: 5407

hashlib with md5 might be of interest to you.

import hashlib
hashlib.md5(b"Nobody inspects the spammish repetition").hexdigest()

output:

bb649c83dd1ea5c9d9dec9a18df0ffe9

Constructors for hash algorithms that are always present in this module are md5(), sha1(), sha224(), sha256(), sha384(), and sha512().

If you want a stronger hash, you may try the SHA series. Note that these produce longer digests than MD5, not shorter ones.

output for sha224:

'a4337bc45a8fc544c03f52dc550cd6e1e87021bc896588bd79e901e2'

For more details, see the hashlib documentation.
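To compare digest sizes across the constructors listed above, here is a minimal sketch. It uses `hashlib.new()` to look each algorithm up by name; the variable names `text` and `digest_lengths` are just for illustration.

```python
import hashlib

text = b"Nobody inspects the spammish repetition"

# Hex digest length per algorithm: two hex characters per digest byte.
digest_lengths = {
    name: len(hashlib.new(name, text).hexdigest())
    for name in ("md5", "sha1", "sha224", "sha256", "sha384", "sha512")
}

for name, length in digest_lengths.items():
    print(name, length)
```

MD5 gives 32 hex characters, while every SHA variant listed gives more, so MD5 is the one to pick if the result has to fit a VARCHAR2(32) column like the one in the question.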

Upvotes: 1

PM 2Ring

Reputation: 55469

Using the standard hashlib module:

import hashlib

# hashlib works on bytes, so pass a bytes literal (valid in Python 2.6+ and 3)
md5_hash = hashlib.md5()
md5_hash.update(b'555555555')
print(md5_hash.hexdigest())

output

3665a76e271ada5a75368b99f774e404

As mentioned in timkofu's comment, you can also do this more simply, using

print(hashlib.md5(b'555555555').hexdigest())
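Since the question mentions both characters and integers, the one-liner can be wrapped in a small helper. This is only a sketch: `md5_hex` is a hypothetical name, and it assumes the values should be hashed as their UTF-8 text form.

```python
import hashlib

def md5_hex(value):
    """Return the lowercase MD5 hex digest of a string or integer."""
    # str() handles integers; encode to bytes because hashlib hashes bytes.
    data = str(value).encode("utf-8")
    return hashlib.md5(data).hexdigest()

# The integer and string forms of the same digits hash identically.
assert md5_hex(555555555) == md5_hex("555555555")
print(md5_hex("555555555"))
```

With the data frame from the question, something like df['ssno_md5'] = df['ssno'].apply(md5_hex) should add the hashed column, where ssno_md5 is a made-up column name. If ssno is stored as an integer, any leading zeros are already gone before hashing, so the result may differ from hashing the original text in Oracle.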

The .update() method is useful when you want to generate a checksum in stages. Please see the hashlib documentation (or the Python 3 version) for further details.
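As a small illustration of generating a checksum in stages, the sketch below feeds the same digits in three pieces and compares the result with hashing them in one call. The chunk boundaries are arbitrary and chosen only for the example.

```python
import hashlib

# Feed the data in pieces; the digest covers the concatenation of all pieces.
staged = hashlib.md5()
for chunk in (b"555", b"555", b"555"):
    staged.update(chunk)

# Hash the same nine digits in a single call.
one_shot = hashlib.md5(b"555555555")

print(staged.hexdigest() == one_shot.hexdigest())  # True
```

This is mainly useful for large inputs, such as a file read block by block, where holding everything in memory at once is not practical.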

Upvotes: 2
