Reputation: 4582
df.apply
is a method that can apply a certain function to all the columns in a dataframe, or the required columns. However, my aim is to compute the hash of a string: this string is the concatenation of all the values in a row corresponding to all the columns. My current code is returning NaN
.
The current code is:
df["row_hash"] = df["row_hash"].apply(self.hash_string)
The function self.hash_string
is:
def hash_string(self, value):
return (sha1(str(value).encode('utf-8')).hexdigest())
Yes, it would be easier to merge all columns of Pandas dataframe but current answer couldn't help me either.
The file that I am reading is(the first 10 rows):
16012,16013,16014,16015,16016,16017,16018,16019,16020,16021,16022
16013,16014,16015,16016,16017,16018,16019,16020,16021,16022,16023
16014,16015,16016,16017,16018,16019,16020,16021,16022,16023,16024
16015,16016,16017,16018,16019,16020,16021,16022,16023,16024,16025
16016,16017,16018,16019,16020,16021,16022,16023,16024,16025,16026
The col names are: col_test_1, col_test_2, .... , col_test_11
Upvotes: 2
Views: 7524
Reputation: 4343
You can use apply
twice, first on the row elements then on the result:
df.apply(lambda x: ''.join(x.astype(str)),axis=1).apply(self.hash_string)
Sidenote: I don't understand why you are defining hash_string as an instance method (instead of a plain function), since it doesn't use the self
argument. In case you have problems can just pass it as function:
df.apply(lambda x: ''.join(x.astype(str)),axis=1).apply(lambda value: sha1(str(value).encode('utf-8')).hexdigest())
Upvotes: 3
Reputation: 1266
You can create a new column, which is concatenation of all others:
df['new'] = df.astype(str).values.sum(axis=1)
And then apply your hash function on it
df["row_hash"] = df["new"].apply(self.hash_string)
or this one-row should work:
df["row_hash"] = df.astype(str).values.sum(axis=1).apply(hash_string)
However, not sure if you need a separate function here, so:
df["row_hash"] = df.astype(str).values.sum(axis=1).apply(lambda x: sha1(str(x).encode('utf-8')).hexdigest())
Upvotes: 4