David

Reputation: 65

Decoding a single Base64 column in a CSV file using Python 3

I am very new to Python, but I hope someone can help me with this. I couldn't find any answers on Google that I could understand.

I have a large (10 GB) CSV file that contains multiple columns. All columns are "normal" human-readable text except for one, which is binary. I would like to decode this column and write the decoded data back into the CSV file.

This is what I got so far, but I have a feeling I'm way off. Any help would be appreciated!

import base64
import pandas as pd

df = pd.read_csv('sample.csv', delimiter=';',
                 usecols=[3], dtype=object, header=None)
decoded_binary_data = base64.b64decode(df)

print(decoded_binary_data)

sample of CSV:

"5f8ebfd8-7d12-4659-a416-e5dcbe056d0a";"6";"1";**ez??R?+??a)???
Cs**;0;0;0;74;1720;

sample of dataframe:

0                                       ez??R?+??a)???Cs
1                       B?t?a?h?kwd?W-]\???fc?m[m?A}??? 
2                       ?eE????3r??c??T????fc?m[m?A}??? 
3                       ?eE????3r??c??T????fc?m[m?A}??? 
4                       ?eE????3r??c??T????fc?m[m?A}??? 
5                       B?t?a?h?kwd?W-]\???fc?m[m?A}??? 

Upvotes: 0

Views: 3805

Answers (1)

Gautam Krishna R

Reputation: 2665

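base64.b64decode expects a single string or bytes value, not a whole DataFrame, so apply it to each value of the column with Series.apply:

decoded_binary_data = df['col_name'].apply(base64.b64decode)

Replace 'col_name' with the label of your column. Since the question reads the file with header=None and usecols=[3], the label there should be the integer 3.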

See this page: https://chrisalbon.com/python/pandas_apply_operations_to_dataframes.html
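
For a file as large as the one in the question, loading everything at once may not fit in memory. Below is a minimal sketch of one way to combine the apply approach with chunked reading. Several things here are assumptions and not taken from the question: the sample data, the helper name decode_column, the chunk size, and the choice to write the decoded bytes as hex text (raw bytes do not round-trip cleanly through a text CSV). It also assumes that every value in the column is valid Base64 and that no cell is empty.

```python
import base64
import io

import pandas as pd

# Hypothetical sample: text columns plus a Base64-encoded column at index 3.
SAMPLE = (
    '"id-1";"6";"1";aGVsbG8=;0\n'
    '"id-2";"7";"2";d29ybGQ=;1\n'
)


def decode_column(source, dest, column=3, chunksize=100_000):
    """Decode one Base64 column chunk by chunk and write the rows to dest.

    Decoded bytes are stored as hex strings (an assumption made here so the
    output stays plain text).
    """
    reader = pd.read_csv(source, delimiter=';', header=None,
                         dtype=object, chunksize=chunksize)
    for chunk in reader:
        # With header=None the columns are labeled 0, 1, 2, ...
        chunk[column] = chunk[column].apply(
            lambda value: base64.b64decode(value).hex()
        )
        chunk.to_csv(dest, sep=';', header=False, index=False)


# In-memory buffers stand in for real files in this demo.
out = io.StringIO()
decode_column(io.StringIO(SAMPLE), out, chunksize=1)
result = out.getvalue()
print(result)
```

With real files you would pass an input path as source and a file opened for writing as dest, ideally a new file rather than the original, so the input stays intact if something fails halfway through.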

Upvotes: 2
