SpanishBoy

Reputation: 2225

Creating h2o dataframe from pandas and unicode error

How can I safely convert a pandas DataFrame that contains non-ASCII strings to an H2OFrame?

import h2o
import pandas as pd

df = pd.DataFrame({'col1': [1,1,2], 'col2': ['César Chávez Day', 'César Chávez Day', 'César Chávez Day']})
hf = h2o.H2OFrame(df)  #gives error

UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 4: ordinal not in range(128)

Environment: Python 3.5, h2o 3.10.4.2

Upvotes: 2

Views: 5995

Answers (1)

Erin LeDell

Reputation: 8819

I agree that this is not an H2O-specific issue. This works for me (same H2O and Python version):

import h2o
import pandas as pd

df = pd.DataFrame({'col1': [1,1,2], 'col2': ['César Chávez Day', 'César Chávez Day', 'César Chávez Day']})
hf = h2o.H2OFrame(df)

## -- End pasted text --
Parse progress: |█████████████████████████████████████████████████████████| 100%

In [4]: hf
Out[4]:   col1  col2
------  ----------------
     1  César Chávez Day
     1  César Chávez Day
     2  César Chávez Day

[3 rows x 2 columns]

In [5]: type('César Chávez Day')
Out[5]: str

My specs (you may need to change your default encoding):

In [6]: import sys

In [7]: sys.getdefaultencoding()
Out[7]: 'utf-8'

This thread may help: "How do I check if a string is unicode or ascii?"
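To compare your environment with mine, here is a minimal sketch that prints the encodings Python reports and checks whether each one can represent the string from the question. The helper names `encoding_report` and `can_encode` are made up for illustration. Note that on Python 3 `sys.getdefaultencoding()` is always 'utf-8', so a difference is more likely to show up in the locale's preferred encoding, which `open()` uses when no encoding is given. Whether that is what triggers the error inside H2O is an assumption I have not verified.

```python
import locale
import sys


def encoding_report():
    """Collect the encodings Python reports for this process."""
    return {
        "default": sys.getdefaultencoding(),
        "filesystem": sys.getfilesystemencoding(),
        # Used by open() when no encoding argument is passed.
        "locale_preferred": locale.getpreferredencoding(False),
        # May be None if stdout has been replaced or redirected.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding`."""
    try:
        text.encode(encoding)
        return True
    except (UnicodeEncodeError, LookupError):
        return False


sample = "César Chávez Day"
for name, enc in encoding_report().items():
    status = can_encode(sample, enc) if enc else "n/a"
    print(name, enc, status)
```

If any line reports an ASCII-like encoding with `False`, starting Python under a UTF-8 locale would be the first thing I would try.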

Upvotes: 7
