Teresa
Teresa

Reputation: 11

pandas does not read a excel file fine after I've save a Datetime cell

I am reading a file with pandas read_excel

df0 = pd.read_excel(cfname,'Calendari', index_col=0, parse_cols='A:R',
            converters={'Categoria':str,'Fase':str, 'Grupo':str, 'Jornada':str,  
                               'Partido':str, 'IdPartido':str,'Leido':str})
df0.head()

The output is:

Linea Categoria Fase Grupo Jornada Partido IdPartido Leido Data                 Hora   
    2       CCM    1     1       1      -1       NaN    No Jornada 1            NaN  
    3       CCM    1     1       1       0       NaN    No Data                 Hora    
    4       CCM    1     1       1       1 CCM110101    No 2017-09-30 00:00:00  19:30:00  
    5       CCM    1     1       1       2 CCM110102    No 2017-09-30 00:00:00  17:45:00  
    6       CCM    1     1       1       3 CCM110103    No 2017-09-27 00:00:00  18:00:00

I write a new date with openpyxl: in line 6 I write 2017-10-01

from openpyxl import load_workbook
wb = load_workbook(filename = cfname, keep_vba=True)
ws = wb.get_sheet_by_name('Calendari')
cell = ws['I6']
cell.value = datetime.strptime('01/10/2017', "%d/%m/%Y")
cell.number_format = 'dd/mm/yyyy'
wb.save(filename = cfname)

In Excel, the date seems well saved, with Date Format. But when I try to read the file again with pandas read_excel, the data has been change
but the values for Jornada Partido IdPartido becomes NaN

Linea Categoria Fase Grupo Jornada Partido IdPartido Leido Data                 Hora   
    2       CCM    1     1     NaN     NaN       NaN    No Jornada 1            NaN  
    3       CCM    1     1     NaN     NaN       NaN    No Data                 Hora    
    4       CCM    1     1     NaN     NaN       NaN    No 2017-09-30 00:00:00  19:30:00  
    5       CCM    1     1     NaN     NaN       NaN    No 2017-09-30 00:00:00  17:45:00  
    6       CCM    1     1     NaN     NaN       NaN    No 2017-10-01 00:00:00  18:00:00

I do not know why. I try with converters, without them, diferent date format, ...
Please, can anyone help me?

Upvotes: 0

Views: 355

Answers (2)

Teresa
Teresa

Reputation: 11

I following doing tests. Ultimately I have discovered that if I open and save the excel file out of python (with Excel program), when I try to read the file again with pandas already does not give me the mistake.

I do not understand nothing. Something does not close fine with

wb.save(filename = cfname)

It will be because my Excel file is a macro xlxm? I open it with:

wb = load_workbook(filename = cfname, keep_vba=True)

Upvotes: 0

Charlie Clark
Charlie Clark

Reputation: 19507

If you have already opened a file with openpyxl then you can use the builtin utilities for converting it to Pandas. http://openpyxl.readthedocs.io/en/default/pandas.html

As the values have already been converted to Python objects the conversion Pandas is quick and reliable.

Upvotes: 1

Related Questions