Amy

Reputation: 5193

How can I display full (non-truncated) dataframe information in HTML when converting from Pandas dataframe to HTML?

I converted a Pandas dataframe to an HTML output using the DataFrame.to_html function. When I save this to a separate HTML file, the file shows truncated output.

For example, in my TEXT column,

df.head(1) will show

The film was an excellent effort...

instead of

The film was an excellent effort in deconstructing the complex social sentiments that prevailed during this period.

This rendition is fine in the case of a screen-friendly format of a massive Pandas dataframe, but I need an HTML file that will show complete tabular data contained in the dataframe, that is, something that will show the latter text element rather than the former text snippet.

How would I be able to show the complete, non-truncated text data for each element in my TEXT column in the HTML version of the information? I would imagine that the HTML table would have to display long cells to show the complete data, but as far as I understand, only column-width parameters can be passed into the DataFrame.to_html function.
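A minimal sketch of one way to get this, assuming pandas 1.0 or later (where None means "no limit"): lift display.max_colwidth temporarily while calling DataFrame.to_html, so the generated HTML contains the whole cell text. The helper name to_full_html and the sample dataframe are made up for illustration.

```python
import pandas as pd

def to_full_html(df):
    """Return df as an HTML table without shortening long cell values."""
    # The option is changed only inside the with-block; the previous
    # value is restored when the block exits.
    with pd.option_context('display.max_colwidth', None):
        return df.to_html()

# Sample data (invented for this example)
df = pd.DataFrame({'TEXT': [
    'The film was an excellent effort in deconstructing the complex '
    'social sentiments that prevailed during this period.'
]})
html = to_full_html(df)
```

The resulting string can then be written to a file of your choice, for example with open(..., 'w') and f.write(html).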

Upvotes: 508

Views: 763957

Answers (12)

Benjamin Ziepert

Reputation: 1754

Display the full dataframe for a single notebook cell:

import pandas as pd
from IPython.display import display
with pd.option_context('display.max_colwidth', None,
                       'display.max_columns', None,
                       'display.max_rows', None):
    display(df)

The method above can be extended with more options.

Updated helper function from Karl Adler:

def display_full(x):
    with pd.option_context('display.max_rows', None,
                           'display.max_columns', None,
                           'display.width', 2000,
                           'display.float_format', '{:20,.2f}'.format,
                           'display.max_colwidth', None):
        display(x)

Change display options for all cells:

pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
display(df)

Upvotes: 22

gndps

Reputation: 821

Colab/Notebook print utils

I created some functions to help me with printing in colab. For this question's purpose, I'm using dfprint_wide from the following util snippet:

How it works for printing full df

It prints 8 columns at a time, so no columns are dropped from the output.

You can optionally set pd.set_option('display.max_colwidth', None) to make sure cell contents aren't truncated either, but that makes the output less pretty (as in the other answers).

There's also an option to print to an html file.

Usage

dfprint_wide(df)

Utils code

import pandas as pd
from pathlib import Path

try:
    from IPython.display import display, HTML
except ImportError:
    pass  # outside IPython; the functions below fall back to print()

def hprint(text='', tag="p", export_file=None):
    """Show text wrapped in an HTML tag, or append it to export_file."""
    html = f"<{tag}>{text}</{tag}>"
    if export_file:
        # Path.parent also works for a bare file name (it is '.')
        Path(export_file).parent.mkdir(parents=True, exist_ok=True)
        with open(export_file, 'a') as f:
            f.write(html)
    else:
        try:
            display(HTML(html))
        except NameError:
            print(text)

def dfprint_wide(df, cols_per_chunk=8, export_file=None):
    """Print df in chunks of cols_per_chunk columns."""
    num_cols = len(df.columns)
    for i in range(0, num_cols, cols_per_chunk):
        chunk_cols = df.columns[i:i+cols_per_chunk]
        dfprint(df[chunk_cols], export_file)
        # Pass export_file by keyword; positionally it would be taken as tag
        hprint('', export_file=export_file)  # empty line for readability

def dfprint(data, export_file=None):
    """Display a dataframe (or dict), or append it as HTML to export_file."""
    if isinstance(data, dict):
        data = pd.DataFrame([data])
    df = pd.DataFrame(data)
    if export_file:
        Path(export_file).parent.mkdir(parents=True, exist_ok=True)
        with open(export_file, 'a') as f:
            f.write(df.to_html())
    else:
        try:
            display(df)
        except NameError:
            print(df.to_string())

Upvotes: 0

himra

Reputation: 1

Here are two other methods, in case you don't want to change the default display options.

# First method: turn the rows into a list of tuples.
# Nothing is truncated, but the output is not pretty.
list(df.itertuples())

# Second method: format the table with the third-party tabulate package.
from tabulate import tabulate
print(tabulate(df, tablefmt='psql', headers='keys'))
# headers='keys' uses the dataframe's column names as the table headers
# 'psql' is one of tabulate's table formats; the others are listed in its documentation
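A further option that leaves the global settings alone, offered as a sketch and assuming pandas 1.0 or later: DataFrame.to_string accepts its own max_rows, max_cols and max_colwidth arguments, and with their defaults it does not shorten cell contents. The sample dataframe is invented for illustration.

```python
import pandas as pd

# Sample data (invented): one long cell and one short cell
df = pd.DataFrame({'TEXT': ['x' * 120, 'short']})

# Plain-text table; the 120-character cell is kept whole
text = df.to_string()
print(text)
```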

Upvotes: 0

bitbang

Reputation: 2182

Try this too:

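# Full option names are used because the short form "max_columns" is ambiguous in recent pandas versions
pd.set_option('display.max_columns', None) # show all columns
pd.set_option('display.max_colwidth', None) # show the full width of each column
pd.set_option('display.expand_frame_repr', False) # print columns side by side instead of wrapping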

Upvotes: 29

joelostblom

Reputation: 48909

Another way of viewing the full content of the cells in a Pandas dataframe is to use IPython's display functions:

from IPython.display import HTML

HTML(df.to_html())

Upvotes: 11

Colonel_Old

Reputation: 932

The following code results in the warning below:

pd.set_option('display.max_colwidth', -1)

FutureWarning: Passing a negative integer is deprecated in version 1.0 and will not be supported in future version. Instead, use None to not limit the column width.

Instead, use:

pd.set_option('display.max_colwidth', None)

This accomplishes the task and works with pandas 1.0 and later.

Upvotes: 13

Prabhat

Reputation: 4426

For those looking to do this in Dask:

I could not find an equivalent option in Dask, but setting the pandas option in the same notebook also takes effect for Dask.

import pandas as pd
import dask.dataframe as dd
# Disables truncation for pandas and, in practice, for Dask as well.
# I am not sure how Dask picks it up, but it works.
# (Before pandas 1.0, pass -1 instead of None.)
pd.set_option('display.max_colwidth', None)

train_data = dd.read_csv('./data/train.csv')
train_data.head(5)

Upvotes: 5

Karl Adler

Reputation: 16796

pd.set_option('display.max_columns', None) removes the limit on the number of columns shown, while pd.set_option('display.max_colwidth', None) removes the limit on the width of each individual field (before pandas 1.0, pass -1 instead of None).

For my purposes I wrote a small helper function to fully print huge data frames without affecting the rest of the code. It also reformats float numbers and sets the virtual display width. You may adopt it for your use cases.

def print_full(x):
    pd.set_option('display.max_rows', None)
    pd.set_option('display.max_columns', None)
    pd.set_option('display.width', 2000)
    pd.set_option('display.float_format', '{:20,.2f}'.format)
    pd.set_option('display.max_colwidth', None)
    print(x)
    pd.reset_option('display.max_rows')
    pd.reset_option('display.max_columns')
    pd.reset_option('display.width')
    pd.reset_option('display.float_format')
    pd.reset_option('display.max_colwidth')

Upvotes: 186

behzad.nouri

Reputation: 77941

Set the display.max_colwidth option to None (or -1 before version 1.0):

pd.set_option('display.max_colwidth', None)

set_option documentation

For example, in IPython, we see that the information is truncated to 50 characters. Anything in excess is ellipsized:

[Image: Truncated result]

If you set the display.max_colwidth option, the information will be displayed fully:

[Image: Non-truncated result]

Upvotes: 823

iamyojimbo

Reputation: 4679

Jupyter Users

Whenever I need this for just one cell, I use this:

with pd.option_context('display.max_colwidth', None):
    display(df)
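
To check that the change really is limited to the with block, here is a small sketch that only reads the option back, so it also runs outside Jupyter. The variable names are made up for illustration.

```python
import pandas as pd

before = pd.get_option('display.max_colwidth')

with pd.option_context('display.max_colwidth', None):
    # Inside the block the limit is lifted
    inside = pd.get_option('display.max_colwidth')

# After the block the previous value is back
after = pd.get_option('display.max_colwidth')
print(before, inside, after)
```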

Upvotes: 105

Apostolos

Reputation: 3445

For those who like to reduce typing (i.e., everyone!): pd.set_option('max_colwidth', None) does the same thing, because pandas accepts an unambiguous part of an option name.

Upvotes: 1

user7579768

Reputation: 2077

pd.set_option('display.max_columns', None)  

Passing None as the second argument removes the limit on the number of columns, so all columns are shown.

Upvotes: 206
