Reputation: 8347
Pandas offers some summary statistics with the describe() function called on a DataFrame. The output of the function is another DataFrame, so it's easily exported to HTML with a call to to_html().
It also offers information about the DataFrame with the info() function, but that is printed out and the function returns None. Is there a way to get the same information as a DataFrame, or in any other form that can be exported to HTML?
Here is a sample info() output for reference:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 7 columns):
0 5 non-null float64
1 5 non-null float64
2 5 non-null float64
3 5 non-null float64
4 5 non-null float64
5 5 non-null float64
6 5 non-null float64
dtypes: float64(7)
memory usage: 360.0 bytes
Upvotes: 7
Views: 3023
Reputation: 8347
With input from all these great answers, I ended up capturing the info() output in a StringIO buffer and reading the per-column table (datatypes in the snippet below) into a pandas DataFrame using a second StringIO.
So the result is this:
from io import StringIO
import pandas as pd

def process_content_info(content: pd.DataFrame):
    # Capture the text that info() would normally print
    content_info = StringIO()
    content.info(buf=content_info)
    str_ = content_info.getvalue()
    lines = str_.split("\n")
    # Per-column rows; the slice assumes the info() layout shown in the question
    table = StringIO("\n".join(lines[3:-3]))
    datatypes = pd.read_table(table, sep=r"\s+",
                              names=["column", "count", "null", "dtype"])
    datatypes.set_index("column", inplace=True)
    # Header lines plus the memory usage line
    info = "\n".join(lines[0:2] + lines[-2:-1])
    return info, datatypes
Perhaps the second StringIO can be simplified, but anyway this achieves what I needed.
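One way the text parsing might be avoided altogether is to build the per-column table straight from the DataFrame itself. This is only a minimal sketch: the helper name info_as_frame and the column labels are made up for illustration, and it covers just the non-null counts, dtypes and per-column memory rather than everything info() prints.

```python
import pandas as pd


def info_as_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Build an info()-like per-column summary without parsing text.

    Hypothetical helper, not part of pandas.
    """
    return pd.DataFrame(
        {
            "non_null": df.count(),                        # non-null values per column
            "dtype": df.dtypes.astype(str),                # dtype names as plain strings
            "memory_bytes": df.memory_usage(index=False),  # bytes used by each column
        }
    )


frame = pd.DataFrame({"a": [1.0, 2.0, None], "b": ["x", "y", "z"]})
summary = info_as_frame(frame)
html_table = summary.to_html()  # an ordinary DataFrame, so to_html() works
```

Since the three Series share the column names as their index, they line up into one table with one row per original column.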
Upvotes: 1
Reputation: 19957
from io import StringIO  # Python 3; the Python 2 StringIO module no longer exists
import pandas as pd

output = StringIO()
# Write df.info to a string buffer
df.info(buf=output)
# Put the info back into a DataFrame so you can use df.to_html()
df_info = pd.DataFrame(columns=['DF INFO'], data=output.getvalue().split('\n'))
df_info.to_html()
Upvotes: 1
Reputation: 7496
One solution is to save the output of info() to a writable buffer (using the buf argument) and then convert it to HTML.
Below is an example using a txt file as the buffer, but this could easily be done in memory using StringIO.
import pandas as pd
import numpy as np

frame = pd.DataFrame(np.random.randn(100, 3), columns=['A', 'B', 'C'])
# Save the info() output to a txt file; the with block closes (and flushes) it
with open('test_pandas.txt', 'w') as buf:
    frame.info(buf=buf)

# Example to convert to html
with open("test_pandas.txt", "r") as contents, open("test_pandas.html", "w") as e:
    for line in contents:
        e.write("<pre>" + line.rstrip("\n") + "</pre> <br>\n")
The variation using StringIO can be found in @jezrael's answer, so there is probably no point in updating this answer.
Upvotes: 1
Reputation: 863301
I tried rewriting the other solution with StringIO; it is also necessary to use getvalue() with split:
from io import StringIO  # pandas.compat.StringIO is not available in current pandas
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.random.randn(100, 3), columns=['A', 'B', 'C'])
a = StringIO()
frame.info(buf=a)

# Example to convert to html
contents = a.getvalue().split('\n')
with open("test_pandas.html", "w") as e:
    for lines in contents:
        e.write("<pre>" + lines + "</pre> <br>\n")
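One thing to watch for when writing the raw text into HTML: the first line of the info() output starts with "<class ...>", which a browser may treat as a tag and hide. A possible variation, sketched here with a made-up helper name info_to_html, escapes the text with the standard library's html.escape and wraps it in a single pre block:

```python
import html
from io import StringIO

import pandas as pd


def info_to_html(df: pd.DataFrame) -> str:
    """Return the df.info() text as one escaped <pre> block.

    Hypothetical helper, not part of pandas.
    """
    buffer = StringIO()
    df.info(buf=buffer)
    # Escape so "<class ...>" is shown literally instead of parsed as a tag.
    return "<pre>" + html.escape(buffer.getvalue()) + "</pre>\n"


frame = pd.DataFrame({"A": [1.0, 2.0], "B": [3.0, 4.0]})
page = info_to_html(frame)
```

A single pre element keeps the line breaks of the original output, so no br tags are needed.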
Upvotes: 1