Reputation: 157
How can I grab the memory usage value (displayed in the output of the function DataFrame.info())
and assign it to a variable?
Upvotes: 9
Views: 5140
Reputation: 8631
As the docs say, DataFrame.info() accepts a buffer to write its output to:
buf : writable buffer, defaults to sys.stdout
For example:
import io
import pandas as pd
df=pd.DataFrame({
'someCol' : ["foo", "bar"]
})
buf = io.StringIO()
df.info(buf=buf)
info = buf.getvalue()
print(info)
This gives the output:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 1 columns):
someCol 2 non-null object
dtypes: object(1)
memory usage: 96.0+ bytes
To get just the memory usage line:
info = buf.getvalue().split('\n')[-2]
print(info)
Would give the output:
memory usage: 96.0+ bytes
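That line is still a string. If a number is needed, it can be parsed out with a regular expression. The following is only a minimal sketch: parse_memory_usage is a hypothetical helper (not part of pandas), and it assumes the line has the form "memory usage: <number>[+] <unit>" with units that are powers of 1024, which may differ between pandas versions. The value is rounded by info(), so the result is approximate.

```python
import re

# Unit multipliers. Assumption: info() scales sizes by powers of 1024.
_UNITS = {
    "bytes": 1,
    "KB": 1024,
    "MB": 1024**2,
    "GB": 1024**3,
    "TB": 1024**4,
    "PB": 1024**5,
}


def parse_memory_usage(info_text):
    """Return (approx_bytes, is_lower_bound) parsed from df.info() text.

    Hypothetical helper for illustration; not a pandas function.
    """
    match = re.search(r"memory usage:\s*([\d.]+)(\+?)\s*(\w+)", info_text)
    if match is None:
        raise ValueError("no 'memory usage' line found")
    number, plus, unit = match.groups()
    # A trailing "+" marks the figure as a lower bound.
    return float(number) * _UNITS[unit], plus == "+"


approx_bytes, is_lower_bound = parse_memory_usage("memory usage: 96.0+ bytes")
print(approx_bytes, is_lower_bound)
```

Because the printed figure is rounded to one decimal place, this is only suitable when an approximate value is enough.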
Upvotes: 4
Reputation: 18943
Use DataFrame.memory_usage().sum().
The pandas documentation has an example:
In [8]: df.memory_usage()
Out[8]:
Index                 72
bool                5000
complex128         80000
datetime64[ns]     40000
float64            40000
int64              40000
object             40000
timedelta64[ns]    40000
categorical         5800
dtype: int64
# total memory usage of dataframe
In [9]: df.memory_usage().sum()
Out[9]: 290872
The source code of df.info() shows that it calls memory_usage() to compute the memory usage it reports:
... <last few lines of def info from pandas/frame.py>
mem_usage = self.memory_usage(index=True, deep=deep).sum()
lines.append("memory usage: %s\n" %
_sizeof_fmt(mem_usage, size_qualifier))
_put_lines(buf, lines)
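Applied to the small frame from the question, this can be sketched as follows. The variable names are illustrative, and the exact byte counts depend on the pandas version and platform, so none are shown here. The deep=True call asks pandas to also count the memory held by the Python objects in object columns; without it, the figure for such columns is a lower bound, which is what the "+" in the info() output indicates.

```python
import pandas as pd

df = pd.DataFrame({"someCol": ["foo", "bar"]})

# Shallow estimate: for object columns this counts only the pointers.
shallow_bytes = int(df.memory_usage(index=True).sum())

# Deep estimate: also inspects the objects stored in object columns.
deep_bytes = int(df.memory_usage(index=True, deep=True).sum())

print(shallow_bytes, deep_bytes)
```

Both values are plain integers in bytes, so no string parsing is needed.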
Upvotes: 11