Logarithm of a pandas series/dataframe

Question

In short: How can I take the logarithm of a column of a pandas dataframe? I thought numpy.log() should work on it, but it doesn't. I suspect it's because I have some NaNs in the dataframe?

My whole code is below. It may seem a bit chaotic; basically, my ultimate goal (a little exaggerated) is to plot selected rows of several selected groups of columns into several subplots (hence the three nested for loops iterating over the different groups... if you can suggest a more elegant solution, I will appreciate it, but that is not the main thing that's pressing me).

I need to plot the logarithm of (1 + some values from one dataframe) versus some values of the other dataframe. And here is the problem: on the plotting line with np.log I get this error: AttributeError: 'float' object has no attribute 'log' (and if I use math instead of np, I get this: TypeError: cannot convert the series to <class 'float'>). What can I do about it?

Thank you. Here is the code:

import numpy as np
import math
import pandas as pd
import matplotlib.pyplot as plt

hf = pd.DataFrame({'Z':np.arange(0,100,1),'A':(10*np.random.rand(100)), 'B':(10*np.random.rand(100)),'C':(10*np.random.rand(100)),'D':(10*np.random.rand(100)),'E':(10*np.random.rand(100)),'F':(10*np.random.rand(100))})
df = pd.DataFrame({'Z':np.arange(0,100,1),'A':(10*np.random.rand(100)), 'B':(10*np.random.rand(100)),'C':(10*np.random.rand(100)),'D':(10*np.random.rand(100)),'E':(10*np.random.rand(100)),'F':(10*np.random.rand(100))})
hf.loc[0:5,'A']=np.nan
df.loc[0:5,'A']=np.nan
hf.loc[53:58,'B']=np.nan
df.loc[53:58,'B']=np.nan
hf.loc[90:,'C']=np.nan
df.loc[90:,'C']=np.nan
I = ['A','B']
II = ['C','D']
III = ['E','F']
IV = ['F','A']
runs = [I,II,III,IV]
inds = [10,20,30,40]

fig = plt.figure(figsize=(6,4))
for r in runs:
    data = pd.DataFrame(index=df.index,columns=r)
    HF = pd.DataFrame(index=hf.index,columns=r)
    #pdb.set_trace()
    for i in r:
        data.loc[:,i] = df.loc[:,i]
        HF.loc[:,i] = hf.loc[:,i]
        for c,z in enumerate(inds):
            ax=fig.add_subplot()
            # this line raises the TypeError quoted above
            ax = plt.plot(math.log1p(HF.loc[z]),data.loc[z],linestyle=":",marker="o",markersize=5,label=str(z))
# or the other version, which raises the AttributeError quoted above
#plt.plot(np.log(1 + HF.loc[z]),data.loc[z],linestyle=":",marker="o",markersize=5,label=str(z))
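
A note on the likely cause, as a small sketch rather than a definitive diagnosis: NaNs on their own do not break np.log, but values stored with object dtype do. A DataFrame created empty with only index= and columns= starts out as object dtype, and filling it through .loc can leave it that way. The series below (s_float, s_obj) are made-up illustration data, not taken from the code above.

```python
import numpy as np
import pandas as pd

# Float column with a NaN: np.log works and simply propagates the NaN.
s_float = pd.Series([1.0, np.nan, 10.0])
log_float = np.log(s_float)

# The same values stored as generic Python objects (dtype=object).
s_obj = s_float.astype(object)

# On object dtype NumPy looks for a .log method on each element, which a
# plain float does not have. Depending on the NumPy version this surfaces
# as an AttributeError or as a TypeError wrapping it, so catch both.
try:
    np.log(s_obj)
    error_message = None
except (AttributeError, TypeError) as exc:
    error_message = f"{type(exc).__name__}: {exc}"

# Converting back to a numeric dtype makes the ufunc work again.
log_fixed = np.log(s_obj.astype(float))

print(s_obj.dtype)      # dtype of the problematic series
print(error_message)    # the error raised on the object-dtype series
print(log_fixed)
```

Checking HF.dtypes and data.dtypes inside the loop should show whether this is what happens in the code above.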

As @Jason pointed out, this answer did the trick! Thank you!
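
For later readers, here is a minimal sketch of the kind of fix that resolves this error, assuming the cause is object dtype in the intermediate frames (the linked answer itself is not reproduced here). The frames are cut down to two columns, and the names HF_obj and HF_num are introduced only for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
cols = ["A", "B"]
hf = pd.DataFrame(10 * rng.random((100, 2)), columns=cols)
df = pd.DataFrame(10 * rng.random((100, 2)), columns=cols)
hf.loc[0:5, "A"] = np.nan

# Option 1: select the columns directly instead of filling an empty
# frame; the selection keeps the original float64 dtype.
HF = hf[cols]
data = df[cols]

# Option 2: if a frame has already ended up with object dtype
# (simulated here with astype(object)), convert it back explicitly.
HF_obj = hf[cols].astype(object)
HF_num = HF_obj.astype(float)

# np.log1p(x) computes log(1 + x) elementwise and accepts a Series,
# unlike math.log1p, which only accepts a single scalar.
z = 10
x = np.log1p(HF_num.loc[z])
y = data.loc[z]
print(x)
print(y)
```

With either option, x and y can be passed straight to plt.plot as in the question; rows containing NaN simply produce gaps in the plot.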
