Dileesh Dil

Reputation: 199

Count of one specific item in a pandas DataFrame

I used the following lines to count the number of "READ" entries in a specific column (containing READ, WRITE, NOP) of a file. The file is not a CSV file but a .out file with \t as the delimiter.

    data = pd.read_csv('xaa',usecols=[1], header=None,delimiter='\t')
    df2=df1.iloc[start:end,]

    count=df2.str.count("R").sum()

I am getting the error:

    AttributeError: 'DataFrame' object has no attribute 'str'

But when I use

    if filename.endswith(".csv"):
        data = pd.read_csv(filename)
    df1 = data.loc[:, "operation"]
    df2 = df1.iloc[start:end,]
    count = df2.str.count("R").sum()

there is no error. But this way I have to open each CSV file and insert "operation" as the header of the column I need. Kindly give a solution.

Upvotes: 2

Views: 81

Answers (2)

jezrael

Reputation: 862611

I believe you need to select column 1 to get a Series; otherwise you get a one-column DataFrame, which has no .str accessor:

count=df2[1].str.count("R").sum()

Or compare with eq and sum the True values:

count=df2[1].eq("R").sum()
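To see the difference between the two objects, here is a minimal sketch that uses made-up in-memory data in place of the 'xaa' file (the sample rows and the name `raw` are assumptions for illustration, not the asker's data). It assumes the column holds the full words READ, WRITE and NOP, as described in the question.

```python
import io
import pandas as pd

# Hypothetical tab-delimited content standing in for the 'xaa' file.
raw = "0\tREAD\n1\tWRITE\n2\tNOP\n3\tREAD\n"

data = pd.read_csv(io.StringIO(raw), usecols=[1], header=None, delimiter="\t")

# usecols=[1] still yields a DataFrame (with one column labelled 1),
# and only a Series has the .str accessor.
print(type(data).__name__)     # DataFrame
print(type(data[1]).__name__)  # Series

# Counts every "R" character in every value, so WRITE contributes too.
substring_hits = data[1].str.count("R").sum()

# Counts only rows whose value is exactly "READ".
exact_reads = data[1].eq("READ").sum()

print(substring_hits, exact_reads)
```

If the values really are whole words, `str.count("R")` also counts the R in WRITE, so comparing against the full value with `eq("READ")` is probably closer to what the question asks for.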

EDIT:

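Another solution is to have read_csv return a Series via the squeeze parameter (note that this parameter was deprecated in pandas 1.4 and removed in 2.0; on newer versions, call .squeeze("columns") on the result of read_csv instead):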

s = pd.read_csv('xaa',usecols=[1], header=None,delimiter='\t', squeeze=True)

count=s.iloc[start:end].str.count("R").sum()

#for another solution
#count=s.iloc[start:end].eq("R").sum()
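For pandas versions where read_csv no longer accepts `squeeze`, the same idea can be sketched with `DataFrame.squeeze("columns")`. The in-memory data, the `start` and `end` values, and the exact match against "READ" are all assumptions chosen for illustration.

```python
import io
import pandas as pd

# Hypothetical tab-delimited content standing in for the 'xaa' file.
raw = "0\tREAD\n1\tWRITE\n2\tNOP\n3\tREAD\n"

df = pd.read_csv(io.StringIO(raw), usecols=[1], header=None, delimiter="\t")

# Collapse the one-column DataFrame into a Series.
s = df.squeeze("columns")

# Hypothetical slice bounds; the question does not say what they are.
start, end = 0, 3

# Rows 0..2 are READ, WRITE, NOP, so exactly one row equals "READ".
count = s.iloc[start:end].eq("READ").sum()
print(count)
```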

Sample:

df2 = pd.DataFrame({1:['R','RR','Q']})
print (df2)
    1
0   R
1  RR
2   Q

#count all substrings
count=df2[1].str.count("R").sum()
print (count)
3

#count only exact matches
count=df2[1].eq("R").sum()
print (count)
1

Upvotes: 1

zipa

Reputation: 27869

Just add 0 as the column position in the df2 assignment, so that iloc returns a Series instead of a DataFrame:

data = pd.read_csv('xaa',usecols=[1], header=None,delimiter='\t')
df2=df1.iloc[start:end, 0]

count=df2.str.count("R").sum()

And I think it should be:

df2 = data.iloc[start:end, 0]

But maybe you have some other steps that create df1.
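A small sketch of the positional selection, using a hand-built DataFrame rather than the real file (the values and the `start`/`end` bounds are made up for illustration):

```python
import pandas as pd

# Hypothetical one-column frame, labelled 1 as it would be after
# read_csv(..., usecols=[1], header=None).
data = pd.DataFrame({1: ["READ", "WRITE", "READ", "NOP", "READ"]})

start, end = 0, 3  # hypothetical slice bounds

# Rows start..end-1 of the first column by position: this is a Series,
# so string and comparison methods are available.
col = data.iloc[start:end, 0]

# Rows 0..2 are READ, WRITE, READ.
count = col.eq("READ").sum()
print(count)
```

Because `iloc` selects by position, this works regardless of the column's label, which is convenient when the file has no header row.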

Upvotes: 0
