Reputation: 21

Frequency tables in python using pandas

I'm using pandas lineSer.value_counts() to make a frequency table, but it doesn't display all my items. I have over 100 pieces of data and I need to see all of them.

import pandas as pd

def freqTable():
    fileIn = open('data.txt','r')
    fileOut = open('dataOut.txt', 'w')
    lines = [line.strip() for line in fileIn if line.strip() and not line.startswith('com')]
    lineSer = pd.Series(lines)
    freq = str(lineSer.value_counts())
    for line in freq:
        fileOut.write(line)

This is the code I'm using. I need to get rid of the '...' in the results and see all the data points. What can I do differently?

Madding.     57
Crowning.    47
My.           8
And.          8
Thy.          7
Thou.         7
The.          5
To.           5
For.          5
I.            4
That.         4
In.           4
Love.         4
Is.           3
Not.          3
...
Did.          1
Shadows.      1
Of.           1
Mind,.        1
O'erlook.     1
Sometime.     1
Fairer.       1
Monsters,.    1
23.           1
Defect,.      1
Show,.        1
What's.       1
Wood.         1
So.           1
Lov'st,.      1
Length: 133, dtype: int64

Upvotes: 2

Answers (3)

jezrael

Reputation: 863541

If you only need to display all the data temporarily, try option_context with display.max_rows:

#temporarily print up to 999 rows
with pd.option_context('display.max_rows', 999):
    print(freq)

More info in docs.
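To see what option_context changes, here is a minimal sketch. The 133 made-up labels are an assumption standing in for the question's data (they only mirror the "Length: 133" in its output), and the limit of 60 is set explicitly so the truncated case is reproducible.

```python
import pandas as pd

# Hypothetical data: 133 distinct labels, mirroring "Length: 133" in the question.
words = pd.Series(["word%03d" % i for i in range(133)])
freq = words.value_counts()

# With a limit of 60 rows, the string form of a 133-row Series is truncated.
with pd.option_context("display.max_rows", 60):
    truncated = str(freq)

# Inside this block the limit is raised, so every row is rendered.
with pd.option_context("display.max_rows", 999):
    full = str(freq)

print(len(truncated.splitlines()), "lines when truncated")
print(len(full.splitlines()), "lines when shown in full")
```

The setting only applies inside the with block, so the rest of the session keeps its usual display limit.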

I also modified your solution to use the string methods strip and startswith for working with the text data, and to_csv for writing the output to a file:

import pandas as pd
import io

temp=u"""Madding.
 Madding.
  Madding.
 Madding.
 Crowning.
   Crowning.
 com Crowning.
com My. 
  com And.
   Thy.
Thou.
The."""
#after testing, replace io.StringIO(temp) with 'data.txt'
#note: without header=None the first line is used as the Series name, not as data
s = pd.read_csv(io.StringIO(temp), sep="|").squeeze("columns")
print(s)
0           Madding.
1           Madding.
2           Madding.
3          Crowning.
4          Crowning.
5      com Crowning.
6           com My. 
7           com And.
8               Thy.
9              Thou.
10              The.
Name: Madding., dtype: object

#strip data
s = s.str.strip()

#flag rows that start with 'com'
print(s.str.startswith('com'))
0     False
1     False
2     False
3     False
4     False
5      True
6      True
7      True
8     False
9     False
10    False
Name: Madding., dtype: bool

#keep only rows that do not start with 'com'
s = s[~s.str.startswith('com')]
print(s)
0      Madding.
1      Madding.
2      Madding.
3     Crowning.
4     Crowning.
8          Thy.
9         Thou.
10         The.
Name: Madding., dtype: object

#count freq
freq = s.value_counts()

#temporarily print up to 999 rows
with pd.option_context('display.max_rows', 999):
    print(freq)
Madding.     3
Crowning.    2
Thou.        1
Thy.         1
The.         1
Name: Madding., dtype: int64

#write the Series to a file with to_csv
freq.to_csv('dataOut.txt', sep=';')

Upvotes: 1

BrenBarn

Reputation: 251578

If you want to write the list to a file, don't make it into a string and write that to a file. Pandas has built-in functions for writing things to files. Just do lineSer.value_counts().to_csv('dataOut.txt'). If you want to tweak the formatting of the output, read the documentation for to_csv to see how you can customize it. (You can probably also read your data in more efficiently by using something like pandas.read_csv, but that's another topic.)
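As a rough sketch of that suggestion: the short word list and the temporary output path below are assumptions made only so the example is self-contained; with real data you would build the Series from data.txt and write to dataOut.txt.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the lines read from data.txt.
lines = ["Madding.", "Madding.", "Crowning.", "Thy.", "Madding.", "Crowning."]

freq = pd.Series(lines).value_counts()

# Write to a temporary location so the sketch does not overwrite real files.
out_path = os.path.join(tempfile.mkdtemp(), "dataOut.txt")

# header=False writes only "value,count" rows. Every row is written,
# because to_csv is not affected by the display.max_rows setting.
freq.to_csv(out_path, header=False)

with open(out_path) as f:
    written = f.read().splitlines()

print(written)
```

Because the file is produced from the data itself rather than from the printed representation, no '...' can end up in it.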

Upvotes: 3

MaxU - stand with Ukraine

Reputation: 210972

Try this:

pd.options.display.max_rows = 999
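Unlike option_context, this changes the limit for the whole session. A minimal sketch of setting and undoing it, assuming a made-up 133-row Series; set_option with None removes the row limit entirely, and reset_option restores the default:

```python
import pandas as pd

# Hypothetical data: any Series longer than the default limit will do.
s = pd.Series(range(133))

# None means "no row limit" for every repr printed from here on.
pd.set_option("display.max_rows", None)
full = repr(s)

# Restore the library default once the full output is no longer needed.
pd.reset_option("display.max_rows")
default_limit = pd.get_option("display.max_rows")
short = repr(s)

print(len(full.splitlines()), "lines with no limit")
print(len(short.splitlines()), "lines with the default limit of", default_limit)
```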

Upvotes: 0
