Reputation: 403
I am working through Learn Python the Hard Way and am browsing some code on GitHub before moving on. I am curious what the .N does on the line tm.N = 1000, and how it relates to the end of the code.
import matplotlib.pyplot as plt
import random
import pandas.util.testing as tm
tm.N = 1000
df = tm.makeTimeDataFrame()
import string
foo = list(string.letters[:5]) * 200
df['indic'] = list(string.letters[:5]) * 200
random.shuffle(foo)
df['indic2'] = foo
df.boxplot(by=['indic', 'indic2'], fontsize=8, rot=90)
plt.show()
Upvotes: 0
Views: 3261
Reputation: 1622
On some array-like objects, .N gives the number of elements. For example, on a matplotlib colormap,
plt.get_cmap('Pastel1').N
evaluates to 9, because that colormap consists of 9 colors, whereas
plt.get_cmap('nipy_spectral').N
evaluates to 256.
Upvotes: 1
Reputation: 7443
N is a module-level global in the testing.py module that is used throughout the module to size test arrays and other objects. Its default value is 30. For example:
np.arange(N * K).reshape((N, K))
Series(randn(N), index=index)
In the code you posted it is used poorly, because makeTimeDataFrame accepts an nper parameter, which falls back to N when nper is not provided. This usage is clearer and would not have confused you:
df = tm.makeTimeDataFrame(nper=1000)
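The fallback described above can be sketched in plain Python, without pandas. The function name make_time_frame and the list it returns are made up for illustration; only the pattern (an optional nper that falls back to a module-level N) comes from the answer, and the real pandas code may differ in its details.

```python
# Hypothetical stand-in for the pattern in pandas.util.testing; not the real pandas code.
N = 30  # module-level default length


def make_time_frame(nper=None):
    """Return a list of nper dummy rows, falling back to the global N."""
    if nper is None:
        nper = N  # looked up at call time, so later changes to N are seen
    return list(range(nper))


default_rows = make_time_frame()            # falls back to N, which is 30 here
explicit_rows = make_time_frame(nper=1000)  # ignores N entirely

N = 5  # rebinding the global changes the default for later calls
short_rows = make_time_frame()

print(len(default_rows), len(explicit_rows), len(short_rows))
```

Passing nper explicitly keeps the length next to the call that needs it, whereas changing N silently affects every later call that relies on the default.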
Upvotes: 2
Reputation: 805
It makes a timeseries of length 1000.
>>> df.head()
Out[7]:
                   A         B         C         D
2000-01-03 -0.734093 -0.843961 -0.879394  0.415565
2000-01-04  0.028562 -1.098165  1.292156  0.512677
2000-01-05  1.135995 -0.864060  1.297646 -0.166932
2000-01-06 -0.738651  0.426662  0.505882 -0.124671
2000-01-07 -1.242401  0.225207  0.053541 -0.234740
>>> len(df)
Out[8]: 1000
Upvotes: 1
Reputation: 1356
In the pandas.util.testing module, N sets the length of the TimeSeries that the helper functions generate. See this reference, in the section:
We could alternatively have used the unit testing function to create a TimeSeries of length 20:
>>> pandas.util.testing.N = 20
>>> ts = pandas.util.testing.makeTimeSeries()
Upvotes: 1
Reputation: 5682
You can get information about pandas.util.testing.N from the docstring and the type() function:
>>> tm.N.__doc__
'int(x[, base]) -> integer\n\nConvert a string or number to an integer, if possible. A floating point\nargument will be truncated towards zero (this does not include a string\nrepresentation of a floating point number!) When converting a string, use\nthe optional base. It is an error to supply a base when converting a\nnon-string. If base is zero, the proper base is guessed based on the\nstring content. If the argument is outside the integer range a\nlong object will be returned instead.'
>>> print(tm.N.__doc__)
int(x[, base]) -> integer
Convert a string or number to an integer, if possible. A floating point
argument will be truncated towards zero (this does not include a string
representation of a floating point number!) When converting a string, use
the optional base. It is an error to supply a base when converting a
non-string. If base is zero, the proper base is guessed based on the
string content. If the argument is outside the integer range a
long object will be returned instead.
>>> type(tm.N)
<type 'int'>
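One caveat about the session above, shown as a small sketch with a plain variable standing in for tm.N (an assumption, so that no pandas import is needed): the docstring of an integer value is the docstring of the int type itself, so it describes int in general and says nothing about what N is used for. The type() call is the informative part.

```python
N = 30  # stand-in for tm.N; any int behaves the same way

value_type = type(N)
same_doc = N.__doc__ == int.__doc__  # the docstring comes from the int type

print(value_type.__name__)
print(same_doc)
```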
Upvotes: 1
Reputation: 519
Source: https://github.com/pydata/pandas/blob/master/pandas/util/testing.py
N is a variable in the pandas.util.testing library (imported as tm). It's used in a few of the functions defined in that library, including makeTimeSeries, which is called by getTimeSeriesData, which is in turn called by makeTimeDataFrame, the function you call with df = tm.makeTimeDataFrame().
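The call chain above can be sketched roughly as follows. Everything here is a hypothetical stand-in: fake_tm is a module object built by hand, the three functions only mirror the names in the chain, and a dict of lists plays the role of the DataFrame. The point is that the innermost function reads N when it runs, so setting N beforehand changes what the outermost call returns.

```python
import types

# Hypothetical stand-in for pandas.util.testing; the function names below only
# mirror the call chain described above and are not the real pandas code.
fake_tm = types.ModuleType("fake_tm")
fake_tm.N = 30  # default length stored on the module


def make_time_series():
    # Reads fake_tm.N each time it runs, not when it is defined.
    return [0.0] * fake_tm.N


def get_time_series_data():
    # One series per column, each built by make_time_series().
    return {column: make_time_series() for column in "ABCD"}


def make_time_data_frame():
    # A dict of lists stands in for a DataFrame here.
    return get_time_series_data()


fake_tm.N = 1000  # the same move as tm.N = 1000
frame = make_time_data_frame()
print(len(frame["A"]))
```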
Upvotes: 1
Reputation: 57490
The previous line, import pandas.util.testing as tm, imports the module pandas.util.testing and, for convenience, gives it the name tm. Thus, tm afterwards refers to this module, and so tm.N refers to the object named "N" (whatever that is) in the module.
Upvotes: 2
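To illustrate the import-alias explanation above with a module that ships with Python, here is a small sketch using math in place of pandas.util.testing (a substitution made only so the example runs anywhere). The alias m is a second name for the same module object, and m.pi is an ordinary attribute lookup on it, which is what tm.N is as well.

```python
import math
import math as m  # "as m" only binds a second, shorter name

same_object = m is math             # both names refer to one module object
pi_via_alias = m.pi                 # attribute lookup through the alias
pi_via_getattr = getattr(m, "pi")   # the same lookup, spelled out

print(same_object)
print(pi_via_alias == pi_via_getattr)
```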