Reputation: 103
Code snippet 1
import pandas as pd
df = pd.read_csv("filename.txt", sep='\t', header = 0, names = ['E', 'S', 'D'])
Result = df.query(df.E.head(n=100) == 0)
Code Snippet 1 works as expected and returns a DataFrame of the rows where df.E equals 0.
But,
Code Snippet 2
import pandas as pd
df = pd.read_csv("filename.txt", sep='\t', header = 0, names = ['E', 'S', 'D'])
Result = df.query(df.E.head(n=101) == 0)
Code Snippet 2 does not work and throws this error:
"SyntaxError: ('invalid syntax', ('<unknown>', 1, 602, '[True ,True ,True ,True ,True ,True ,True ,True ,True ,True ,True ,True ,True ,True ,True ,True ,... ,True ,True ,True ,True ,True ,True ,True ,True ,True ,True ,True ,True ,True ,True ,True ,...]\n'))"
Please note that the only change between the two snippets is n=100 versus n=101. The error is still present with .head(n=101) removed. I have tried many values greater than 100, and they all throw the same error.
Upvotes: 1
Views: 3826
Reputation: 129018
df.query accepts a string query, and you are not passing valid Python (it actually accepts a slight superset of Python). I wouldn't expect either of your code snippets to work at all, hence the SyntaxError.
Straight out of the doc-string
Parameters
----------
expr : string
    The query string to evaluate. You can refer to variables
    in the environment by prefixing them with an '@' character like
    ``@a + b``.
In [14]: pd.set_option('display.max_rows',10)
In [15]: np.random.seed(1234)
In [16]: df = DataFrame(np.random.randint(0,10,size=100).reshape(-1,1),columns=list('a'))
In [17]: df
Out[17]:
a
0 3
1 6
2 5
3 4
4 8
.. ..
95 9
96 2
97 9
98 1
99 3
[100 rows x 1 columns]
In [18]: df.query('a==3')
Out[18]:
a
0 3
21 3
26 3
28 3
30 3
32 3
51 3
60 3
99 3
In [19]: var = 3
In [20]: df.query('a==@var')
Out[20]:
a
0 3
21 3
26 3
28 3
30 3
32 3
51 3
60 3
99 3
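Applied to the question's case, the same idea could be sketched as follows. This is only an illustration under assumptions: the DataFrame below is a made-up stand-in for the contents of "filename.txt" (only the column names E, S, D come from the question), and `first_rows` and `target` are hypothetical names. The point is to slice with head() first and then pass query a string, or to skip query and use a boolean mask.

```python
import pandas as pd

# Hypothetical stand-in for the data read from "filename.txt";
# only the column names E, S, D are taken from the question.
df = pd.DataFrame({
    "E": [0, 1, 0, 2] * 50,
    "S": range(200),
    "D": range(200, 400),
})

# Slice the first 101 rows, then filter with a *string* expression.
first_rows = df.head(101)
result = first_rows.query("E == 0")

# The same filter, referring to a local variable with '@'.
target = 0
result_var = first_rows.query("E == @target")

# Equivalent boolean-mask indexing, without query at all.
result_mask = first_rows[first_rows["E"] == 0]
```

All three results select the same rows, so which form to use is mostly a matter of taste; the mask form avoids the string parsing step entirely.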
Upvotes: 1