Reputation: 147
When I try to run the following code in Spark, I get an error.
Here is the traceback:
TypeError Traceback (most recent call last)
<ipython-input-33-4bfff78eeaad> in <module>()
1 feature_text='lepton pT, lepton eta, lepton phi, missing energy
magnitude, missing energy phi, jet 1 pt, jet 1 eta, jet 1 phi, jet 1 b-tag,
jet 2 pt, jet 2 eta, jet 2 phi, jet 2 b-tag, jet 3 pt, jet 3 eta, jet 3 phi,
jet 3 b-tag, jet 4 pt, jet 4 eta, jet 4 phi, jet 4 b-tag, m_jj, m_jjj, m_lv,
m_jlv, m_bb, m_wbb, m_wwbb'
----> 2 features=[strip(a) for a in split(feature_text,',')]
/opt/ibm/spark/python/pyspark/sql/column.py in __iter__(self)
342
343 def __iter__(self):
--> 344 raise TypeError("Column is not iterable")
345
346 # string methods
TypeError: Column is not iterable
Code:
feature_text='lepton pT, lepton eta, lepton phi, missing energy magnitude, missing energy phi, jet 1pt, jet 1 eta, jet 1 phi, jet 1 b-tag, jet 2 pt, jet 2 eta, jet 2 phi, jet 2 b-tag, jet 3 pt, jet 3 eta, jet 3 phi, jet 3 b-tag, jet 4 pt, jet 4 eta, jet 4 phi, jet 4 b-tag, m_jj, m_jjj, m_lv, m_jlv, m_bb, m_wbb, m_wwbb'
features=[strip(a) for a in split(feature_text,',')]
Upvotes: 0
Views: 237
Reputation: 4481
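It looks like you are using the pyspark.sql.functions.split function, when you're really looking for the string split method. The PySpark function returns a Column, and a Column cannot be iterated over, which is why the list comprehension raises "Column is not iterable".
Using the string method, you can build the list of feature names directly from your string, with no list comprehension needed: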
features = feature_text.split(", ")
Both strip and split are methods of str, rather than functions, so they are called as my_string.strip(" ") and my_string.split(",").
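If the spacing after the commas is not perfectly consistent, splitting on ", " can leave stray whitespace in some names. A minimal sketch of the method-based version of your original comprehension, using a shortened, made-up feature string (the irregular spacing is an assumption added for illustration):

```python
# Shortened, hypothetical stand-in for the feature string in the question,
# with deliberately uneven spacing around the commas.
feature_text = "lepton pT, lepton eta,  jet 1 b-tag ,m_jj, m_wwbb"

# str.split(",") cuts the string at every comma; str.strip() with no
# argument removes leading and trailing whitespace from each piece.
features = [name.strip() for name in feature_text.split(",")]

print(features)
# ['lepton pT', 'lepton eta', 'jet 1 b-tag', 'm_jj', 'm_wwbb']
```

Because these are looked up on the str object itself, they keep working even if a split function from pyspark.sql.functions has been imported into the same namespace.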
Upvotes: 1