Reputation: 125
I am writing a Python 3 script.
I have a column 'original_title' that contains various film titles, including all the Star Wars films (plus the episode name) and all the Star Trek films (plus the episode name). I want to create a new column that shows only 'star trek' (without the episode name), 'star wars', or 'na'.
This is my code for the new column:
df['Trek_Wars'] = pd.np.where(df.original_title.str.contains("Star Wars"), "star_wars",
pd.np.where(df.original_title.str.contains("Star Trek"), "star_trek"))
However, it doesn't work:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-33-5472b36a2193> in <module>()
1 df['Trek_Wars'] = pd.np.where(df.original_title.str.contains("Star Wars"), "star_wars",
----> 2 pd.np.where(df.original_title.str.contains("Star Trek"), "star_trek"))
ValueError: either both or neither of x and y should be given
What should I do?
Upvotes: 1
Views: 5424
Reputation: 128
In your example, both values ("Star Wars" and "Star Trek") have the same number of characters (9), so you can simply take the first 9 characters of each title. Note that this also truncates every other title to 9 characters rather than labelling it 'na', so for finer-grained parsing of that column you will need a better method.
# take the first 9 characters of every title in one vectorized step
# (no row-by-row loop or chained .loc assignment is needed)
df['Film_Series'] = df['original_title'].str[:9]
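One possible "better method" is to match the series name with a regular expression instead of relying on its length. The following is only a sketch: the sample titles are made up for illustration, and the column name is taken from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({'original_title': [
    'Star Wars: The Force Awakens',
    'Star Trek Beyond',
    'Jurassic World',
]})

# Capture the series name where present; titles matching neither give NaN.
series = df['original_title'].str.extract(r'(Star Wars|Star Trek)', expand=False)

# Normalise to lower-case labels and mark the non-matches as 'na'.
df['Film_Series'] = (
    series.str.lower()
          .str.replace(' ', '_', regex=False)
          .fillna('na')
)
```

Unlike slicing the first 9 characters, this also finds the series name when it is not at the start of the title.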
Upvotes: 0
Reputation: 164673
I assume you are using Pandas. I am not aware of a pd.np.where method, but there is np.where, which you can use for your task:
df['Trek_Wars'] = np.where(df['original_title'].str.contains('Star Wars'),
'star_wars', 'na')
Notice that we have to provide values both for when the condition is met and for when it is not met. For multiple conditions, you can use pd.DataFrame.loc:
# set default value
df['Trek_Wars'] = 'na'
# update according to conditions
df.loc[df['original_title'].str.contains('Star Wars'), 'Trek_Wars'] = 'star_wars'
df.loc[df['original_title'].str.contains('Star Trek'), 'Trek_Wars'] = 'star_trek'
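As an aside, the nested call from the question can also be made to work: the ValueError comes from the inner np.where, which was given a value for the "true" case but none for the "false" case. A minimal sketch, assuming numpy is imported as np and using made-up sample titles:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({'original_title': [
    'Star Wars: A New Hope',
    'Star Trek Into Darkness',
    'Blade Runner',
]})

# na=False treats missing titles as "no match" instead of propagating NaN.
is_wars = df['original_title'].str.contains('Star Wars', na=False)
is_trek = df['original_title'].str.contains('Star Trek', na=False)

# Both np.where calls now receive all three arguments:
# condition, value if True, value if False.
df['Trek_Wars'] = np.where(is_wars, 'star_wars',
                           np.where(is_trek, 'star_trek', 'na'))
```

This is fine for two conditions, but the nesting becomes hard to read as more are added.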
You can simplify your logic further with a dictionary mapping:
# map search string to update string
mapping = {'Star Wars': 'star_wars', 'Star Trek': 'star_trek'}
# iterate mapping items
for k, v in mapping.items():
    df.loc[df['original_title'].str.contains(k), 'Trek_Wars'] = v
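A different technique that avoids the explicit loop is np.select, which takes a list of conditions, a list of matching values, and a default. This is a sketch rather than a drop-in replacement; the sample titles are hypothetical, and the mapping is the same one as above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({'original_title': [
    'Star Wars: Return of the Jedi',
    'Star Trek: First Contact',
    'Alien',
]})

# Same mapping as above: search string -> label.
mapping = {'Star Wars': 'star_wars', 'Star Trek': 'star_trek'}

# One boolean mask per search string, in the same order as the labels.
# regex=False matches the keys literally; na=False treats missing titles as no match.
conditions = [
    df['original_title'].str.contains(k, na=False, regex=False).to_numpy()
    for k in mapping
]
choices = list(mapping.values())

# Rows matching none of the conditions receive the default label.
df['Trek_Wars'] = np.select(conditions, choices, default='na')
```

One difference to keep in mind: if a title matched several keys, np.select would use the first matching condition, whereas the loop above would keep the last one.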
Upvotes: 4