Reputation: 11
I am loading the data with the code below.
imdb_movie_data <-read.csv("https://raw.githubusercontent.com/sundeepblue/movie_rating_prediction/master/movie_metadata.csv")
Now I want to remove the last character from each movie_title, for which I wrote the following code.
substr(imdb_movie_data, 1, (nchar(imdb_movie_data$movie_title)-1))
But this is not removing the last character from the columns. Let me know if anyone needs any clarification on this.
Upvotes: 0
Views: 6510
Reputation: 59
The easy way to go about this would be to use regular expressions. The following command could help:
library(stringr)  # str_extract_all comes from the stringr package
imdb_movie_data$movie_title <- str_extract_all(imdb_movie_data$movie_title, "[A-Z a-z]+")
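You end up with only the letters and spaces; every other character (digits, punctuation, special characters) is dropped. Note that str_extract_all returns a list with one element per title.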
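If the goal is only to drop the final character, a narrower regex may be enough. The following is a minimal sketch in base R, not the answerer's own method; the `titles` vector is a hypothetical stand-in for imdb_movie_data$movie_title.

```r
# Hypothetical sample titles standing in for imdb_movie_data$movie_title
titles <- c("Avatar ", "Spectre ", "John Carter ")

# "." matches any single character and "$" anchors the match to the end
# of the string, so only the final character of each element is removed.
trimmed <- sub(".$", "", titles)

print(trimmed)
```

Unlike the character-class pattern, this leaves digits and punctuation inside the title untouched. On a factor column, sub coerces the input to character before matching.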
Upvotes: 1
Reputation: 263481
Two problems:
1) imdb_movie_data$movie_title is not a character vector but a factor (read.csv created factors by default before R 4.0), so it needs to be converted to character with as.character.
2) You need to assign the result back to imdb_movie_data$movie_title if you want the conversion to have a lasting effect:
imdb_movie_data$movie_title <- substr(as.character(imdb_movie_data$movie_title),
                                      start = 1,
                                      stop = nchar(as.character(imdb_movie_data$movie_title)) - 1)
> head(imdb_movie_data$movie_title)
[1] "Avatar "
[2] "Pirates of the Caribbean: At World's End "
[3] "Spectre "
[4] "The Dark Knight Rises "
[5] "Star Wars: Episode VII - The Force Awakens "
[6] "John Carter "
In R, merely running a function has no effect on the arguments passed to it. You need to assign the result back to the original vector if you want to change its values.
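To illustrate both points on a small scale, here is a sketch using a hypothetical two-row data frame `df` (not the real IMDB data), with stringsAsFactors = TRUE set explicitly to mimic the older read.csv default:

```r
# Hypothetical data frame; stringsAsFactors = TRUE mimics read.csv before R 4.0
df <- data.frame(movie_title = c("Avatar ", "Spectre "),
                 stringsAsFactors = TRUE)

# Convert the factor to character once so nchar() and substr() can work on it
titles_chr <- as.character(df$movie_title)

# Without assignment, substr() just returns a new vector; df is untouched
substr(titles_chr, 1, nchar(titles_chr) - 1)
unchanged <- as.character(df$movie_title)

# Assigning the result back is what actually changes the column
df$movie_title <- substr(titles_chr, 1, nchar(titles_chr) - 1)
print(df$movie_title)
```

After the unassigned call the column still holds the original titles; only the final assignment replaces them with the shortened character values.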
Upvotes: 1