Reputation: 25
The source data looks like the screenshot. I am new to data flows and the expression language. How can I use regexExtract() (or any other expression function) to extract only the genres' names?
The expected output should be:
Animation
Comedy
Family
Adventure
Fantasy
...
Thanks!
Upvotes: 0
Views: 1489
Reputation: 8690
You can use the expression split(split(genres,"'name':'")[2],"'")[1]
to achieve this.
I created a CSV file that contains your sample data.
Use the above expression in a DerivedColumn transformation to get your expected value.
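To see why the nested split works, here is a rough Python sketch of the same logic. It is only an illustration, not data flow code. The sample string is an assumption, since the original data is only visible in the screenshot, and the function names are hypothetical. Data flow array indexes are 1-based, so [2] and [1] in the expression correspond to [1] and [0] in Python.

```python
import re

def first_genre_name(genres: str) -> str:
    # Mirrors split(genres, "'name':'")[2]: take the text after the first name key.
    after_first_name_key = genres.split("'name':'")[1]
    # Mirrors split(..., "'")[1]: keep everything up to the closing quote.
    return after_first_name_key.split("'")[0]

def all_genre_names(genres: str) -> list[str]:
    # Regex alternative that collects every name, not just the first one.
    return re.findall(r"'name':\s*'([^']+)'", genres)

# Assumed sample value; the real column contents may be formatted differently.
sample = "[{'id':16,'name':'Animation'},{'id':35,'name':'Comedy'}]"
print(first_genre_name(sample))
print(all_genre_names(sample))
```

Note that the split approach returns only the first genre in each row, and the delimiter must match the data exactly (for example, a space after the colon would need to be included in the split string).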
Upvotes: 1