Reputation: 23
I have a dimension (a column from google sheets) called products with the following values:
product = [apple , apple_old_2019, pineapple , pineapple_old_2020, pineapple_old_2017 ...]
I need to regex then and remove the pattern old_****
and then aggregate the values by name.
In Google Sheets I would replace the values and then use the Unique formula, but in Google Data Studio there isn't such function.
I created a custom field called Product_pre with this formula:
REGEXP_EXTRACT(Product , '^(.+?)(_old_[0-9]{2}-[0-9]{4})' )
Then I created another custom field with the following formula:
CASE
WHEN Product_pre_process is null THEN Product
ELSE Product_pre_process
END
The problem is that the result has duplicated values:
product_processed = [apple , apple, pineapple , pineapple, pineapple ...]
How could I fix that?
Upvotes: 2
Views: 13868
Reputation: 6472
1) Extract the First Word
The REGEXP_EXTRACT
function below does the trick (extracting all the characters from the beginning of each string till the first instance of _
):
REGEXP_EXTRACT(Product , "^([^_]*)")
2) Consolidation
If the chart type is a Table, then removing the rest of the dimensions and leaving just the newly created dimension will result in the metric values automatically aggregating based on the two values in the dimension (apple
and pineapple
).
Google Data Studio Report as well as a GIF to visualise the above:
Upvotes: 3