Reputation: 4054
╔═══════════════╗
║ dummy_data ║
╠═══════════════╣
║ XXX_1234_YYYY ║
║ XXX_5678_YYYY ║
║ XXX_9101_YYYY ║
╚═══════════════╝
╔═══════════════╦═════════════╦═════════════╗
║ dummy_data ║ with_prefix ║ with_suffix ║
╠═══════════════╬═════════════╬═════════════╣
║ XXX_1234_YYYY ║ Pre_1234 ║ 1234_Suf ║
║ XXX_5678_YYYY ║ Pre_5678 ║ 5678_Suf ║
║ XXX_9101_YYYY ║ Pre_9101 ║ 9101_Suf ║
╚═══════════════╩═════════════╩═════════════╝
Currently I am trying to write a Presto Query
to use on Amazon Athena
,
select
dummy_data,
'Pre_' || split(dummy_data,'_')[2] as with_prefix,
split(dummy_data,'_')[2] || '_Suf' as with_suffix
from dummy_table
the above query causes split(dummy_data,'_')[2]
to be computed two times. If there a way to avoid the double computation ? Also does Presto figure out that its the same computation and avoids recomputing ?
I am trying to avoid using sub-query or With clause.
Upvotes: 0
Views: 360
Reputation: 20770
Currently Presto does not de-decuplicate sub-expression calculations. I created feature request for this: https://github.com/prestosql/presto/issues/1070.
You can see that the expression is evaluated more than once by checking the EXPLAIN
output (see the issue for example).
As you know, you can force de-duplication by using an inline view:
SELECT
dummy_data,
'Pre_' || sub_data AS with_prefix,
sub_data || '_Suf' AS with_suffix
FROM (
SELECT
dummy_data,
split(dummy_data,'_')[2] AS sub_data
FROM (VALUES 'XXX_1234_YYYY') t(dummy_data)
);
Upvotes: 2