Melody

Reputation: 73

Athena query results show null values despite IS NOT NULL condition in query

I run the following query in Athena. I want only the rows that have a tag value in the 'resource_tags_aws_cloudformation_stack_name' column. However, the results still include rows where 'resource_tags_aws_cloudformation_stack_name' is empty, and I don't know what I am doing wrong.

    SELECT
        cm.line_item_usage_account_id,
        pr.line_of_business,
        cm.resource_tags_aws_cloudformation_stack_name,
        SUM(CASE WHEN cm.line_item_product_code = 'AmazonEC2'
                 THEN cm.line_item_unblended_cost * 0.97
                 ELSE cm.line_item_unblended_cost END) AS discounted_cost,
        CAST(cm.line_item_usage_start_date AS DATE) AS start_day
    FROM cost_management cm
    JOIN prod_cur_metadata pr
        ON cm.line_item_usage_account_id = pr.line_item_usage_account_id
    WHERE cm.line_item_usage_account_id IN ('1234504482')
        AND cm.resource_tags_aws_cloudformation_stack_name IS NOT NULL
        AND cm.line_item_usage_start_date
            BETWEEN date '2020-01-01' AND date '2020-01-30'
    GROUP BY
        cm.line_item_usage_account_id,
        pr.line_of_business,
        cm.resource_tags_aws_cloudformation_stack_name,
        CAST(cm.line_item_usage_start_date AS DATE)
    HAVING SUM(cm.line_item_blended_cost) > 0
    ORDER BY cm.line_item_usage_account_id

Upvotes: 3

Views: 38951

Answers (2)

Melody

Reputation: 73

I modified my query to exclude rows where the tag is a single space (' ') instead of checking for NULL, and that seems to work:

    SELECT
        cm.line_item_usage_account_id,
        pr.line_of_business,
        cm.resource_tags_aws_cloudformation_stack_name,
        SUM(CASE WHEN cm.line_item_product_code = 'AmazonEC2'
                 THEN cm.line_item_unblended_cost * 0.97
                 ELSE cm.line_item_unblended_cost END) AS discounted_cost,
        CAST(cm.line_item_usage_start_date AS DATE) AS start_day
    FROM cost_management cm
    JOIN prod_cur_metadata pr
        ON cm.line_item_usage_account_id = pr.line_item_usage_account_id
    WHERE cm.line_item_usage_account_id IN ('1234504482')
        AND NOT cm.resource_tags_aws_cloudformation_stack_name = ' '
        AND cm.line_item_usage_start_date
            BETWEEN date '2020-01-01' AND date '2020-01-30'
    GROUP BY
        cm.line_item_usage_account_id,
        pr.line_of_business,
        cm.resource_tags_aws_cloudformation_stack_name,
        CAST(cm.line_item_usage_start_date AS DATE)
    HAVING SUM(cm.line_item_blended_cost) > 0
    ORDER BY cm.line_item_usage_account_id
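
As a rough illustration of why the IS NOT NULL filter alone let those rows through, here is a small standalone query. It assumes the blank tags are stored as empty or whitespace-only strings rather than true NULLs; the inline sample rows and the stack_name alias are made up for the example and are not part of my tables.

```sql
-- Hypothetical sample values: a real tag, an empty string,
-- a single space, and a true NULL.
SELECT
    stack_name,
    stack_name IS NOT NULL AS is_not_null,
    trim(stack_name) <> '' AS has_value
FROM (
    VALUES ('my-stack'), (''), (' '), (CAST(NULL AS varchar))
) AS t (stack_name)
```

The empty string and the single space are both non-NULL, so is_not_null should come back true for them, while has_value should be true only for 'my-stack'. Using the trim-based check in the WHERE clause would be one way to cover both blank variants at once.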

Upvotes: 4

Deepak Kumar

Reputation: 308

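You can handle the single-space case as shown below. COALESCE maps NULL to ' ' as well, so NULL rows are excluded by the same condition:

    AND COALESCE(cm.resource_tags_aws_cloudformation_stack_name, ' ') != ' '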

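Or, if the values may contain multiple spaces, strip the spaces and compare the result with the empty string (in Athena the stripped value is an empty string, not NULL, so an IS NOT NULL check would not filter it). This condition is not suitable if values made up only of spaces are meaningful in your data:

    AND regexp_replace(cm.resource_tags_aws_cloudformation_stack_name, ' ') != ''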

In addition, the data may contain special characters such as CR or LF, although that is a rare scenario.
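
If that turns out to be the case, one possible approach is to strip every whitespace character before comparing. This is only a sketch, assuming that a value made up entirely of whitespace should count as missing; the sample rows and the stack_name alias are hypothetical.

```sql
-- Sketch: keep only rows that still have content after removing
-- all whitespace (spaces, tabs, CR, LF). Sample rows are made up.
SELECT stack_name
FROM (
    VALUES
        ('my-stack'),
        (' '),
        (chr(13) || chr(10)),      -- CR followed by LF
        (CAST(NULL AS varchar))
) AS t (stack_name)
WHERE regexp_replace(stack_name, '\s+') <> ''
```

With these sample rows only 'my-stack' should be returned. In the original query the same condition would be applied to cm.resource_tags_aws_cloudformation_stack_name.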

Upvotes: 1
