data princess
data princess

Reputation: 1158

Why can't BigQuery cast this number as an integer?

In my query, I have a value formatted as a dollar amount, like this:

Coverage_Amount
$10,000
$15,000
null
$2,000

So I remove the extra characters and map the null to 0. I get a column back like this:

Coverage_Amount
10000
15000
0
2000

However, these values are stored as strings, and when I try something like this:

CASE
  WHEN Coverage_Amount IS NOT NULL THEN INTEGER(REGEXP_REPLACE(query.Coverage_Amount, r'\$|,', ''))
  ELSE 0
END AS Coverage_Amount

I get back

Coverage_Amount
null
null
0
null

The documentation for the INTEGER() function says

Casts expr to a 64-bit integer. Returns NULL if is a string that doesn't correspond to an integer value.

Is there anything I can do to make BigQuery recognize that these are in fact integers?

Upvotes: 0

Views: 9980

Answers (3)

Vasily  Bronsky
Vasily Bronsky

Reputation: 433

It could be because you have spaces after 0 at the end of string.

I mean f.e. '&10000 '. So you can try to use RTRIM(value, ' ')

SELECT 
  Coverage_Amount, 
  IFNULL(INTEGER(REGEXP_REPLACE(RTRIM(Coverage_Amount, ' '), r'\$|,', '')),0) AS val
FROM   
  (SELECT '$10,000 ' Coverage_Amount)

to delete all spaces from the end of string

Then output will be:

 Row    Coverage_Amount    val   
 1      $10,000         10000

Upvotes: 1

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173171

Both below versions for BigQuery (respectivelly Legacy SQL and StandardSQL) work and return below result

Coverage_Amount   val    
10000           10000    
15000           15000    
2000             2000    

Legacy SQL

#legacySQL
SELECT 
  Coverage_Amount, 
  IFNULL(INTEGER(REGEXP_REPLACE(Coverage_Amount, r'\$|,', '')), 0) AS val
FROM   
  (SELECT '10000' Coverage_Amount),
  (SELECT '15000' Coverage_Amount),
  (SELECT '2000' Coverage_Amount) 

Standard SQL

#standardSQL
WITH `project.dataset.table` AS (
  SELECT '10000' Coverage_Amount UNION ALL
  SELECT '15000' UNION ALL
  SELECT '2000' 
)
SELECT 
  Coverage_Amount, 
  IFNULL(CAST(REGEXP_REPLACE(Coverage_Amount, r'\$|,', '') AS INT64), 0) AS val
FROM `project.dataset.table`

Obviously, same works for '$15,000' and '$10,000' and '$2,000' etc.

Upvotes: 3

Willian Fuks
Willian Fuks

Reputation: 11797

Are you using Standard? This worked for me (notice I use the CAST operator):

WITH data as(
select "$10,000" d UNION ALL
select "$15,000" UNION ALL
select "$2,000")

SELECT
d,
CAST(REGEXP_REPLACE(d, r'\$|,', '') AS INT64) AS Coverage_Amount
FROM data

Upvotes: 1

Related Questions