Reputation: 110257
How would I run the following query, like I would again mysql:
SELECT * FROM [integrated-myth-15.testing_data_100k]
WHERE title='down in la'
Now it will match if I have a perfect case-sensitive string match, but how would I do it case insensitive for "down in la"? I'm working from the Web console.
Upvotes: 20
Views: 42575
Reputation: 45135
Excuse me if this is way off. I have not used the product, I'm reading the docs to research it.
I've found the following which might be of use.
CONTAINS_SUBSTR
Performs a normalized, case-insensitive search to see if a value exists as a substring in an expression. Returns TRUE if the value exists, otherwise returns FALSE.
https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#contains_substr
This is interesting because the case sensitivity seems to be built-in to the function, which tells me that there may be others which work this way, and that it'll just work the way most people would expect it to :)
COLLATE
Also, I wonder whether you can apply a collation at query time to help.
https://cloud.google.com/bigquery/docs/reference/standard-sql/collation-concepts#collate_define
-- Assume there is a table with this column declaration:
CREATE TABLE table_a
(
col_a STRING COLLATE 'und:ci',
col_b STRING COLLATE '',
col_c STRING,
col_d STRING COLLATE 'und:ci'
);
-- This runs. Column 'b' has a collation specification and the
-- column 'c' does not.
SELECT STARTS_WITH(col_b_expression, col_c_expression)
FROM table_a;
-- This runs. Column 'a' and 'd' have the same collation specification.
SELECT STARTS_WITH(col_a_expression, col_d_expression)
FROM table_a;
-- This runs. Even though column 'a' and 'b' have different
-- collation specifications, column 'b' is considered the default collation
-- because it's assigned to an empty collation specification.
SELECT STARTS_WITH(col_a_expression, col_b_expression)
FROM table_a;
-- This works. Even though column 'a' and 'b' have different
-- collation specifications, column 'b' is updated to use the same
-- collation specification as column 'a'.
SELECT STARTS_WITH(col_a_expression, COLLATE(col_b_expression, 'und:ci'))
FROM table_a;
-- This runs. Column 'c' does not have a collation specification, so it uses the
-- collation specification of column 'd'.
SELECT STARTS_WITH(col_c_expression, col_d_expression)
FROM table_a;
Upvotes: 2
Reputation: 59
The best way is to append "IGNORE CASE" at the end of your query.
SELECT * FROM [integrated-myth-15.testing_data_100k] WHERE title='down in la' IGNORE CASE
Note: This will work only with legacy SQL
As mentioned in the official documentations:
String functions operate on string data. String constants must be enclosed with single or double quotes. String functions are case-sensitive by default. You can append IGNORE CASE to the end of a query to enable case- insensitive matching. IGNORE CASE works only on ASCII characters and only at the top level of the query.
Upvotes: 4
Reputation: 33745
The standard way to do this is using LOWER
or UPPER
on the input string, e.g.:
#legacySQL
SELECT * FROM [integrated-myth-15.testing_data_100k]
WHERE LOWER(title) = 'down in la';
Or:
#standardSQL
SELECT * FROM `integrated-myth-15.testing_data_100k`
WHERE LOWER(title) = 'down in la';
Upvotes: 41