sleepy_daze

Reputation: 551

Converting substring to DATE with str_to_date

I've got what is probably a very simple issue, but I've researched it extensively and haven't found a solution yet. Basically, I want to convert a modified string variable into the DATE format (specifically %Y).

I have a column variable called dob, which includes dates in the VARCHAR format. The values of these strings vary and can look like any of the following: 01 JAN 1900, ABT 1960, or Unknown. Nonetheless, the year is always the last four digits, so I'm grabbing the year by creating a substring. But I want to convert that substring into a YEAR format. My thought is that I need to use str_to_date to accomplish this.

This is my MySQL query:

SELECT dob, STR_TO_DATE(SUBSTRING(dob, -4), "%Y") as YEAR
FROM person_table;

Upon running it, I only get NULL values. Is there something I'm missing?

Here are my MySQL specs:

innodb_version: 5.7.20
protocol_version: 10

Thanks for your help!

Edit: Providing SQL Mode Information:

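+---------------+-------------------------------------------------------------------------------------------------------------------------------------------+
| Variable_name | Value                                                                                                                                     |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------+
| sql_mode      | ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------+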
1 row in set (0.00 sec)

Upon running the query SELECT STR_TO_DATE('2009','%Y'); I get NULL and the following warning:

+---------+------+-----------------------------------------------------------+
| Level   | Code | Message                                                   |
+---------+------+-----------------------------------------------------------+
| Warning | 1411 | Incorrect datetime value: '2009' for function str_to_date |
+---------+------+-----------------------------------------------------------+
1 row in set (0.00 sec)

Upvotes: 0

Views: 1056

Answers (1)

Schwern

Reputation: 165606

str_to_date returns a date type, and a year on its own is an incomplete date. str_to_date fills in the missing parts with zeroes, unless the NO_ZERO_DATE or NO_ZERO_IN_DATE SQL mode disallows such values. Both are enabled alongside Strict SQL Mode in the default sql_mode since MySQL 5.7, which is what your settings show; strict mode avoids the worst of MySQL's quirks.

The MySQL manual describes strict mode as follows: "Strict mode controls how MySQL handles invalid or missing values in data-change statements such as INSERT or UPDATE. A value can be invalid for several reasons. For example, it might have the wrong data type for the column, or it might be out of range."

Without Strict SQL Mode, MySQL will turn "2009" into the invalid date 2009-00-00.

mysql> set sql_mode = '';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT STR_TO_DATE('2009','%Y');
+--------------------------+
| STR_TO_DATE('2009','%Y') |
+--------------------------+
| 2009-00-00               |
+--------------------------+
1 row in set (0.00 sec)

With Strict SQL Mode it will not.

mysql> show variables like 'sql_mode';
+---------------+-----------------------------------------------------------------------------------------------------------------------+
| Variable_name | Value                                                                                                                 |
+---------------+-----------------------------------------------------------------------------------------------------------------------+
| sql_mode      | ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION |
+---------------+-----------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT STR_TO_DATE('2009','%Y');
+--------------------------+
| STR_TO_DATE('2009','%Y') |
+--------------------------+
| NULL                     |
+--------------------------+
1 row in set, 1 warning (0.00 sec)

mysql> show warnings;
+---------+------+-----------------------------------------------------------+
| Level   | Code | Message                                                   |
+---------+------+-----------------------------------------------------------+
| Warning | 1411 | Incorrect datetime value: '2009' for function str_to_date |
+---------+------+-----------------------------------------------------------+
1 row in set (0.00 sec)
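
To see that the missing month and day are what trip the parser, rather than the year itself, here is a small sketch you can run in the same session. The '-01-01' suffix is an assumed placeholder for the missing parts, not something taken from your data.

```sql
-- Same session, same sql_mode as above.
-- Supplying a placeholder month and day (assumed here to be January 1st)
-- gives STR_TO_DATE a complete date, so it no longer needs zero parts.
SELECT STR_TO_DATE(CONCAT('2009', '-01-01'), '%Y-%m-%d') AS padded_date;

-- YEAR() extracts the year from the resulting DATE as a number.
SELECT YEAR(STR_TO_DATE(CONCAT('2009', '-01-01'), '%Y-%m-%d')) AS year_only;
```

The first query should return 2009-01-01 and the second 2009, with no warning.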

To solve your problem, instead of trying to do a one-size-fits-all conversion, I would recommend trying several formats from most to least specific using coalesce. You will have to add the missing date parts as needed.

select coalesce(
  str_to_date(dob, '%d %b %Y'),
  str_to_date(concat(dob, '-01-01'), 'ABT %Y-%m-%d')
)
from person_table;

As this is very ugly to do, I also would recommend adding a proper date column and doing an update to convert from the messy string dates to proper dates. Then query the new column going forward.

alter table person_table add column dob_date date;

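-- Note: under strict SQL mode a failed str_to_date inside an UPDATE may raise
-- error 1411 instead of returning NULL. If that happens, relaxing sql_mode for
-- this session (set sql_mode = '') before the update is one workaround.
update person_table
set dob_date = coalesce(
  str_to_date(dob, '%d %b %Y'),
  str_to_date(concat(dob, '-01-01'), 'ABT %Y-%m-%d')
)
where dob_date is null;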

You can then check for people with a null dob_date, examine their dob field, and adapt your conversion. Iterate as needed.
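
A sketch of that check, assuming the dob_date column added above. Grouping by the raw string is just one way to see which leftover formats are most common; the occurrences alias is made up for illustration.

```sql
-- List the raw dob strings that the conversion could not parse,
-- most frequent first, so the next format to handle is obvious.
select dob, count(*) as occurrences
from person_table
where dob_date is null
  and dob is not null
group by dob
order by occurrences desc;
```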


UPDATE

To add, yes, I need the year 2020 as opposed to the string. The reason being is because I need to compare the year values.

As strings they will not compare as you need. Strings compare character by character, so the string '200' is greater than the string '1999' because '2' sorts after '1'.

mysql> select '1999' < '2000';
+-----------------+
| '1999' < '2000' |
+-----------------+
|               1 |
+-----------------+
1 row in set (0.00 sec)

mysql> select '1999' < '200';
+----------------+
| '1999' < '200' |
+----------------+
|              1 |
+----------------+
1 row in set (0.00 sec)

You need to cast them to signed integers.

mysql> select cast("1999" as signed) < cast('2000' as signed);
+-------------------------------------------------+
| cast("1999" as signed) < cast('2000' as signed) |
+-------------------------------------------------+
|                                               1 |
+-------------------------------------------------+
1 row in set (0.00 sec)

mysql> select cast("1999" as signed) < cast('200' as signed);
+------------------------------------------------+
| cast("1999" as signed) < cast('200' as signed) |
+------------------------------------------------+
|                                              0 |
+------------------------------------------------+
1 row in set (0.01 sec)

So your query would be...

select dob, cast(substring(dob, -4) as signed) as year
from person_table;
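
One caveat: a value such as Unknown does not end in a year, and casting 'nown' to a number yields 0 with a warning rather than NULL. A hedged variant, assuming every usable value ends in exactly four digits, guards the cast with a regular expression. The birth_year and parsed names and the 1950 cutoff are only illustrative.

```sql
-- Only cast when the last four characters are digits; otherwise birth_year is NULL.
select dob, birth_year
from (
  select dob,
         case
           when dob regexp '[0-9]{4}$' then cast(substring(dob, -4) as signed)
         end as birth_year
  from person_table
) as parsed
-- Example numeric comparison on the extracted year (1950 is an arbitrary cutoff).
where birth_year < 1950;
```

Rows whose dob has no trailing year drop out of the comparison because NULL < 1950 is not true.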

Upvotes: 2
