exiang
exiang

Reputation: 559

mysql filter with dynamic json key

I need to select SUM of all value which is realized (=true) from the mysql data set below. In this example, I should get 1000 only.

Please note that the key is dynamic (year\month\day).

I am able to get a list of value by running the sql below:

SELECT (JSON_EXTRACT(json_value, "$**.value")) AS total FROM milestone

However, i can't seem to get the filter part works with this:

JSON_CONTAINS_PATH(json_value, 'all', "$**.realized") IS NOT NULL

id=1, column json_value

{
  "2018": {
    "5": {
      "4": {
        "value": "5000"
      }
    },
    "12": {
      "4": {
        "value": "80000",
        "realized": "false"
      }
    }
  }
}

id=2, column json_value

{
  "2016": {
    "12": {
      "4": {
        "value": "1000",
        "realized": "true"
      }
    }
  }
}

Upvotes: 1

Views: 2761

Answers (2)

Bill Karwin
Bill Karwin

Reputation: 562871

Here's a query that works on MySQL 5.7:

SELECT SUM(
  JSON_UNQUOTE(
    JSON_EXTRACT(
      json_value,
      CONCAT(
        SUBSTRING_INDEX(
          JSON_UNQUOTE(JSON_SEARCH(json_value, 'all', 'true')), '.', 4),
        '.value'
      )
    )
  )
) AS sum
FROM milestone 
WHERE SUBSTRING_INDEX(
    JSON_UNQUOTE(JSON_SEARCH(json_value, 'all', 'true')),
  '.', -1) = 'realized'

This is quite complex, and hard to develop and maintain. Also it might not handle cases where you have multiple realized: true entries in a given JSON document. Or if there are other JSON keys besides "realized" that have the value "true". It will be quite tricky to even think of the edge cases, let alone handle them in code.

Out of curiosity, why not store this data as a traditional table?

CREATE TABLE milestone (
  milestone_id INT NOT NULL,
  date DATE NOT NULL,
  value INT NOT NULL,
  realized BOOL,
  PRIMARY KEY (milestone_id, date)
);

INSERT INTO milestone VALUES
(1, '2018-05-04',  5000, NULL),
(1, '2018-12-04', 80000, false),
(2, '2016-12-04',  1000, true);

Then it's simpler to do your query:

SELECT SUM(value) FROM milestone WHERE realized = true;

I'm curious because I'm seeing more people using JSON in MySQL for cases where it would be easier to use normal tables and column. Easier to write the query, easier to understand the query, and more efficient to store the data and optimize the query performance.

Using JSON in the way you are doing makes queries more difficult than they should be.

Upvotes: 1

wchiquito
wchiquito

Reputation: 16569

In modern versions of MySQL (>= 8.0.4) the query would be relatively simple (see JSON_TABLE):

SELECT
  `milestone`.`id`,
  SUM(`der`.`value`) `total`
FROM
  `milestone`,
  JSON_TABLE(
    JSON_ARRAY(`json_value`),
    '$[*]' COLUMNS(
      NESTED PATH '$**.*' COLUMNS(
        `value` DECIMAL(10, 2) PATH '$.value',
        `realized` VARCHAR(5) PATH '$.realized'
      )
    )
  ) `der`
WHERE
  `der`.`value` IS NOT NULL AND
  `der`.`realized` = 'true'
GROUP BY
  `milestone`.`id`;

See db-fiddle.

Upvotes: 2

Related Questions