Populate MYSQL Column from specific rows in other table

Question

I am new to SQL (using mySQL Community Workbench) and not sure where to begin with this problem.

Here is the overview: I have two tables in my food database: branded_food and food_nutrient

The important columns in branded_food are fdc_id and kcals.

The important columns in food_nutrient are fdc_id, nutrient_id, and value

branded_food's fdc_id column indexes into food_nutrient's fdc_id column. However, this returns every nutrient in the food, when I only want nutrient id 208's value entry.

Here is an example:

branded_food looks like:

fdc_id | kcals
-----------------
123    |  (Empty)
456    |  (Empty)
...    |  (Empty)

food_nutrient looks like:

fdc_id | nutrient_id | value
----------------------------
123    |  203        | 23
123    |  204        | 25
123    |  ...        | ...
123    |  208        | 500

Essentially, I would like to write some sort of loop that goes through each fdc_id in branded_food, finds the row in food_nutrient that has fdc_id equal to the looped value, and then populate the kcals in the row of the fdc_id in branded_food. Thus the first example row should populate like:

fdc_id | kcals
-----------------
123    |  500

As an update, I have looked at INNER JOIN and have created this:

SELECT food_nutrient.amount,food_branded_food.description, food_branded_food.fdc_id
FROM food_nutrient
INNER JOIN food_branded_food ON food_nutrient.fdc_id = food_branded_food.fdc_id 
WHERE food_nutrient.nutrient_id = 208
LIMIT 1;

This will correctly display the kcals of the food_branded_food.description (the name of the food) that has fdc_id of food_branded_food.fdc_id. I limit to 1 because the query takes very long. Is there a better way?

Update #2: Here is something I recently tried, but just spins forever:

UPDATE backup_branded_food bf
INNER JOIN (
    SELECT food_nutrient.fdc_id,food_nutrient.amount amt FROM food_nutrient WHERE food_nutrient.nutrient_id = 208
            ) mn ON bf.fdc_id = mn.fdc_id
SET bf.kcals = mn.amt
WHERE bf.kcals IS NULL;

Running explain:

And SHOW CREATE TABLE food_nutrient

| food_nutrient | CREATE TABLE `food_nutrient` (
  `id` bigint DEFAULT NULL,
  `fdc_id` bigint DEFAULT NULL,
  `nutrient_id` bigint DEFAULT NULL,
  `amount` bigint DEFAULT NULL,
  `data_points` bigint DEFAULT NULL,
  `derivation_id` bigint DEFAULT NULL,
  `min` double DEFAULT NULL,
  `max` double DEFAULT NULL,
  `median` double DEFAULT NULL,
  `loq` text,
  `footnote` text,
  `min_year_acquired` text
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |

Running SHOW CREATE TABLE backup_branded_food (I use a backup of branded food instead of the actual table)

| backup_branded_food | CREATE TABLE `backup_branded_food` (
  `fdc_id` bigint DEFAULT NULL,
  `data_type` text,
  `description` text,
  `food_category_id` bigint DEFAULT NULL,
  `publication_date` text,
  `brand_owner` varchar(255) DEFAULT NULL,
  `brand_name` varchar(255) DEFAULT NULL,
  `serving_size` double DEFAULT NULL,
  `serving_size_unit` varchar(50) DEFAULT NULL,
  `kcals` double DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |

FanoFN · Accepted Answer

Table Indexes:
The table structure info obtained from SHOW CREATE TABLE table_name shows that both table don't have any indexes and/or primary key. This is probably why your query runs very slow. To quickly fix this issue, let's start by adding indexes on columns appear in WHERE and ON (in the JOIN):

ALTER TABLE food_nutrient 
      ADD INDEX fdc_id(fdc_id), 
      ADD INDEX nutrient_id(nutrient_id);

ALTER TABLE branded_food 
      ADD INDEX fdc_id(fdc_id);

With these indexes added, the EXPLAIN shows the following:

id	select_type	table	partitions	type	possible_keys	key	key_len	ref	rows	filtered	Extra
1	SIMPLE	fn		ref	fdc_id,nutrient_id	nutrient_id	9	const	1	100.00	Using where
1	SIMPLE	bf		ref	fdc_id	fdc_id	9	db_40606077.fn.fdc_id	1	100.00

Since I don't know the size of the table, I can't really test how quick the query will be after adding these indexes but I assume that this will improve the query speed significantly.

P/S: Normally, you would have at least 1 column assigned as PRIMARY KEY - which will never have any duplicates. In your table food_nutrient, there's an id column that might be the PRIMARY KEY but there's a possible unique combination between fdc_id and nutrient_id. Therefore, you might consider adding UNIQUE KEY on those two columns apart from adding PRIMARY KEY on id. 24.6.1 Partitioning Keys, Primary Keys, and Unique Keys

Usage of aliases:
This is to help make your query more readable. You didn't use any in your current query so you end up appending the full table name on column(s) that you're using in you operations:

....
FROM food_nutrient AS fn 
INNER JOIN food_branded_food fbf /*can simply be written without "..AS.."*/
ON fn.fdc_id = fbf.fdc_id /*the operation afterwards didn't require you to append full table name*/
...

Similarly, once you've added the table alias, you can use it in SELECT too:

SELECT fn.amount, fbf.description, 
       fbf.fdc_id AS 'FBF_id' 
          /*you can also assign a custom/desired alias to your column - as your output column name*/
...

Couldn't find official documentation on MySQL website but here's a further explanation from a different site.

Alternative UPDATE syntax:
Your current UPDATE query should be able to perform what you need but you probably don't need the subquery at all. This UPDATE query should work as well:

UPDATE branded_food bf 
 JOIN food_nutrient fn ON bf.fdc_id = fn.fdc_id
 SET bf.kcals = fn.amount 
WHERE fn.nutrient_id = 208
AND bf.kcals IS NULL;

Here's a demo fiddle for reference

Populate MYSQL Column from specific rows in other table

Answers (2)

Related Questions