user1542036
user1542036

Reputation: 453

using levenshtein distance ratio to compare 2 records

I've created the mysql user function using the levenshtein distance and ratio source codes. I am comparing 2 records and based on a 75% match I want to select the record.

  1. Order comes into table paypal_ipn_orders with an ITEM title
  2. A query executes against a table itemkey to find a 75% match in a record called ITEM as well
  3. if a 75% title is match it assigns an eight digit number from table itemkey to table paypal_ipn_orders

Here is the query

      UPDATE paypal_ipn_orders
      SET sort_num = (SELECT sort_id
      FROM itemkey
      WHERE  levenshtein_ratio(itemkey.item, paypal_ipn_orders.item_name) > 75)
      WHERE packing_slip_printed = 0 
      AND LOWER(payment_status) = 'completed' 
      AND address_name <> ''
      AND shipping < 100

I have adjusted this a few times but it's failing between line 4 and 5 at the levenshtein_ratio part. If it works it says that the subquery returns more than one row. I don't know how to fix it to make it return the correct result, I just lost as to how to make this work.

Upvotes: 0

Views: 1986

Answers (1)

Andrew
Andrew

Reputation: 4624

A subquery on a SET should only return one value. If itemkey has more than one item that is 75% of item_name what do you want to do? The below will use one of the best matches:

UPDATE paypal_ipn_orders
SET sort_num = (SELECT sort_id
  FROM itemkey
  WHERE  levenshtein_ratio(itemkey.item, paypal_ipn_orders.item_name) > 75
  ORDER BY levenshtein_ratio(itemkey.item, paypal_ipn_orders.item_name) DESC
  LIMIT 1)
WHERE packing_slip_printed = 0 
  AND LOWER(payment_status) = 'completed' 
  AND address_name <> ''
  AND shipping < 100

Upvotes: 2

Related Questions