Xty83a
Xty83a

Reputation: 95

Search for all combinations of first, middle and last name

In my database there is a table called author, which 4 columns:

A user, for example, is searching for Edgar Allan Poe. In our table Edgar Allan Poe is saved as: firstName- Edgar, middleName - Allan and lastName - Poe. This query is pretty straightforward. But how to write a query that matches not only Edgar Allan Poe, but also Poe Allan Edgar, Edgar Poe, Allan Poe, Edgar Allan, Allan Edgar Poe without writing all these possible combinations on our own? Also when the user is searching, he/she is searching as 'Edgar Allan Poe' or 'Poe Allan Edgar' altogether, not in a separate fields.

Upvotes: 6

Views: 1606

Answers (3)

toonice
toonice

Reputation: 2236

Please try the following...

DROP PROCEDURE IF EXISTS SimilarNames;
DELIMITER //
    CREATE PROCEDURE SimilarNames( authorFullName VARCHAR( 250 ) )
    BEGIN
        SET @authorFullNameCommad = CONCAT( '\'',
                                            REPLACE( authorFullName,
                                                     ' ',
                                                     '\', \'' ),
                                            '\'' );

        SET @selectStatementString := CONCAT( "SELECT authorID,",
                                              "       firstName,",
                                              "       middleName,",
                                              "       lastName ",
                                              "FROM author ",
                                              "WHERE ( ( firstName IN ( ",
                                              @authorFullNameCommad,
                                              " ) ) + ( middleName IN ( ",
                                              @authorFullNameCommad,
                                              " ) ) + ( lastName IN ( ",
                                              @authorFullNameCommad,
                                              " ) ) ) >=2;" );

        PREPARE selectStatement FROM @selectStatementString;
        EXECUTE selectStatement;
        DEALLOCATE PREPARE selectStatement;
   END //
DELIMITER ;
CALL SimilarNames( 'Edgar Allan Poe' );

This solution starts by creating a PROCEDURE called SimilarNames (after DROPping any existing versions of the PROCEDURE). This PROCEDURE stores the name passed to it ( such as 'Edgar Allan Poe') in the parameter variable authorFullName.

Once begun, the PROCEDURE starts by converting a string such as Edgar Allan Poe into 'Edgar', 'Allan', 'Poe' and storing it in the variable @authorFullNameCommad.

The CONCAT() function is then used to form the text of the SQL statement that will produce our results. Where authorFullName is Edgar Allan Poe the following statement is produced and stored in @selectStatementString...

SELECT authorID,
       firstName,
       middleName,
       lastName
FROM author
WHERE ( ( firstName IN ( 'Edgar', 'Allan', 'Poe' ) ) + ( middleName IN ( 'Edgar', 'Allan', 'Poe' ) ) + ( lastName IN ( 'Edgar', 'Allan', 'Poe' ) ) ) >=2;

The SQL statement is then PREPAREd and EXECUTEd, thus producing the desired list when the PROCEDURE is called, which may be done using...

CALL SimilarNames( 'Edgar Allan Poe' );

Please note that you do not have to declare the PROCEDURE after doing so for the first time. i.e. The following will work just fine...

CALL SimilarNames( 'Edgar Allan Poe' );
CALL SimilarNames( 'James Tiberius Kirk' );

Also, please note that this particular method is vulnerable to SQL Injection. I can develop a version protected against that if you wish - it's just late at the moment and I will be going to bed shortly.

My statement was tested against a sample dataset created using the following script...

CREATE TABLE author
(
    authorID     INT   NOT NULL AUTO_INCREMENT,
    firstName    VARCHAR( 50 ),
    middleName   VARCHAR( 50 ),
    lastName     VARCHAR( 50 ),
    PRIMARY KEY ( authorID )
);
INSERT INTO author ( firstName,
                     middleName,
                     lastName )
VALUES ( 'Edgar', 'Allan', 'Poe' ),
       ( 'Poe', 'Allan', 'Edgar' ),
       ( 'Edgar', 'Poe', '' ),
       ( 'Edgar', '', 'Poe' ),
       ( '', 'Edgar', 'Poe' ),
       ( 'Allan', 'Poe', '' ),
       ( 'Edgar', 'Allan', '' ),
       ( 'Allan', 'Edgar', 'Poe' ),
       ( 'Edgar', 'Allan', 'Allan' ),
       ( 'James', 'Tiberius', 'Kirk' ),
       ( 'Karl', 'Ignatius', 'von Bach' ),
       ( 'Edgar', 'Poe', 'xyz' ),
       ( 'Allanah', 'Poelsen', '' );

The results were as I expected.

If you have any questions or comments, then please feel free to post a Comment accordingly.

Further Reading

https://dev.mysql.com/doc/refman/5.7/en/call.html (on MySQL's CALL statement)

https://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_concat (on MySQL's CONCAT() function)

https://dev.mysql.com/doc/refman/5.7/en/create-procedure.html (on MySQL's CREATE PROCEDURE statement)

https://dev.mysql.com/doc/refman/5.7/en/deallocate-prepare.html (on MySQL's DEALLOCATE statement)

https://dev.mysql.com/doc/refman/5.7/en/stored-programs-defining.html (on MySQL's DELIMITER command)

https://dev.mysql.com/doc/refman/5.7/en/drop-procedure.html (on MySQL's DROP PROCEDURE statement)

https://dev.mysql.com/doc/refman/5.7/en/execute.html (on MySQL's EXECUTE statement)

https://dev.mysql.com/doc/refman/5.7/en/comparison-operators.html#function_in (on MySQL's IN operator)

https://dev.mysql.com/doc/refman/5.7/en/prepare.html (on MySQL's PREPARE statement)

https://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_replace (on MySQL's REPLACE() function)

https://dev.mysql.com/doc/refman/5.7/en/set-statement.html (on MySQL's SET statement)

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1269623

In any database, you can do something like this:

where firstName in ('Edgar', 'Allan', 'Poe') and
      middleName in ('Edgar', 'Allan', 'Poe') and
      middleName <> firstName and
      lastname in ('Edgar', 'Allan', 'Poe') and
      lastname not in (firstname, middleName)

This is actually pretty easy to extend to more names, if you like -- assuming the names are distinct as in your example (if you want to allow authors with duplicated names, just remove lines 3 and 5 from the above query).

However, depending on your database, you might want to use regular expressions or full text search instead.

Upvotes: 3

paxdiablo
paxdiablo

Reputation: 881293

For a generic DBMS, assuming your intent is just to simplify your queries, one way would be to add a separate computed column to the author table, one which is triggered on insert and update.

In other words, have a clumn called searchFullName and have it set to:

<space><firstName><space><secondName><space><lastName><space>

Then you query becomes a (relatively) simple:

select something from author
where searchFullName like '% Poe %'
  and searchFullName like '% Edgar %'

It won't be blindingly fast for large tables but it should achieve what you want.

Upvotes: 0

Related Questions