pandora
pandora

Reputation: 59

Searching a List of strings in a SQL column?

I am an MPH student taking an INTRO to data science class writing my second SQL code. I have created a csv file. One of the columns is a string.

title      author
------------------------------
Boo        'Harry, Joe, Nancy'
Engine     'Harry, Mike, Sue'
Pig        'Amy, Kelly, Bob'

I have uploaded this csv file into postgreSQL using python.

I need to create a query where the user enters an author name, it searches the database table, and returns all the titles by that author. There are actually 3 author columns (last, first and MI) and all three are strings with multiple authors, but I haven't tackled that problem yet.

If someone enters Harry, it should give me back the rows with Boo and Engine.

I've tried:

SELECT 
    title
FROM 
    table_name
WHERE 
    author = "Harry"

but it says Harry doesn't exist. I've also tried using 'Harry' and I got the column, but nothing in it.

I'm assuming I need to split the column with names into a list of strings? where each name is it's own string? To get:

title      author
----------------------------------
Boo        'Harry', 'Joe', 'Nancy'
Engine     'Harry', 'Mike', 'Sue'
Pig        'Amy', 'Kelly', 'Bob'

I tried using

SELECT 
    title
    authors
FROM 
    table_name
    CROSS APPLY STRING_SPLIT(authors, ',');

This says that there is a syntax error:

ERROR:  syntax error at or near "APPLY"
LINE 5:  CROSS APPLY STRING_SPlIT(author_last, ',');

I got the code from http://www.sqlservertutorial.net/sql-server-string-functions/sql-server-string_split-function/

A point in the right direction is appreciated. Thank you.

UPDATE: I Used Tim's suggestion and tried

SELECT 
    article_title, author_last
FROM paper_table
WHERE author_last ~ '\yHarry\y';

and got this pic of query I think it means it ran, but didn't find anything?

UPDATE2- Using Metal's suggstion: picture of query And I know it's there bc I can see it. pic of table

UPDATE3- using Metal's second suggestion. pic of second query

Upvotes: 0

Views: 6862

Answers (3)

user330315
user330315

Reputation:

You don't need to unnest the array, you can use a regex to split the string and get rid of the whitespace by that.

SELECT title, author
FROM the_table
WHERE 'Harry' = any(regexp_split_to_array(author, '\s*,\s*'))

Upvotes: 0

Ed Bangga
Ed Bangga

Reputation: 13006

You can use like operator.

select title from table_name where author like '%Harry,%' or author like '%,Harry%'

Above solution is to be assumed you want those comma separated list that contains 'Harry'. I would suggest below solution. Which will work same as string_split function in sql server.

select t.title 
from table_name t, unnest(string_to_array(t.author , ',')) s(val)
where trim(s.val) = 'Harry'

See dbFiddle.

Upvotes: 1

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521249

You should seriously try to avoid storing CSV data in your SQL tables, as it would generally cause you trouble when you try to query (as in this question).

You could try searching for an input name using regex, with the name being surrounded by word boundaries:

SELECT title, author
FROM table_name
WHERE author ~ '\yHarry\y';

screen capture of demo below

Demo

Upvotes: 0

Related Questions