wisicoc
wisicoc

Reputation: 113

How to build a query with an optional filter resistant to SQL injections?

There is such a database:

enter image description here

And such a stored function:

CREATE FUNCTION fetch_mentor(direction_type_list_in text, education_type_list_in text, name_in text, city_id_in integer) RETURNS json
    LANGUAGE plpgsql
AS
$$
DECLARE
    direction_type_list_inner INT[];
    education_type_list_inner INT[];
BEGIN
    direction_type_list_inner = STRING_TO_ARRAY(direction_type_list_in, ',');
    education_type_list_inner = STRING_TO_ARRAY(education_type_list_in, ',');

    RETURN (SELECT JSON_AGG(rows)
                FROM (SELECT m.id                                     AS "ID"
                           , FORMAT('%s %s', m.firstname, m.lastname) AS "fullName"
                           , m.photo_url                              AS "photoURL"
                           , m.video_url                              AS "videoURL"
                           , (SELECT JSON_AGG(mc.name)
                                  FROM mentors.mentor_competence mc
                                  WHERE mc.mentor_id = m.id)          AS "competenceList"
                           , (SELECT JSON_AGG(el.name)
                                  FROM mentors.mentor_employment            me
                                           INNER JOIN lists.employment_list el ON el.id = me.employment_id
                                  WHERE me.mentor_id = m.id)          AS "competenceList"
                           , m.certified                              AS "certified"
                           , cl.display_name                          AS "cityName"
                           , (SELECT JSON_AGG(dtl.display_name)
                                  FROM mentors.mentor_direction_type            mdt
                                           INNER JOIN lists.direction_type_list dtl ON mdt.direction_type_id = dtl.id
                                  WHERE mdt.mentor_id = m.id)            "directionList"
                           , (SELECT JSON_AGG(etl.display_name)
                                  FROM mentors.mentor_education_type            met
                                           INNER JOIN lists.education_type_list etl ON met.education_type_id = etl.id
                                  WHERE met.mentor_id = m.id)            "educationList"
                          FROM mentors.mentor                m
                                   LEFT JOIN lists.city_list cl ON m.city_id = cl.id
                          WHERE approved = TRUE
                            AND (direction_type_list_in ISNULL OR 
                                 m.id IN (SELECT m.id
                                              FROM mentors.mentor_direction_type mdt
                                              WHERE mdt.direction_type_id = ANY (direction_type_list_inner)))
                            AND (education_type_list_inner ISNULL OR
                                 m.id IN (SELECT m.id
                                              FROM mentors.mentor_education_type met
                                              WHERE met.education_type_id = ANY (education_type_list_inner)))
                            AND (name_in ISNULL OR
                                 (m.firstname LIKE FORMAT('%%%s%%', $3) OR m.lastname LIKE FORMAT('%%%s%%', $3)))
                            AND (city_id_in ISNULL OR m.city_id = city_id_in)
                     ) rows);
END;
$$;

ALTER FUNCTION fetch_mentor(TEXT, TEXT, TEXT, INTEGER) OWNER TO postgres;

Which returns such a json:

[
    {
        "ID": 3,
        "fullName": "fsafd 413",
        "photoURL": "sadf",
        "videoURL": "dsa",
        "competenceList": [
            "a",
            "s",
            "f"
        ],
        "employmentList": [
            "a",
            "b",
            "c"
        ],
        "certified": false,
        "cityName": null,
        "directionList": [
            "x",
            "z"
        ],
        "educationList": [
            "offline"
        ]
    }
]

The following parameters come to the input:

type FetchMentorsParams struct {
    DirectionTypeList []int   `json:"directionTypeList"`
    EducationTypeList []int   `json:"educationTypeList"`
    MentorName        *string `json:"mentorName"`
}

All filters are optional. That is, if there are no filters, then all records are output. If, for example, Direction Type List = [1,2], then only those records should be returned where mentor_id in the table mentor_direction_type is in the column where direction_type_id = 1 or 2. And so on. But is there a way to form a query without a stored function that is resistant to SQL Injection? If you do something like len(direction Type List) != 0 and then add subqueries in the for loop, then the query will be vulnerable to sql injection.

And in general, how good is it to try to do everything in one query? I see an option: before the main query, make query to mentor_direction_type and to mentor_education_type and then do where mentor_id in (the result of querys).

And it is normal to make such querys as - (SELECT JSON_AGG(el.name) FROM mentors.mentor_employment me INNER JOIN lists.employment_list el ON el.id = me.employment_id WHERE me.mentor_id = m.id) AS "competenceList", I'm not sure if it's optimized, but I don't see any other option.

Im using Go, SQLX, PGX.

Upvotes: 2

Views: 719

Answers (1)

Luke
Luke

Reputation: 2454

To break down your question.

Functions

And such a stored function:

Despite writing plpgsql functions seems logical coming from other programming languages, in Postgresql these functions have a caveat, that is the planner will not be able to properly evaluate the cost of what is running inside the function. This means that as data grows, your planner will become increasingly off, thinking that your function has a somewhat low score while instead it can be potentially high. Unless your function can use one of the volatility modifiers then I would always write it as a standard query statement.

If optimisation is a concern, you can use the embed go package and work with nice and tidy .sql files :-)

Filters and SQL injection

All filters are optional.

I also appreciate the fact that you want a query "to rule them all" which is capable of accepting any set of arguments and spit out the correct output. However this can turn into a rabbit hole of evaluating all the possible combinations and scenarios.

Sometimes it is better to have a little bit of redundancy while also having smaller, better intent-defined, queries.

And it is normal to make such querys as - (SELECT JSON_AGG(el.name) FROM mentors.mentor_employment me INNER JOIN lists.employment_list el ON el.id = me.employment_id WHERE me.mentor_id = m.id) AS "competenceList", I'm not sure if it's optimized, but I don't see any other option.

Nested queries are not necessarily bad, remember that SQL that is clear to us is not always clear to the planner. Before judging the subquery, I would recommend you to run the content of your fetch_mentor function with an EXPLAIN ANALYZE, this will tell you how the planner will tackle it and it will help you understanding if there are any bottlenecks.

If you are new to EXPLAIN ANALYZE, you can copy the output of it and paste it in tools such as https://flame-explain.com/visualize/input

Upvotes: 1

Related Questions