wsha
wsha

Reputation: 964

Table Filter in Lua for multidimensional tables

I am designing a generic table filter which can remove entries from a given table, the problem is that keys are not unique to accomplish this and types are also different.

Let me elaborate more clearly with an example

SampleTable = {
    { key1 = 10, key2 = 'name_1', Str = 'sample_string_1'     },
    { key1 = 10, key2 = 'name_3', Str = 'sample_string_2'     },
    { key1 = 11, key2 = 'name_2', Mac = {'ID1', 'ID2', 'ID3'} },
    { key1 = 11, key2 = 'name_2', Mac = {'ID1'}               },
    { key1 = 12, key2 = 'name_4', Mac = {'ID2', 'ID3'}        }
}

function filter(inputTable, ...) 
    filterItems = {...}
end

I want to pass any number of keys to filter this table

local key1 = 11
local Mac = 'ID1'
filter(SampleTable, key1, Mac)
 -- Should return -> { key1 = 11, key2 = 'name_2', Mac = 'ID1'},


key1 = 12
Mac = 'ID3'
filter(SampleTable, key1, Mac)
-- Should return -> { key1 = 12, key2 = 'name_4', Mac = ID3'}

key1 = 11
Mac = 'ID2'
filter(SampleTable, key1, Mac)
 -- Should return both    
-- { key1 = 11, key2 = 'name_2', Mac = ID2'},
-- { key1 = 11, key2 = 'name_5', Mac = ID2'},

key1 = 10
Str = 'sample_string_2'
filter(SampleTable, key1, Str)
 -- Should return { key1 = 10, key2 = 'name_3', Str = 'sample_string_2'}

My current solution is search through each key,value pair in both tables

function filter(tIn, tFilter) 
    local retain = true
    local exist  = nil
    local tOut = tIn
    local _findInTable = function (t, k, v)
        if(not t[k]) then return true
        elseif(t[k] and t[k] == v) then return true
        else return false end
    end

    for i, t in ipairs (tIn) do
        for k,v in pairs (tFilter) do
            exist = _findInTable(t, k, v)
            retain = retain and exist
        end
        if not retain then tOut[i] = nil end
        retain = true
    end
    return tOut
end

local myTable = filter(SampleTable, {key1 = 11, Mac = 'ID1'})

The problem is I cannot foresee how recursion will help. This piece of code works when I have the following SampleTable, As you can see Mac is not a sub-table for my code.

SampleTable = {
    { key1 = 10, key2 = 'name_1', Str = 'sample_string_1'     },
    { key1 = 10, key2 = 'name_3', Str = 'sample_string_2'     },
    { key1 = 11, key2 = 'name_2', Mac = 'ID1'                 }
    -- { key1 = 11, key2 = 'name_2', Mac = {'ID1', 'ID2', 'ID3'} },
    -- { key1 = 11, key2 = 'name_2', Mac = {'ID1'}               },
    -- { key1 = 12, key2 = 'name_4', Mac = {'ID2', 'ID3'}        }
}

Upvotes: 2

Views: 3083

Answers (1)

viluon
viluon

Reputation: 124

From your question, it is not entirely clear whether you are dealing with a recursive schema (arbitrarily deep and branched structures), or whether the provided sample is all there is to it (keys are always assigned one to n values, no recursion in the schema). Without a more complex sample, I decided to implement this for the simpler case.


Here's my solution to your problem, including your sample:

local sample = {
    { key1 = 10, key2 = 'name_1', Str = 'sample_string_1'     },
    { key1 = 10, key2 = 'name_3', Str = 'sample_string_2'     },
    { key1 = 11, key2 = 'name_2', Mac = {'ID1', 'ID2', 'ID3'} },
    { key1 = 11, key2 = 'name_2', Mac = {'ID1'}               },
    { key1 = 12, key2 = 'name_4', Mac = {'ID2', 'ID3'}        }
}

--- Check if a row matches the specified key constraints.
-- @param row The row to check
-- @param key_constraints The key constraints to apply
-- @return A boolean result
local function filter_row(row, key_constraints)
    -- Loop through all constraints
    for k, v in pairs(key_constraints) do
        if v and not row[k] then
            -- The row is missing the key entirely,
            -- definitely not a match
            return false
        end

        -- Wrap the key and constraint values in arrays,
        -- if they're not arrays already (so we can loop through them)
        local actual_values = type(row[k]) == "table" and row[k] or {row[k]}
        local required_values = type(v) == "table" and v or {v}

        -- Loop through the values we *need* to find
        for i = 1, #required_values do
            local found
            -- Loop through the values actually present
            for j = 1, #actual_values do
                if actual_values[j] == required_values[i] then
                    -- This object has the required value somewhere in the key,
                    -- no need to look any farther
                    found = true
                    break
                end
            end

            if not found then
                return false
            end
        end
    end

    return true
end

--- Filter an array, returning entries matching `key_values`.
-- @param input The array to process
-- @param key_values A table of keys mapped to their viable values
-- @return An array of matches
local function filter(input, key_values)
    local result = {}

    for i = 1, #input do
        local row = input[i]
        if filter_row(row, key_values) then
            result[#result + 1] = row
        end
    end

    return result
end

And here's the sample output, with a utility deep_print() function:

--- Recursively print out a Lua value.
-- @param value The value to print
-- @param indent Indentation level (defaults to 0)
-- @param no_newline If true, won't print a newline at the end
local function deep_print(value, indent, no_newline)
    indent = indent or 0

    if type(value) == "table" then
        print("{")
        for k, v in pairs(value) do
            io.write(string.rep(" ", indent + 2) .. "[")
            deep_print(k, indent + 2, true)
            io.write("] = ")
            deep_print(v, indent + 2, true)
            print(";")
        end
        io.write(string.rep(" ", indent) .. "}")
    elseif type(value) == "string" then
        io.write(("%q"):format(value))
    else
        io.write(tostring(value))
    end

    if not no_newline then
        print()
    end
end

-- The following is a mix of Lua code
-- and the script's output

deep_print(filter(sample, {key1 = 10}))
-- outputs
{
  [1] = {
    ["key1"] = 10;
    ["key2"] = "name_1";
    ["Str"] = "sample_string_1";
  };
  [2] = {
    ["key1"] = 10;
    ["key2"] = "name_3";
    ["Str"] = "sample_string_2";
  };
}

deep_print(filter(sample, {key2 = "name_4"}))
-- outputs
{
  [1] = {
    ["key1"] = 12;
    ["key2"] = "name_4";
    ["Mac"] = {
      [1] = "ID2";
      [2] = "ID3";
    };
  };
}

deep_print(filter(sample, {Mac = {"ID2", "ID3"}}))
-- outputs
{
  [1] = {
    ["key1"] = 11;
    ["key2"] = "name_2";
    ["Mac"] = {
      [1] = "ID1";
      [2] = "ID2";
      [3] = "ID3";
    };
  };
  [2] = {
    ["key1"] = 12;
    ["key2"] = "name_4";
    ["Mac"] = {
      [1] = "ID2";
      [2] = "ID3";
    };
  };
}

deep_print(filter(sample, {Mac = {"ID2"}}))
-- also outputs
{
  [1] = {
    ["key1"] = 11;
    ["key2"] = "name_2";
    ["Mac"] = {
      [1] = "ID1";
      [2] = "ID2";
      [3] = "ID3";
    };
  };
  [2] = {
    ["key1"] = 12;
    ["key2"] = "name_4";
    ["Mac"] = {
      [1] = "ID2";
      [2] = "ID3";
    };
  };
}

-- Specifying multiple keys works too:
deep_print(filter(sample, {key2 = "name_3", Str = "sample_string_2"}))
-- outputs
{
  [1] = {
    ["key1"] = 10;
    ["key2"] = "name_3";
    ["Str"] = "sample_string_2";
  };
}

As you can see, filter_row() is non-recursive. It only operates on the row-level keys. This should work if your schema is flat, as seen in the sample. If the data you want to filter is actually more complicated, please provide more examples.

The operation of key comparison is made simpler by first wrapping the values in arrays (tables). This allows a uniform comparison approach for all possible cases (with a bit of extra overhead).

This is my very first answer on SO, please edit/comment if there's anything wrong with it. Thanks.

Upvotes: 2

Related Questions