Swamy
Swamy

Reputation: 342

select distinct pair of values from sqlite database

I have a sqlite database that has a table structure as follows

column 1: source ip
column 2: source port
column 3: destination ip
column 4: destination port
column 5: protocol
column 6: type

snapshot of the table looks like this

╔═══╦══════╦═══╦═════╦═════╦═══╗
║ A ║ 444  ║ B ║ 555 ║ TCP ║ 2 ║
╠═══╬══════╬═══╬═════╬═════╬═══╣
║ B ║ 555  ║ A ║ 444 ║ TCP ║ 1 ║
╠═══╬══════╬═══╬═════╬═════╬═══╣
║ C ║ 100  ║ A ║ 5525║ UDP ║ 1 ║
╠═══╬══════╬═══╬═════╬═════╬═══╣
║ B ║ 5222 ║ C ║ 2552║ TCP ║ 2 ║
╠═══╬══════╬═══╬═════╬═════╬═══╣
║ A ║ 5525 ║ C ║ 100 ║ UDP ║ 2 ║
╠═══╬══════╬═══╬═════╬═════╬═══╣
║ C ║ 2552 ║ B ║ 5222║ TCP ║ 1 ║
╚═══╩══════╩═══╩═════╩═════╩═══╝

In the above table you can see that there is entry for both (A,444) , (B,555) and (B,555), (A,444). I want to select the distinct [(source ip,source port), (destination ip,destination port), (protocol)] i.e., i want a query that gives a output as follows

╔═══╦══════╦═══╦═════╦═════╦═══╗
║ A ║ 444  ║ B ║ 555 ║ TCP ║ 2 ║
╠═══╬══════╬═══╬═════╬═════╬═══╣
║ B ║ 5222 ║ C ║ 2552║ TCP ║ 2 ║
╠═══╬══════╬═══╬═════╬═════╬═══╣
║ C ║ 100  ║ A ║ 5525║ UDP ║ 1 ║
╚═══╩══════╩═══╩═════╩═════╩═══╝

Thanks in advance

Upvotes: 3

Views: 783

Answers (1)

LS_ᴅᴇᴠ
LS_ᴅᴇᴠ

Reputation: 11151

Following will do:

SELECT *,
    source_ip<destination_ip OR source_ip=destination_ip AND source_port<destination_port AS _asc
FROM _table
GROUP BY
    CASE WHEN _asc THEN source_ip ELSE destination_ip END,
    CASE WHEN _asc THEN source_port ELSE destination_port END,
    CASE WHEN _asc THEN destination_ip ELSE source_ip END,
    CASE WHEN _asc THEN destination_port ELSE source_port END;

EDIT: Following your edit...

SELECT *
FROM (
  SELECT *,
      source_ip<destination_ip OR source_ip=destination_ip AND source_port<destination_port AS _asc
  FROM _table
  ORDER BY ROWID DESC
)
GROUP BY
    CASE WHEN _asc THEN source_ip ELSE destination_ip END,
    CASE WHEN _asc THEN source_port ELSE destination_port END,
    CASE WHEN _asc THEN destination_ip ELSE source_ip END,
    CASE WHEN _asc THEN destination_port ELSE source_port END;

MAY work, BUT:

  1. "first row" makes no sense in database context, as ROWID may change. "First" only makes sense when applied to column(s), e.g. "first port number" or "first by date add" - which means you will probably need another column.

  2. Although SQLite seams to choose last row for each group (for unaggregated expressions), there is no warranty that this behavior will be maintained or is consistent across versions.

Check in SQL Fiddle.

Upvotes: 3

Related Questions