Utsav
Utsav

Reputation: 5918

select only those columns from table have not null values in q kdb

I have a table:

q)t:([] a:1 2 3; b:```; c:`a`b`c)
a b c
-----
1   a
2   b
3   c

From this table I want to select only the columns who have not null values, in this case column b should be omitted from output.(something similar to dropna method in pandas).

expected output
a c
---
1 a
2 b
3 c

I tried many things like

select from t where not null cols

but of no use.

Upvotes: 4

Views: 3402

Answers (3)

terrylynch
terrylynch

Reputation: 13572

A modification of Sander's solution which handles string columns (or any nested columns):

q)t:([] a:1 2 3; b:```; c:`a`b`c;d:"   ";e:("";"";"");f:(();();());g:(1 1;2 2;3 3))

q)t
a b c d e  f g
----------------
1   a   ""   1 1
2   b   ""   2 2
3   c   ""   3 3

q)where[{$[type x;all null x;all 0=count each x]}each flip t]_t
a c g
-------
1 a 1 1
2 b 2 2
3 c 3 3

Upvotes: 4

Callum Biggs
Callum Biggs

Reputation: 1540

The nature of kdb is column based, meaning that where clauses function on the rows of a given column.

To make a QSQL query produce your desired behaviour, you would need to first examine all your columns and determine which are all null, and then feed that into a functional statement. Which would be horribly inefficient.

Given that you need to fully examine all the columns data regardless (to check if all the values are null) the following will achieve that

q)@[flip;;enlist] k!d k:key[d] where not all each null each value d:flip t
a c
---
1 a
2 b
3 c

Here I'm transforming the table into a dictionary, and extracting its values to determine if any columns consist only of nulls (all each null each). I'm then applying that boolean list to the keys of the dictionary (i.e., the column names) through a where statement. We can then reindex into the original dictionary with those keys and create a subset dictionary of non-null columns and convert that back into a table.

I've generalized the final transformation back into a table by habit with an error catch to ensure that the dictionary will be converted into a table even if only a single row is valid (preventing a 'rank error)

Upvotes: 1

Here is a simple solution that does just what you want:

q)where[all null t]_t
a c
---
1 a
2 b
3 c

[all null t] gives a dictionary that checks if the column values are all null or not.

q)all null t
a| 0
b| 1
c| 0

Where returns the keys of the dictionary where it is true

q)where[all null t]
,`b

Finally you use _ to drop the columns from table t

Hopefully this helps

Upvotes: 7

Related Questions