spinon42
spinon42

Reputation: 31

Get substring into a new column

I have a table that contains a column that has data in the following format - lets call the column "title" and the table "s"

title

ab.123
ab.321 
cde.456
cde.654
fghi.789
fghi.987

I am trying to get a unique list of the characters that come before the "." so that i end up with this:

ab
cde
fghi

I have tried selecting the initial column into a table then trying to do an update to create a new column that is the position of the dot using "ss".

something like this:

t: select title from s
update thedot: (title ss `.)[0] from t

i was then going to try and do a 3rd column that would be "N" number of characters from "title" where N is the value stored in "thedot" column.

All i get when i try the update is a "type" error.

Any ideas? I am very new to kdb so no doubt doing something simple in a very silly way.

Upvotes: 3

Views: 5691

Answers (3)

nyi
nyi

Reputation: 3229

A variation of WooiKent's answer using each-right (/:) :

q)exec distinct (` vs/:x)[;0] from t
`ab`cde`fghi

Upvotes: 0

WooiKent Lee
WooiKent Lee

Reputation: 1311

the reason why you get the type error is because ss only works on string type, not symbol. Plus ss is not vector based function so you need to combine it with each '.

q)update thedot:string[title] ss' "." from t
title    thedot
---------------
ab.123   2
ab.321   2
cde.456  3
cde.654  3
fghi.789 4

There are a few ways to solve your problem:

q)select distinct(`$"." vs' string title)[;0] from t
x
----
ab
cde
fghi
q)select distinct(` vs' title)[;0] from t
x
----
ab
cde
fghi

You can read here for more info: http://code.kx.com/q/ref/casting/#vs

Upvotes: 6

Thomas Smyth
Thomas Smyth

Reputation: 5644

An alternative is to make use of the 0: operator, to parse around the "." delimiter. This operator is especially useful if you have a fixed number of 'columns' like in a csv file. In this case where there is a fixed number of columns and we only want the first, a list of distinct characters before the "." can be returned with:

exec distinct raze("S ";".")0:string title from t
`ab`cde`fghi

OR:

distinct raze("S ";".")0:string t`title
`ab`cde`fghi

Where "S " defines the types of each column and "." is the record delimiter. For records with differing number of columns it would be better to use the vs operator.

Upvotes: 0

Related Questions