Reputation: 1446

Remove decimal places from string IDs using awk

I want to remove the decimal suffix from each string in a list of identifiers:

ENSG00000166224.12
ENSG00000102897.5
ENSG00000168496.3
ENSG00000010295.15
ENSG00000147533.12
ENSG00000119242.4

My desired output is:

ENSG00000166224
ENSG00000102897
ENSG00000168496
ENSG00000010295
ENSG00000147533
ENSG00000119242

I would like to do it with awk. I have been playing with printf, but with no success.

UPDATE:

The awk answer that sets the field separator to . works well for files with only one column, but what if the file has several columns (strings and floating-point numbers)? Here is an example:

ENSG00000166224.12  0.0730716237772557  -0.147970450702234
ENSG00000102897.5   0.156405616866614   -0.0398488625782745
ENSG00000168496.3   -0.110396121325736  -0.0147093758392248

How can I remove only the decimal places in the first field?

Thanks

Upvotes: 2

Answers (4)

fedorqui

Reputation: 289755

You can set the field separator to the dot and print the first element:

$ awk -F. '{print $1}' file
ENSG00000166224
ENSG00000102897
ENSG00000168496
ENSG00000010295
ENSG00000147533
ENSG00000119242

In sed you would say sed 's/\.[^.]*$//' file, which matches everything from the last dot to the end of the line and removes it.

printf would work if the value were purely numeric: a format such as %d would drop the decimal places. Since the ID is an alphanumeric string, it is best handled as a string.
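As a small illustration of why printf does not help here, the sketch below formats a plain number and then one of the IDs with %d. The variable names are only for this example. awk converts a string with no leading numeric part to 0, so the ID is lost rather than truncated.

```shell
# Purely numeric value: %d drops the fractional part.
numeric=$(awk 'BEGIN { printf "%d", 166224.12 }')

# Alphanumeric ID: the string has no leading number, so awk treats it as 0.
id=$(awk 'BEGIN { printf "%d", "ENSG00000166224.12" }')

echo "$numeric"   # 166224
echo "$id"        # 0
```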

Update

Use gsub (sub works just as well, since only one match is needed) to remove everything from the first . onward in the first field:

$ awk '{gsub(/\..*$/,"",$1)}1' a
ENSG00000166224 0.0730716237772557 -0.147970450702234
ENSG00000102897 0.156405616866614 -0.0398488625782745
ENSG00000168496 -0.110396121325736 -0.0147093758392248
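Note that the output above has single spaces between columns: assigning to $1 makes awk rebuild the line using the output field separator. If the file is tab-separated (an assumption, the question does not say), a possible variant sets FS and OFS to a tab so the delimiter survives. The function name strip_first_field is hypothetical.

```shell
# Hypothetical helper: strip the ".N" suffix from column 1 of
# tab-separated input and keep tabs between the columns.
strip_first_field() {
    awk 'BEGIN { FS = OFS = "\t" } { sub(/\..*/, "", $1) } 1' "$@"
}

# Reads from files given as arguments, or from stdin if none are given.
printf 'ENSG00000166224.12\t0.0730716237772557\t-0.147970450702234\n' | strip_first_field
```

With space-aligned columns instead of tabs, this variant would treat the whole line as one field, so it only fits genuinely tab-delimited files.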

Upvotes: 4

Govind Kailas

Reputation: 2934

If you are looking for a solution in perl:

perl -pe 's/\..*$//' file.txt

This removes everything from the first dot to the end of the line.

Upvotes: 0

jaypal singh

Reputation: 77105

Using cut:

$ cut -d. -f1 file
ENSG00000166224
ENSG00000102897
ENSG00000168496
ENSG00000010295
ENSG00000147533
ENSG00000119242

Upvotes: 1

Avinash Raj

Reputation: 174706

You can also use the sub function:

awk '{sub(/\..*/, "")}1' file
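The command above deletes from the first dot to the end of the line, so on the multi-column example it would also drop the numeric columns. One possible adjustment, sketched here, is to match only a dot followed by digits: sub replaces the leftmost match, which is the version suffix as long as every ID carries one (an assumption; a line whose ID has no suffix would lose the decimals of its first number instead). The function name is hypothetical.

```shell
# Hypothetical helper: remove only the first ".digits" run on each line.
# $0 is edited in place, so the original spacing between columns is kept.
strip_version_keep_spacing() {
    awk '{ sub(/\.[0-9]+/, "") } 1' "$@"
}

printf 'ENSG00000166224.12  0.0730716237772557  -0.147970450702234\n' | strip_version_keep_spacing
```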

Upvotes: 1
