user1342546

Reputation: 9

removing dots but not decimal dots

I have a space-delimited text file in which a lone period (.) marks missing data, while periods also appear as decimal separators. I want to replace every missing-data period with NaN and leave the decimal separators alone. Here is an example:

Sample data:

1981 12 23 . 4.5 . .
1981 12 24 4.6 7.8 1.2 22.0
1981 12 25 . . . .
1981 12 26 2.1 . 3.1 .

Desired output:

1981 12 23 NaN 4.5 NaN NaN
1981 12 24 4.6 7.8 1.2 22.0
1981 12 25 NaN NaN NaN NaN
1981 12 26 2.1 NaN 3.1 NaN

Any help using sed, tr, or perl in a Unix environment would be much appreciated.

Upvotes: 0

Views: 522

Answers (4)

potong

Reputation: 58508

This might work for you:

sed ':a;s/ \. / NaN /g;ta;s/ \.$/ NaN/' file

Or, if the file contains no numbers written like .123:

sed 's/ \./ NaN/g' file
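The loop in the first command matters when missing values sit next to each other. A rough sketch of why, assuming the replacement is spelled NaN as in the desired output; the function names `single_pass` and `looped` are made up for this illustration:

```shell
line='1981 12 25 . . . .'

# One pass only: each match consumes the trailing space that the
# next dot would need as its leading space, so every other dot is skipped.
single_pass() {
    sed 's/ \. / NaN /g'
}

# Repeat the substitution until no ' . ' is left, then handle a dot
# at the end of the line. Separate -e expressions are used so the
# label is not tied to one sed implementation's handling of ';'.
looped() {
    sed -e ':a' -e 's/ \. / NaN /g' -e 'ta' -e 's/ \.$/ NaN/'
}

printf '%s\n' "$line" | single_pass   # prints: 1981 12 25 NaN . NaN .
printf '%s\n' "$line" | looped        # prints: 1981 12 25 NaN NaN NaN NaN
```

The second command in this answer avoids the problem because its pattern does not consume the space after the dot.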

Upvotes: 1

TLP

Reputation: 67910

Using negated look-around assertions seems to be a good idea here.

perl -plwe 's/(?<!\d)\.(?!\d)/NaN/g;' file.txt

In other words, a dot is replaced only when neither neighboring character is a digit. A number written without a leading zero, such as .1231 (as opposed to 0.1231), is left alone too, because the look-ahead sees the digit after the dot.

Upvotes: 6

brian d foy

Reputation: 132896

This Perl program will do it, replacing any dot without a digit next to it:

#!/Users/brian/bin/perls/perl5.14.2

while( <DATA> ) {
    s/ (?<!\d) \. (?!\d) /NaN/xg;
    print;
    }

__END__
1981 12 23 . 4.5 . .
1981 12 24 4.6 7.8 1.2 22.0
1981 12 25 . . . .
1981 12 26 2.1 . 3.1 .

The same substitution fits in a short Perl one-liner:

% perl -pe 's/ (?<!\d) \. (?!\d) /NaN/xg' input_file

Upvotes: 6

SuperDisk

Reputation: 109

Check whether the character after a dot is a space or the end of the line. If it is, replace that dot with NaN.
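One possible way to sketch this idea is to work on whole fields rather than single characters: a whitespace-separated field that is exactly `.` is treated as missing. This swaps in awk, which the answer does not mention, and `nan_fields` is a made-up name:

```shell
# Hypothetical helper: replace every field that is exactly "." with NaN.
# Assigning to a field makes awk rebuild the line with single spaces,
# so runs of whitespace on a changed line are collapsed.
nan_fields() {
    awk '{ for (i = 1; i <= NF; i++) if ($i == ".") $i = "NaN"; print }'
}

printf '%s\n' '1981 12 26 2.1 . 3.1 .' | nan_fields
# prints: 1981 12 26 2.1 NaN 3.1 NaN
```

Comparing whole fields means values such as 4.5 or .123 are never touched, whatever characters surround their dots.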

Upvotes: -1
