LuleSa

Reputation: 145

How to insert a comma after 4 digits for all numbers with 8 digits

I have a CSV file that looks like this:

12625,6475,387,-388,-332,-217,-104,17,125,160,121,38,-101,-282,-368
-2675,6475,420,-385,-330,-217,-106,16,124,158,120,37,-104,-281,-365
2725,6475,633,-377,-327,-222,-117,6,113,148,109,26,-114,-282,-359
-12775,6475,927,-367,-324,-229,-133,-9,99,134,95,11,-128,-283,-351
12825,64751200,-357,-320,-236,-147,-23,86,121,82,-3,-140,-283,-344
          ^ missing comma

In some rows I have the problem shown in the last row of the example, where a comma is missing between the second and third column. I know from the data that the most digits a legitimate entry can have is 5 (in some cases with a - in front) and all entries that have 8 digits originate from missing commas, which should appear after the fourth digit.

I am looking for an expression, presumably with sed, that inserts a comma after the fourth digit of all 8-digit numbers in the file.

What I have so far is

echo "12356" | sed 's/\B[0-9]\{3\}/&,/g'

which will insert a comma after four digits. How can I restrict this so that it only happens for 8-digit numbers, not for 5-digit numbers?

I am also open to any more elegant way that might exist to solve that problem.

Thank you

Upvotes: 0

Views: 364

Answers (2)

sseLtaH

Reputation: 11207

Try this sed command:

sed -E 's/([0-9]{4})([0-9]{4})/\1,\2/g'
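
As a quick check, the command can be run on a shortened version of the broken row from the question. The wrapper name `fix_commas` is only illustrative and not part of the original answer. Note that the pattern splits any run of eight digits, so it relies on the question's guarantee that legitimate entries have at most five digits.

```shell
# Hypothetical wrapper around the sed command above; reads stdin, writes stdout.
fix_commas() {
  sed -E 's/([0-9]{4})([0-9]{4})/\1,\2/g'
}

# Broken row from the question: the comma between 6475 and 1200 is missing.
printf '%s\n' '12825,64751200,-357,-320' | fix_commas
# prints: 12825,6475,1200,-357,-320
```

To process a whole file, the same command can be given a file name as its last argument, or fed via redirection.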

Upvotes: 2

Andrej Podzimek

Reputation: 2778

Because sed has already been mentioned, here’s some awk:

awk -F, -vOFS=, '{
  for (i = 1; i <= NF; ++i)
    if (length($i) >= 8)
      $i = substr($i, 1, 4) "," substr($i, 5)
} 1' < some_file.csv
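
One caveat with the length test above: a leading `-` counts as a character, so a fused field such as `-64751200` would be split after the third digit instead of the fourth. The question does not show such a case, so the following is only a sketch under the assumption that a fused field may carry a minus sign. The function name `fix_fields` and the variables `sign` and `digits` are hypothetical.

```shell
# Hypothetical sign-aware variant of the awk loop; reads stdin, writes stdout.
fix_fields() {
  awk -F, -v OFS=, '{
    for (i = 1; i <= NF; ++i) {
      sign = ""
      digits = $i
      # Set any leading minus aside so it does not shift the split position.
      if (substr(digits, 1, 1) == "-") {
        sign = "-"
        digits = substr(digits, 2)
      }
      # Assumption: 8 or more digits can only come from a missing comma.
      if (length(digits) >= 8)
        $i = sign substr(digits, 1, 4) "," substr(digits, 5)
    }
  } 1'
}

printf '%s\n' '12825,-64751200,-357' | fix_fields
# prints: 12825,-6475,1200,-357
```

Rows without a fused field pass through unchanged, because awk only rebuilds the line when a field is assigned.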

…and here’s some pure Bash, for no good reason:

(
IFS=,
while read -ra line; do
  for i in "${!line[@]}"; do
    ((${#line[i]} >= 8)) && line[i]="${line[i]::4},${line[i]:4}"
  done
  printf '%s\n' "${line[*]}"
done
) < some_file.csv

Upvotes: 1
