Reputation: 59
I have a text file which contains text like:
Somename of someone 1234 7894
Even some more name 2345 5343
Even more of the same 6572 6456
I am a customer 1324 7894
I am another customer 5612 3657
Also I am a customer and I am number Three 9631 7411
And I am number four and not the latest one in list 8529 9369
And here I am 4567 9876
I need to make a CSV file from this, but the problem is that the name contains 12 columns, so I need to merge all from the first 12 columns to 1 column so the CSV file will look like:
Somename of someone,123456,789456
cut -d ' ' -f1-11 test | sed "s/[[:space:]]/\\ /g" | sed "s/\t/\\ /g" > test1
gives me a file with the first 12 columns.
Upvotes: 1
Views: 105
Reputation: 67507
a hacky awk
$ awk '{last="," $(NF-1) "," $NF; NF-=2; print $0 last}' file
Somename of someone,1234,7894
Even some more name,2345,5343
Even more of the same,6572,6456
I am a customer,1324,7894
I am another customer,5612,3657
Also I am a customer and I am number Three,9631,7411
And I am number four and not the latest one in list,8529,9369
And here I am,4567,9876
capture the last two columns, reduce column count which will also normalize spacing; then print.
Upvotes: 0
Reputation: 15482
If you don't mind using GNU AWK instead, you can do this:
gawk 'BEGIN {FIELDWIDTHS = "54 5 5"; OFS = ","} {print $1, $2, $3}' FILE
Further explanation:
FIELDWIDTHS = "54 5 5"
OFS = ","
Note that FIELDWIDTHS is a feature of GNU AWK.
If you don't mind the whitespace being preserved in our CSV, then you're done.
Or, if you also need the white space removed, then:
# test.gawk
BEGIN {
FIELDWIDTHS = "54 5 5"
OFS = ","
}
{
for (f=1; f<=NF; f++) {
sub(/ +$/, "", $f) # Delete whitespace.
}
print
}
Testing:
▶ gawk -f test.gawk FILE
Somename of someone,1234,7894
Even some more name,2345,5343
Even more of the same,6572,6456
I am a customer,1324,7894
I am another customer,5612,3657
Also I am a customer and I am number Three,9631,7411
And I am number four and not the latest one in list,8529,9369
And here I am,4567,9876
(Note that in the second version, as suggested by Ed Morton in comments, I was able to just use print
at the end, because we modified the fields which effectively updates $0
and the field separators are replaced by OFS
.)
Upvotes: 1
Reputation: 203712
With GNU sed for \s/\S shorthand for space/non-space and -E to enable EREs:
$ sed -E 's/\s+(\S+)\s+(\S+)$/,\1,\2/' file
Somename of someone,1234,7894
Even some more name,2345,5343
Even more of the same,6572,6456
I am a customer,1324,7894
I am another customer,5612,3657
Also I am a customer and I am number Three,9631,7411
And I am number four and not the latest one in list,8529,9369
And here I am,4567,9876
and the functional equivalent with any POSIX sed:
$ sed 's/[[:space:]]*\([^[:space:]]\{1,\}\)[[:space:]]*\([^[:space:]]\{1,\}\)$/,\1,\2/' file
Somename of someone,1234,7894
Even some more name,2345,5343
Even more of the same,6572,6456
I am a customer,1324,7894
I am another customer,5612,3657
Also I am a customer and I am number Three,9631,7411
And I am number four and not the latest one in list,8529,9369
And here I am,4567,9876
or with any awk:
$ awk -v OFS=',' '{x=$(NF-1) OFS $NF; sub(/([[:space:]]+[^[:space:]]+){2}$/,""); print $0, x}' file
Somename of someone,1234,7894
Even some more name,2345,5343
Even more of the same,6572,6456
I am a customer,1324,7894
I am another customer,5612,3657
Also I am a customer and I am number Three,9631,7411
And I am number four and not the latest one in list,8529,9369
And here I am,4567,9876
Upvotes: 2
Reputation: 9144
If the different columns related to the name are part of the same CSV column, and should therefore be left untouched, why not working on the last two columns only?
$ sed 's/\t* *\([0-9]\+\)\t* *\([0-9]\+\)$/,\1,\2/' input_file
Somename of someone,123456,789456
Even some more name,234567,534312
Even more of the same,657212,645613
Upvotes: 1
Reputation: 86
Can you explain are the first columns a name and how many name columns are their maximum, ahead of the numerals? Knowing that can make it relatively easy to add commas and save to a .csv.
tr -s '[:blank:]' ','
Upvotes: 0