I have requirement of generate file (individual file) based on the first column values . Source File is like Batch No ; Values ------------------- B1 ; TEST1 B1 ; TEST2 B1 ; TEST3 B2 ; DELTA1 B2 ; DELTA2 expected output would be two files B1.txt (this contains) TEST1 TEST2 TEST3 B2.txt (this contains) DELTA1 DELTA1 The source file data will vary every time but structure (the number of columns and delimiter) will be consistent Any suggestion?

Reputation: 29

Concatenate second column rows for the matching values in First column and generate file

I have requirement of generate file (individual file) based on the first column values .

Source File is like

Batch No ; Values 
-------------------
B1       ; TEST1
B1       ; TEST2
B1       ; TEST3
B2       ; DELTA1
B2       ; DELTA2

expected output would be two files

B1.txt (this contains)
```
TEST1
TEST2
TEST3
```
B2.txt (this contains)
```
DELTA1
DELTA1
```

The source file data will vary every time but structure (the number of columns and delimiter) will be consistent

Any suggestion?

Upvotes: 0

Answers (3)

nlsdkd

Reputation: 121

#!/bin/bash
while read -r file data; do
    echo "${data#; *}" >> "$file".txt
done < <(sed -n -e '1,/^---/!p' source_file)

Upvotes: 0

Manuel Schmidt

Reputation: 2417

You can use awk for this.

awk -F" *; *" 'NR > 2{print $2 > $1 ".txt"}' source.txt

This asumes your source file is source.txt and contains

Batch No ; Values 
-------------------
B1       ; TEST1
B1       ; TEST2
B1       ; TEST3
B2       ; DELTA1
B2       ; DELTA2

The result are two files: B1.txt and B2.txt

-F" *; *": Defines the field separator. In this case a semicolon and any number of spaces before and after the semicolon.
NR > 2: You skip first two rows.
$1, $2 stores the content of the first and second columns respectively.
With > you redirect the output of the print command to the corresponding file instead of printing to the standard output.

Upvotes: 0

Marc Beaulieu

Reputation: 1

This is a one liner with awk:

tail -n +3 test.txt | awk '{print $3>$1".txt"}'

The tail -n +3 removes the first 2 lines of header; the awk scripts prints the value of the 3rd field in a file named after the first field concatenated with .txt

The default awk field separator is spaces, so in your example, the ; is field #2 which we drop. The input file does not even have to be sorted by the first field for this to work.

Upvotes: 0

Concatenate second column rows for the matching values in First column and generate file

Answers (3)

Related Questions