Reputation: 353
I have made a shell script that is supposed to extract data with certain field names and put them in a CSV file.
An example input file may have the following lines:
user_name: [email protected]
EMAIL: [email protected]
FIRST_NAME: jonathan
LAST_NAME: doestein
CREATION_DATE: 2013-08-01 01:08:52
REGISTRATION_STATUS: Y
VENDOR: vendorname
This will repeat itself 'n' times.
This is an excerpt of the script I wrote so far:
#!/bin/sh
echo "Please enter input file name."
read input_variable
echo "You entered: $input_variable"
echo "Please enter a name of the new output file."
read output_file
touch $output_file
echo "The output file name is going to be $output_file"
echo "Extracting files..." ;
awk '$1 ~ /^(user_name:|EMAIL:|FIRST_NAME:|LAST_NAME:|CREATION_DATE:|REGISTRATION_STATUS:)$/{printf "%s,",$2} $1 ~ /REGISTRATION_STATUS:/{print $2}' $input_variable >> $output_file.ib ;
However, although data is written to my output file (which must have a .csv extension so that a GUI can open it), when I open the file in a GUI such as OpenOffice Calc, many records are concatenated into a single row, while other rows start on a new line as they are supposed to.
For example, one line might look like the following:
[email protected],noreally51,noway,username,username...x40 or so
username,username,username....
In other words, it lists about 40-50 usernames all in one row, then finally goes to the next line and prints the remaining information.
I would like to add column names to the output file:
VENDOR,user_name,FIRST_NAME,LAST_NAME,CREATION_DATE,REGISTRATION_STATUS
I can't figure out how to do that.
Thank you for your time and all of your support!
I edited my script as follows:
#!/bin/sh
echo "Please enter input file name."
read input_variable
echo "You entered: $input_variable"
echo "Please enter a name of the new output file."
touch output_file
read $output_file
echo "The output file name is going to be $output_file"
echo "Processing data extraction..." ;
awk -F": " n=25 -v 'NR<=n {h[NR-1]=$1} {a[NR%n-1]=$2} $1~/VENDOR/ && !hp{for(k=0;k<n;k++) printf "%s ", h[k] $input_variable && print "";hp=1} $1~/VENDOR/{for(k=0;k<n;k++) printf "%s ", a[k] && print ""}' data | column -t $input_variable ;
echo "Done."
This at least prints data to the $output_file. However, the data in the $output_file looks like:
??ࡱ?;?? ????????[...]????????Root Entry????????[...]
@karakfa
This is the contents of the script I have. I noticed that more than the first line of your script in your answer changed. So, I amended my script to the following:
#!/bin/sh
echo "Please enter input file name."
read input_variable
echo "You entered: $input_variable"
echo "Please enter a name of the new output file."
touch output_file
read $output_file
echo "The output file name is going to be ${output_file}"
echo "Processing data extraction..." ;
cat $input_variable | awk -F": " -v OFS="," -v n=25
'NR<=n{sub(/^ */,"",$1);h[NR-1]=$1}
{a[(NR-1)%n]=$2}
$1~/VENDOR/ && !hp{line=h[0];
for(k=1;k<n;k++) line=line OFS h[k];
print line;hp=1
}
$1~/VENDOR/{line=a[0];
for(k=1;k<n;k++) line=line OFS a[k];
print line}' $input_variable ;
echo "Done."
The output was:
Please enter input file name.
inputfile.txt
You entered: allgmail.com_accounts.txt
Please enter a name of the new output file.
outputfile.csv
The output file name is going to be
Processing data extraction...
awk: no program given
./scriptname: line 23: NR<=n{sub(/^ */,"",$1);h[NR-1]=$1}
{a[(NR-1)%n]=$2}
$1~/VENDOR/ && !hp{line=h[0];
for(k=1;k<n;k++) line=line OFS h[k];
print line;hp=1
}
$1~/VENDOR/{line=a[0];
for(k=1;k<n;k++) line=line OFS a[k];
print line}: No such file or directory
Done.
I did not find any articles about the 'awk: no program given' error. Do you know what I am doing incorrectly?
The error message refers to 'line 23', which is the following:
print line}' $input_variable ;
Then, I noticed that it also says the following on the last line:
print line}: No such file or directory
This occurs with or without 'cat $input_variable |' before awk. Normally, awk works fine on my OS. It is a Mac 10.11.1 (15B42). Is #!/bin/sh incorrect?
I look forward to your thoughts. Thank you!
Upvotes: 0
Views: 12089
Reputation: 67467
If all your fields are always present, you can try the following awk script. The number of fields per record is set as a variable (7 in this case), and "VENDOR", the last field of each record, is used as the end-of-record indicator.
UPDATE: I hadn't noticed that CSV output was required.
$ awk -F": " -v OFS="," -v n=7 \
'NR<=n{sub(/^ */,"",$1);h[NR-1]=$1}
{a[(NR-1)%n]=$2}
$1~/VENDOR/ && !hp{line=h[0];
for(k=1;k<n;k++) line=line OFS h[k];
print line;hp=1
}
$1~/VENDOR/{line=a[0];
for(k=1;k<n;k++) line=line OFS a[k];
print line}' inputfilename
user_name,EMAIL,FIRST_NAME,LAST_NAME,CREATION_DATE,REGISTRATION_STATUS,VENDOR
[email protected],[email protected],jonathan,doestein,2013-08-01 01:08:52,Y,vendorname
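The header is built from the field names on the first n lines. When the last field (VENDOR) of the first record is seen, the header is printed once; after that, each record is printed as soon as its final field is seen.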
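As a minimal sketch of the same approach run end to end, the script below writes the CSV to a file. The sample records, the temporary directory, and the variable names `input_file` and `output_file` are assumptions made for illustration, not taken from the question's real data. The opening quote of the awk program sits on the same line as the `awk` command, so the shell passes the whole multi-line program as one argument.

```shell
#!/bin/sh
# Hypothetical sample data: two records of n=7 "NAME: value" lines each.
workdir=$(mktemp -d)
input_file="$workdir/input.txt"
output_file="$workdir/output.csv"

cat > "$input_file" <<'EOF'
user_name: jdoe
EMAIL: jdoe@example.com
FIRST_NAME: jonathan
LAST_NAME: doestein
CREATION_DATE: 2013-08-01 01:08:52
REGISTRATION_STATUS: Y
VENDOR: vendorname
user_name: asmith
EMAIL: asmith@example.com
FIRST_NAME: anna
LAST_NAME: smith
CREATION_DATE: 2014-02-03 10:15:00
REGISTRATION_STATUS: N
VENDOR: othervendor
EOF

awk -F": " -v OFS="," -v n=7 '
  # First n lines: collect the field names for the header.
  NR <= n { sub(/^ */, "", $1); h[NR-1] = $1 }
  # Every line: store the value in its slot within the current record.
  { a[(NR-1) % n] = $2 }
  # End of the first record: print the header once.
  $1 ~ /VENDOR/ && !hp {
    line = h[0]
    for (k = 1; k < n; k++) line = line OFS h[k]
    print line; hp = 1
  }
  # End of every record: print the collected values as one CSV row.
  $1 ~ /VENDOR/ {
    line = a[0]
    for (k = 1; k < n; k++) line = line OFS a[k]
    print line
  }
' "$input_file" > "$output_file"

cat "$output_file"
```

With this sample input, the output file should contain one header line followed by one row per record. Values that themselves contain commas are not quoted by this sketch.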
To move the last field to the front, change the code to
line=h[n-1];
for(k=0;k<n-1;k++) line=line OFS h[k];
in both places (change the array name from "h" to "a" in the second instance). The loop starts at 0 so that the first field is still included.
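A possible way to apply the reordering in both places at once is to move the join logic into a small awk function. This is only a sketch: the function name `join_last_first`, the one-record sample data, and the temporary file names are hypothetical. The loop runs from 0 to n-2 so that the first field is kept after the last one is moved to the front.

```shell
#!/bin/sh
# Hypothetical one-record sample with n=7 fields.
workdir=$(mktemp -d)
input_file="$workdir/input.txt"
output_file="$workdir/vendor_first.csv"

cat > "$input_file" <<'EOF'
user_name: jdoe
EMAIL: jdoe@example.com
FIRST_NAME: jonathan
LAST_NAME: doestein
CREATION_DATE: 2013-08-01 01:08:52
REGISTRATION_STATUS: Y
VENDOR: vendorname
EOF

awk -F": " -v OFS="," -v n=7 '
  # Hypothetical helper: emit slot n-1 first, then slots 0..n-2 in order.
  function join_last_first(arr,    k, line) {
    line = arr[n-1]
    for (k = 0; k < n-1; k++) line = line OFS arr[k]
    return line
  }
  NR <= n { sub(/^ */, "", $1); h[NR-1] = $1 }   # collect header names
  { a[(NR-1) % n] = $2 }                         # collect record values
  $1 ~ /VENDOR/ && !hp { print join_last_first(h); hp = 1 }
  $1 ~ /VENDOR/ { print join_last_first(a) }
' "$input_file" > "$output_file"

cat "$output_file"
```

Using one function for both the header array and the value array keeps the two column orders from drifting apart.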
Upvotes: 2
Reputation:
Why don't you use echo to write the header before running awk?
echo VENDOR,user_name,FIRST_NAME,LAST_NAME,CREATION_DATE,REGISTRATION_STATUS > file
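A rough sketch of how the echo suggestion could be combined with awk: write the header with `>` (which creates or truncates the file), then let awk append the rows with `>>`. The awk part below is an assumption of mine rather than part of this answer. It looks fields up by name instead of by position, and it assumes that VENDOR is always the last line of a record. The sample data and file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical two-record sample input.
workdir=$(mktemp -d)
input_file="$workdir/input.txt"
output_file="$workdir/output.csv"

cat > "$input_file" <<'EOF'
user_name: jdoe
EMAIL: jdoe@example.com
FIRST_NAME: jonathan
LAST_NAME: doestein
CREATION_DATE: 2013-08-01 01:08:52
REGISTRATION_STATUS: Y
VENDOR: vendorname
user_name: asmith
EMAIL: asmith@example.com
FIRST_NAME: anna
LAST_NAME: smith
CREATION_DATE: 2014-02-03 10:15:00
REGISTRATION_STATUS: N
VENDOR: othervendor
EOF

# Step 1: write the header; ">" creates the file or overwrites an old one.
echo "VENDOR,user_name,FIRST_NAME,LAST_NAME,CREATION_DATE,REGISTRATION_STATUS" > "$output_file"

# Step 2: append one row per record; ">>" keeps the header written above.
awk -F": " -v OFS="," '
  # Remember each value under its field name (leading spaces stripped).
  { sub(/^ */, "", $1); v[$1] = $2 }
  # VENDOR is assumed to close a record: print the chosen columns, then reset.
  $1 == "VENDOR" {
    print v["VENDOR"], v["user_name"], v["FIRST_NAME"], v["LAST_NAME"], v["CREATION_DATE"], v["REGISTRATION_STATUS"]
    split("", v)   # portable way to empty the array
  }
' "$input_file" >> "$output_file"

cat "$output_file"
```

Because the columns are selected by name, EMAIL is simply left out, which matches the header requested in the question.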
Upvotes: 2