bluethundr

Reputation: 1345

Account For Variable Output to CSV File in Bash

I'm getting some info about AWS instances in our environment and dumping them to a CSV file.

The problem is that some instances have a tag named "owner" and some do not.

For the lines with an owner tag, the output looks correct and lines up with the column headers:

    Host Name       Instance ID Private IP  Launch Time                 Instance State   Owner    AWS Account Name Account Number
    USAMZAPD1026    i-593c4fb4  10.1.232.26 2014-10-08T14:44:50.000Z    stopped          llindsay company-lab         123456789101

But if the owner tag doesn't exist, the account name and account number end up in the wrong columns:

    Host Name      Instance ID  Private IP     Launch Time              Instance State    Owner     AWS Account Name Account Number
    USMDCP1028-AWS i-86533615   10.1.233.18   2016-11-03T15:01:52.000Z  stopped           company-lab  123456789101

This is the code I use to gather the info and dump it to the file:

echo "Host Name,Instance ID,Private IP,Launch Time, Instance State, Owner, AWS Account Name, Account Number" >> "$ofile"
readarray -t aws_instance_list < <(aws ec2 describe-instances | jq -r '.Reservations[].Instances[] | [(.Tags[]|select(.Key=="Name")|.Value), .InstanceId, .PrivateIpAddress, .LaunchTime, .State.Name, (.Tags[]|select(.Key=="Owner")|.Value)] | @csv'  | sed 's/"//g')
    for ((instance_index=0;instance_index<${#aws_instance_list[@]};++instance_index)); do
      instance_info="${aws_instance_list[$instance_index]}"
      echo "$instance_info"
      echo "$instance_info,$aws_account,$aws_account_number" >> "$ofile"
    done # Instance Loop
  done # Account Loop

Here is a sample of the output of that aws command that includes the variable output. Some lines have owners listed and some don't.

USAMZDBD1165,i-eb836cc6,10.1.232.165,2016-02-17T17:39:24.000Z,stopped,llindsay
USAMZAPD2058,i-3f5721d2,10.1.233.58,2017-04-03T18:10:37.000Z,running,nalkema
USAMZAPD2056,i-3e5721d3,10.1.233.56,2016-06-21T18:50:19.000Z,running,nalkema
USAMZAPD2057,i-315721dc,10.1.233.57,2015-05-28T20:02:55.000Z,running,nalkema
USAMZAPD1027,i-685cfd87,10.1.232.27,2015-02-11T20:22:08.000Z,stopped,llindsay
core-usawsnproddfw,i-2cedae9f,10.48.136.36,2017-03-17T15:37:52.000Z,running
UAWSCDAP0001,i-5e31c15f,10.48.131.176,2018-10-23T17:23:21.000Z,running,Eric Somebody
USMDPB1027-AWS,i-0be1611d,10.48.128.37,2016-11-11T16:08:14.000Z,stopped
usamzdbd2153,i-7e0d8b91,10.1.233.153,2015-02-19T16:57:57.000Z,running,tsenti

How can I account for the variability of the "$instance_info" variable and have the columns line up correctly, even if the owner tag is not there?

Upvotes: 1

Views: 197

Answers (1)

Ivo Yordanov

Reputation: 146

Without real inputs or outputs to work with, and going only by the sample above, you could write each row to a file and then count its columns to decide which header to use:

Attempt with counting columns:

readarray -t aws_instance_list < <(aws ec2 describe-instances | jq -r '.Reservations[].Instances[] | [(.Tags[]|select(.Key=="Name")|.Value), .InstanceId, .PrivateIpAddress, .LaunchTime, .State.Name, (.Tags[]|select(.Key=="Owner")|.Value)] | @csv'  | sed 's/"//g')
    for ((instance_index=0;instance_index<${#aws_instance_list[@]};++instance_index)); do
      instance_info="${aws_instance_list[$instance_index]}"
      echo "$instance_info"
      echo "$instance_info,$aws_account,$aws_account_number" > values.txt
      # -F must be passed as an option outside the quoted awk program
      numlines=$(awk -F ',' '{print NF}' values.txt)
      if [[ "$numlines" -eq 8 ]]; then
        echo "Host Name,Instance ID,Private IP,Launch Time, Instance State, Owner, AWS Account Name, Account Number" >> "$ofile"
      else
        echo "Host Name,Instance ID,Private IP,Launch Time, Instance State, AWS Account Name, Account Number" >> "$ofile"
      fi
    done # Instance Loop
  done
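The column count can also be used to normalise the rows instead of picking a header per row. The following is a minimal sketch, assuming that instance rows have five fields without an owner and six with one (as in the sample output in the question). `pad_owner` is a hypothetical helper name, not something from the original script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pad rows that lack the trailing owner field
# so that every row has exactly six comma-separated fields.
pad_owner() {
  awk -F ',' -v OFS=',' '
    NF == 5 { print $0, ""; next }   # no owner: append an empty sixth field
            { print }                # owner present: pass through unchanged
  '
}

# Usage with two rows taken from the sample output in the question
printf '%s\n' \
  'USAMZAPD2058,i-3f5721d2,10.1.233.58,2017-04-03T18:10:37.000Z,running,nalkema' \
  'USMDPB1027-AWS,i-0be1611d,10.48.128.37,2016-11-11T16:08:14.000Z,stopped' |
  pad_owner
```

With every row padded this way, appending `,$aws_account,$aws_account_number` always yields eight fields, so the single header that includes Owner should fit all rows.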

Now that we have more information, I would take another approach. Can you run the following request?

readarray -t aws_instance_list < <(aws ec2 describe-instances | jq -r '.Reservations[].Instances[] | .Tags[]? | select(.Key=="Owner") | .Value' | sort -u)

This grabs just the owners and keeps only the unique values. You can write them to a file and then read that file into an array like this (or read them directly into an array):

IFS=$'\r\n' GLOBIGNORE='*' command eval  'owners=($(<filename))'

At this point, run the code you already had to generate a file with the values but without headers:

readarray -t aws_instance_list < <(aws ec2 describe-instances | jq -r '.Reservations[].Instances[] | [(.Tags[]|select(.Key=="Name")|.Value), .InstanceId, .PrivateIpAddress, .LaunchTime, .State.Name, (.Tags[]|select(.Key=="Owner")|.Value)] | @csv'  | sed 's/"//g')
    for ((instance_index=0;instance_index<${#aws_instance_list[@]};++instance_index)); do
      instance_info="${aws_instance_list[$instance_index]}"
      echo "$instance_info"
      echo "$instance_info,$aws_account,$aws_account_number" >> values.txt
    done # Instance Loop
  done # Account Loop 
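The jq filter in the command above drops the owner column entirely when the tag is absent, which is what shifts the later columns. One possible alternative, sketched here under the assumption that changing the filter is acceptable, is to fall back to an empty string with jq's `//` operator so every row carries six fields. `instances_to_csv` is a hypothetical wrapper name. It reads the `describe-instances` JSON on stdin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read describe-instances JSON on stdin and emit one
# CSV row per instance, always with six fields. A missing Name or Owner
# tag becomes an empty field instead of being dropped.
instances_to_csv() {
  jq -r '
    .Reservations[].Instances[]
    | [ ((.Tags[]? | select(.Key == "Name")  | .Value) // ""),
        .InstanceId,
        .PrivateIpAddress,
        .LaunchTime,
        .State.Name,
        ((.Tags[]? | select(.Key == "Owner") | .Value) // "") ]
    | @csv
  ' | sed 's/"//g'
}

# Intended use (needs AWS credentials, so it is left commented out):
# readarray -t aws_instance_list < <(aws ec2 describe-instances | instances_to_csv)
```

If the rows are produced this way, the header that includes Owner should line up for every row, and the split into two files below becomes optional.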

Define two headers, one with Owner and one without:

echo "Host Name,Instance ID,Private IP,Launch Time, Instance State, Owner, AWS Account Name, Account Number" >> "$ownerfile"
echo "Host Name,Instance ID,Private IP,Launch Time, Instance State, AWS Account Name, Account Number" >> "$nofile"

Once this is done, you can take one of two approaches.

Loop over the array of owners:

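for (( a = 0; a < ${#owners[@]}; a++ )); do
  # Quote the owner so names containing spaces stay one pattern; -F matches it literally
  grep -w -F -- "${owners[a]}" values.txt >> "$ownerfile"
done
# Rows that match none of the owners; running grep -v inside the loop would append duplicates
grep -v -w -F -f <(printf '%s\n' "${owners[@]}") values.txt >> "$nofile"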

Or, in your own loop, grab the value of the owner column and check whether it is present in the owners array, like this:

if [[ " ${owners[*]} " =~ " ${value} " ]]; then
    echo "$instance_info,$aws_account,$aws_account_number" >> "$ownerfile"
else
    echo "$instance_info,$aws_account,$aws_account_number" >> "$nofile"
fi
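The pattern match above treats the array as one space-joined string, so a value such as `Eric` would also match inside `Eric Somebody`. A minimal sketch of an exact membership test follows. `in_list` is a hypothetical helper, and the owner is assumed to be the sixth field of the instance row when it is present.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the first argument equals any later argument.
in_list() {
  local needle=$1 item
  shift
  for item in "$@"; do
    [[ "$item" == "$needle" ]] && return 0
  done
  return 1
}

# Example owners, taken from the sample output in the question
owners=("llindsay" "nalkema" "Eric Somebody")

# Split the row on commas; the last variable receives the sixth field (the owner).
row='UAWSCDAP0001,i-5e31c15f,10.48.131.176,2018-10-23T17:23:21.000Z,running,Eric Somebody'
IFS=',' read -r _ _ _ _ _ value <<< "$row"

if in_list "$value" "${owners[@]}"; then
  echo "owner present: $value"
else
  echo "no known owner"
fi
```

For a row with only five fields, `value` ends up empty and the test fails, so the row would go to the file without the Owner column.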

Someone more versed in awk could reduce this further by checking the value of the column directly and printing what is needed, but that is beyond me.

Here is an example of how to check the value of a column, in case you or someone else knows how to take it from here:

awk -F ',' '$6 == "owner" {print $0}' values.txt
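Since rows without an owner have one field fewer, awk can also split the file by field count rather than by comparing a column value. This is only a sketch, assuming each row of values.txt has five instance fields, an optional owner, then the account name and account number. `split_by_owner` is a hypothetical function name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append rows with an owner column to one file and
# rows without it to another, based on the number of comma-separated fields.
split_by_owner() {
  local infile=$1 ownerfile=$2 nofile=$3
  awk -F ',' -v with="$ownerfile" -v without="$nofile" '
    NF == 8 { print >> with;    next }   # owner column present
    NF == 7 { print >> without; next }   # owner column missing
            { print "unexpected field count: " $0 > "/dev/stderr" }
  ' "$infile"
}

# Intended use with the files from the steps above:
# split_by_owner values.txt "$ownerfile" "$nofile"
```

Because the rows are appended, the two header lines written earlier stay at the top of each file.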

BR

Upvotes: 1
