Joel Deleep

Reputation: 1398

Create files matching a column value

I have a text file similar to the one below:

https://hosted.example.com [403]
https://checkout.example.com [200]
https://lib.example.com [403]
http://autodiscover.example.com [301]
https://go.example.com [503]
https://qa.example.com [403]
https://qalib.example.com [403]
https://join.example.com [200]

Every line has a URL in the first column and a status code in the second, as in the sample above. I want to split the file by the second column: a line whose status code is 200 should go to 200.txt, and likewise for every other status code.

Script used:

list=('[403] [401] [404] [503] [301] [302] [200]')
for i in $list
do
filename=$(echo $i | cut -d "[" -f2 | cut -d "]" -f1)
awk '$2 == "$i"' file.txt | awk '{print $1}' > $filename.txt
done

The problem is that the script creates the files, but the values from column 1 are not written to them.

Upvotes: 3

Views: 198

Answers (4)

RavinderSingh13

Reputation: 133528

With the samples you have shown, please try the following:

awk '
/^https?/ && match($0,/\[[0-9]+\]/){
  outputFile=substr($0,RSTART+1,RLENGTH-2)".txt"
}    
NF{
  print >> (outputFile)
  close(outputFile)
}
' Input_file

Upvotes: 1

Ed Morton

Reputation: 203665

awk -F'[][]' '{close(out); out=$(NF-1)".txt"; sub(/ [^ ]+$/,""); print >> out}' file.txt

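The above allows for [ and/or ] in the URL and, because it closes each output file before opening the next, won't fail with "too many open files" once the number of distinct status codes exceeds the open-file limit of non-gawk awks.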

Regarding '$2 == "$i"' in the script in your question, see "How do I use shell variables in an awk script?".
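
As a minimal sketch of that point: the shell does not expand $i inside single quotes, so awk compares $2 with the literal string "$i" and nothing matches. Passing the value in with -v is one common fix. The three sample lines and the awk variable name want are assumptions for illustration; run this in an empty scratch directory, since it writes file.txt, 200.txt and 403.txt.

```shell
#!/usr/bin/env bash
# Illustrative sample data (an assumption, trimmed from the question).
printf '%s\n' \
  'https://hosted.example.com [403]' \
  'https://checkout.example.com [200]' \
  'https://join.example.com [200]' > file.txt

for code in 403 200; do
  # -v copies the shell value into the awk variable "want";
  # the awk program itself stays inside single quotes.
  awk -v want="[$code]" '$2 == want { print $1 }' file.txt > "$code.txt"
done
```

This still scans the input once per status code, so the single-pass awk command above remains the better choice for large files.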

Upvotes: 2

Jetchisel

Reputation: 7791

This is my answer, as far as I understand the question.

#!/usr/bin/env bash

while read -r col1 col2; do
  printf '%s %s\n' "$col1" "$col2" > "${col2//[][]}".txt
done < file.txt 

Note that > overwrites an existing file, so when the value inside [ ] occurs more than once, each file ends up holding only the last matching line.
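
If every matching line should be kept, one option is to append with >> instead. This is a minimal sketch, assuming it is acceptable to delete output left over from an earlier run first; the sample lines and the variable name code are illustrative. It strips the brackets with the portable ${var#...} and ${var%...} expansions rather than the bash-only substitution used above. Run it in an empty scratch directory.

```shell
#!/usr/bin/env bash
# Illustrative sample data (an assumption, trimmed from the question).
printf '%s\n' \
  'https://hosted.example.com [403]' \
  'https://checkout.example.com [200]' \
  'https://lib.example.com [403]' > file.txt

# Remove output from an earlier run so appended lines do not pile up.
rm -f 200.txt 403.txt

while read -r col1 col2; do
  code=${col2#"["}   # drop the leading [
  code=${code%"]"}   # drop the trailing ]
  # Append, so repeated status codes accumulate in the same file.
  printf '%s %s\n' "$col1" "$col2" >> "$code.txt"
done < file.txt
```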

Upvotes: 1

anubhava

Reputation: 785246

This awk should work for you:

awk '{gsub(/[][]/, "", $2); print $1 > ($2 ".txt")}' file

ls *.txt
200.txt  301.txt  403.txt  503.txt
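
One detail worth noting: inside awk, > truncates a file only the first time it is opened during a run, so later lines with the same status code are added to it rather than replacing it. A small sketch of that behaviour, using two lines from the question saved under the assumed name sample.txt (run it in an empty scratch directory):

```shell
#!/usr/bin/env bash
# Two sample lines that share the same status code.
printf '%s\n' \
  'https://qa.example.com [403]' \
  'https://qalib.example.com [403]' > sample.txt

# Same command as above: strip the brackets from $2, write $1 to <code>.txt.
awk '{gsub(/[][]/, "", $2); print $1 > ($2 ".txt")}' sample.txt

# 403.txt now holds both URLs, one per line.
cat 403.txt
```

Because the output files stay open until awk exits, an input with very many distinct status codes could hit the open-file limit in some awks.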

Upvotes: 1
