Reputation: 263
I have a list in a file named Target_id_convert.txt
70S ribosome
ALK tyrosine kinase receptor
ATP
ATP synthase
Desired output
('70S ribosome','ALK tyrosine kinase receptor','ATP','ATP synthase')
I have written this code
sed -e "s/'/'\\\\''/g;s/\(.*\)/'\1'/" Target_id_convert.txt > Target_id_convert1.txt
tr '\n' ',' < Target_id_convert1.txt > Target_id_convert_output.txt
I then have to manually edit Target_id_convert_output.txt to add the (). Please let me know how to do this efficiently and in one go, as it is all supposed to be automated.
Upvotes: 2
Views: 123
Reputation: 37394
In awk:
$ awk 'BEGIN{q="\047";RS="";FS="\n";OFS=q","q}{$0="(" q $0 q ")";$1=$1}1' file
Output for single list file:
('70S ribosome','ALK tyrosine kinase receptor','ATP','ATP synthase')
Explained:
awk '
BEGIN {
q="\047" # q holds a single quote (octal 047)
RS="" # paragraph mode, see below (*
FS="\n" # newline is the input field separator
OFS=q","q # output field separator: quote, comma, quote
}
{
$0="(" q $0 q ")" # wrap the record in parentheses and single quotes
$1=$1 # rebuild the record so that OFS replaces the newlines
} 1' file # print
*) From the GNU awk documentation: By a special dispensation, an empty string as the value of RS indicates that records are separated by one or more blank lines. When RS is set to the empty string, each record always ends at the first blank line encountered. The next record doesn’t start until the first nonblank line that follows. This allows lists separated by empty lines to be processed. For example, using @Thor's sample data, output would be:
('70S ribosome','ALK tyrosine kinase receptor','ATP','ATP synthase')
('70S ribosome','ALK tyrosine kinase receptor','ATP','ATP synthase')
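To see the paragraph-mode behaviour in isolation, here is a minimal sketch. The function name wrap_lists and the short sample input are made up for illustration; the awk settings are the same ones explained above.

```shell
# wrap_lists: hypothetical helper name; reads blank-line-separated lists on stdin
# and prints one parenthesised, quoted, comma-separated line per list.
wrap_lists() {
  awk 'BEGIN { q = "\047"; RS = ""; FS = "\n"; OFS = q "," q }
       { $1 = $1; print "(" q $0 q ")" }'
}

# Two lists separated by one blank line:
printf 'a\nb\n\nc d\ne\n' | wrap_lists
# ('a','b')
# ('c d','e')
```

Because FS is a newline only, an item containing spaces (such as "c d") stays a single field.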
Upvotes: 1
Reputation: 203229
Just set your field and record separators, recompile the record and print:
$ awk -v RS= -v s="('" -v ORS="')\n" -F'\n' -v OFS="','" '{$1=s$1}1' file
('70S ribosome','ALK tyrosine kinase receptor','ATP','ATP synthase')
Upvotes: 1
Reputation: 47099
Assuming your records are double-newline separated, I would go with a sed/awk combo:
<file sed "/[^[:blank:]]/ s/.*/'&'/g" |
awk '{ $1=$1; print "(" $0 ")" }' RS= FS='\n' OFS=,
If the input is:
70S ribosome
ALK tyrosine kinase receptor
ATP
ATP synthase

70S ribosome
ALK tyrosine kinase receptor
ATP
ATP synthase
Output is:
('70S ribosome','ALK tyrosine kinase receptor','ATP','ATP synthase')
('70S ribosome','ALK tyrosine kinase receptor','ATP','ATP synthase')
Upvotes: 3
Reputation: 437208
To offer an alternative that uses trl, a utility of mine for transforming text between single- and multi-line forms:
$ trl -S, -D\' -W'()' <<<$'70S ribosome\nALK tyrosine kinase receptor\nATP\nATP synthase'
('70S ribosome','ALK tyrosine kinase receptor','ATP','ATP synthase')
-S, sets the output separator to , (what to place between items)
-D\' sets the output item delimiter to ' (what to enclose each item in)
-W'()' wraps (encloses) the resulting output line in ( and ).
Installing trl from the npm registry (Linux and macOS)
Note: Even if you don't use Node.js, its package manager, npm, works across platforms and is easy to install; try
curl -L https://git.io/n-install | bash
With Node.js installed, install as follows:
[sudo] npm install trl -g
Note:
Whether you need sudo depends on how you installed Node.js and whether you've changed permissions later; if you get an EACCES error, try again with sudo.
-g ensures global installation and is needed to put trl in your system's $PATH.
Manual installation (any Unix platform with bash)
Download the bash script as trl.
Make it executable with chmod +x trl.
Move it or symlink it to a folder in your $PATH, such as /usr/local/bin (OSX) or /usr/bin (Linux).
Upvotes: 2
Reputation: 133458
try:
awk -v s1="'" -v s2="'," -v s3="(" -v s4=")" 'NR==1{printf("%s",s3)} last{printf("%s",s1 last s2)} {last=$0} END{printf("%s\n",s1 last s1 s4)}' Input_file
I define the variables s1, s2, s3 and s4 with the quote and bracket characters. On the very first line I print (. Each line is then stored in a variable named last, and once the next line arrives it is printed as 'value',. In the END section the final line is printed as 'value'). I am assuming your Input_file has the same form as the sample you showed.
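The deferred-print idea described above can also be sketched as a small shell function. The name join_quoted is made up for illustration, and the test on NR instead of on the value of last is my own variation, so that a line consisting of 0 (or an empty line) is not skipped.

```shell
# join_quoted: hypothetical helper name; reads one item per line on stdin.
join_quoted() {
  awk -v q="'" '
    NR == 1 { printf "(" }                     # opening bracket, printed once
    NR > 1  { printf "%s%s%s,", q, last, q }   # previous line is not the final one
            { last = $0 }                      # hold the current line back
    END     { if (NR) printf "%s%s%s)\n", q, last, q }  # final line closes the list
  '
}

printf '70S ribosome\nATP\n' | join_quoted
# ('70S ribosome','ATP')
```

Holding each line back by one step is what lets the script decide between a trailing comma and the closing bracket without counting the lines first.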
Upvotes: 1
Reputation: 195039
This awk one-liner should do what you want:
awk -v q="'" '{$0=q $0 q;printf "%s%s", (NR==1?"(":","),$0}END{print ")"}' file
I declared a variable q that holds a single quote (') to avoid a lot of escaping.
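For comparison, here is a short sketch of what the escaping looks like with and without such a variable. The sample input ATP is only for illustration; all three commands should print the same quoted string.

```shell
# 1. Pass the quote in as an awk variable (the approach used above).
echo ATP | awk -v q="'" '{ print q $0 q }'

# 2. Use the octal escape for a single quote inside an awk string.
echo ATP | awk '{ print "\047" $0 "\047" }'

# 3. Close the shell quoting, add an escaped quote, and reopen it.
echo ATP | awk '{ print "'\''" $0 "'\''" }'

# Each command prints: 'ATP'
```

The third form works but is hard to read, which is the escaping the variable avoids.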
Upvotes: 5
Reputation: 3137
Try this -
$ cat f
70S ribosome
ALK tyrosine kinase receptor
ATP
ATP synthase
$ awk -v line="$(wc -l < f)" -v ORS="" 'BEGIN{printf "("} {if(NR < line) {print a$0b}} END {print a$0a")\n"}' b="'," a="'" f
('70S ribosome','ALK tyrosine kinase receptor','ATP','ATP synthase')
Upvotes: 1
Reputation: 5768
$ cat f.awk
BEGIN {
sep = ""
b = "'"
}
{
ans = ans sep b $0 b
sep = ","
}
END { print "(" ans ")" }
Usage:
awk -f f.awk file
Upvotes: 2