Reputation: 59
Well, I have about 114 files that I want to join side by side on the first column, which every file shares and which holds the ID number. Each file consists of 2 columns and over 400,000 lines. I used write.table to join those tables together into one table, and I got X's in my header. For example, my header should look like this:
ID 1_sample1 2_sample2 3_sample3
But I get it like this:
ID X1_sample1 X2_sample2 X3_sample3
I read about this problem and found out that check.names gets rid of it, but in my case, when I use check.names, I get the following error:
"unused argument (check.name = F)"
Thus, I decided to use sed to fix the problem. It actually works great, BUT it joins the 1st and 2nd lines. For instance, my first and second lines should look something like this:
ID 1_sample1 2_sample2 3_sample
cg123 .0235 2.156 -5.546
But I get the following instead:
ID 1_sample1 2_sample2 3_sample cg123 .0235 2.156 -5.546
Can anyone check this code for me, please? I might've done something wrong, because the lines don't stay separated from each other.
head -n 1 inFILE | tr "\t" "\n" | sed -e 's/^X//g' | sed -e 's/\./-/' | sed -e 's/\./(/' |sed -e 's/\./)/' | tr "\n" "\t" > outFILE
tail -n +2 beta.norm.txt >> outFILE
Upvotes: 1
Views: 2917
Reputation: 46415
If your data is tab delimited, the simple fix would be
sed '1,1s/\tX/\t/g' < inputfile > outputfile
1,1 only operate on the range "line 1 to line 1"
\tX find tab followed by X
/\t/ replace with tab
g all occurrences
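As a quick check of the command above, here is a minimal sketch on a made-up two-line file. The file names and sample values are hypothetical, and a literal tab is kept in a shell variable because \t inside a sed pattern is not understood by every sed implementation.

```shell
# Hypothetical sample file: tab-delimited, header mangled with leading X's.
workdir=$(mktemp -d)
printf 'ID\tX1_sample1\tX2_sample2\ncg123\t.0235\t2.156\n' > "$workdir/in.tsv"

# A literal tab held in a variable avoids relying on sed understanding \t.
TAB=$(printf '\t')

# Line 1 only: replace every "tab followed by X" with a plain tab.
sed "1s/${TAB}X/${TAB}/g" "$workdir/in.tsv" > "$workdir/out.tsv"

cat "$workdir/out.tsv"
```

Only the header is touched; the data line should come out unchanged.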
It does seem as though your original attempt does more than just strip the X: it also changes the first three dots in each header field to -, ( and ), but you don't show in your example why you need that. The reason your code joins the first two lines is that your last tr command replaces every \n with \t, which leaves you with no \n at the end of the line.
You need to attach a \n at the end of your first line before concatenating lines 2 and beyond with your second command. Experiment with
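head -n 1 inFILE | tr "\t" "\n" | sed -e 's/^X//g' | sed -e 's/\./-/' | sed -e 's/\./(/' | sed -e 's/\./)/' | tr "\n" "\t" > outFILE
printf '\n' >> outFILE
tail -n +2 beta.norm.txt >> outFILE
printf '\n' is used rather than echo "\n" because echo's handling of backslash escapes depends on your shell and OS (some print a literal \n). Note that the header line still ends with a trailing tab before the newline, since the last tr turns the final \n into \t. There are other ways to add a newline...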
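One of those other ways, sketched here under assumptions: let paste -s rejoin the header fields instead of the final tr. It joins lines with tabs and ends the result with a newline, so neither the extra newline step nor the trailing tab arises. The file names and the sample header are made up for illustration.

```shell
# Hypothetical sample file with dots in one header field.
workdir=$(mktemp -d)
printf 'ID\tX1.sample.a.b\tX2_sample2\ncg123\t.0235\t2.156\n' > "$workdir/in.tsv"

{
  # Header: one field per line, clean each field, then rejoin with tabs.
  # paste -s joins the lines with tabs and terminates the result with a newline.
  head -n 1 "$workdir/in.tsv" | tr '\t' '\n' \
    | sed -e 's/^X//' -e 's/\./-/' -e 's/\./(/' -e 's/\./)/' \
    | paste -s -
  # Body: everything from line 2 onward, unchanged.
  tail -n +2 "$workdir/in.tsv"
} > "$workdir/out.tsv"

cat "$workdir/out.tsv"
```

Grouping both commands in braces writes the header and the body through a single redirection, so there is no separate append step to get out of sync.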
Edit: using awk is probably much cleaner, for example:
awk '(NR==1){gsub(" X"," ", $0);}{print;}' inputFile > outputFile
Explanation:
(NR==1) for the first line only (record number == 1) do:
{gsub(" X"," ", $0);} do a global substitution of "space followed by X" with "space"
for all lines (including the one that was just modified) do:
{print;} print the whole line
This version assumes the columns are separated by spaces; for tab-delimited data, use gsub("\tX","\t", $0) instead.
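A field-based variant may be a little more robust, since it only strips an X at the very start of a header field rather than any X that follows a separator. This is a sketch, assuming tab-delimited input and that the first column (the ID) should be left alone; the file names and sample values are hypothetical.

```shell
# Hypothetical sample file: tab-delimited, header mangled with leading X's.
workdir=$(mktemp -d)
printf 'ID\tX1_sample1\tX2_sample2\ncg123\t.0235\t2.156\n' > "$workdir/in.tsv"

awk 'BEGIN { FS = OFS = "\t" }   # read and write tab-separated fields
     NR == 1 {                   # header line only
       # Strip one leading X from every field except the ID column.
       for (i = 2; i <= NF; i++) sub(/^X/, "", $i)
     }
     { print }                   # print every line, modified or not
    ' "$workdir/in.tsv" > "$workdir/out.tsv"

cat "$workdir/out.tsv"
```

Setting OFS alongside FS matters here: assigning to a field makes awk rebuild the line using OFS, so the header stays tab-delimited.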
Upvotes: 2