Manu

Reputation: 5784

Combine two files in Linux without repetition

I have two files, file1 and file2.

The contents of file1 are:

Hello
  how
are you
when can i meet you
film??

The contents of file2 are:

Hello 
how 
are you
darling
when can i meet you

I want to generate a file that combines the two files without repeating any line, like this:

Hello
how
are you
darling
when can i meet you
film??

Note: The leading spaces in the second line of file1 should be ignored in the final file. Is there any built-in function in C or Linux to do this, or can a script be written to do it?

Upvotes: 6

Views: 271

Answers (3)

Steve

Reputation: 54392

Here's one way using awk:

awk '{ gsub(/^[ \t]+|[ \t]+$/,"") } !a[$0]++' file2 file1

Results:

Hello
how
are you
darling
when can i meet you
film??
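To try this on the question's data, here is a minimal reproduction sketch. It assumes a POSIX shell with `mktemp` available; the scratch directory and the `merged` variable are only illustrative. The `gsub` call trims leading and trailing blanks from each line, and `!a[$0]++` is true only the first time a given line is seen, so later duplicates are skipped while the original order is kept.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
printf '%s\n' 'Hello' '  how' 'are you' 'when can i meet you' 'film??' > "$workdir/file1"
printf '%s\n' 'Hello ' 'how ' 'are you' 'darling' 'when can i meet you' > "$workdir/file2"

# Trim each line, then print it only on its first occurrence.
# file2 is read first so that "darling" lands in its expected position.
merged=$(awk '{ gsub(/^[ \t]+|[ \t]+$/, "") } !a[$0]++' "$workdir/file2" "$workdir/file1")

printf '%s\n' "$merged"
rm -rf "$workdir"
```

Listing file2 before file1 matters here: lines are emitted in the order they are first read, so swapping the arguments would put "film??" before "darling".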

EDIT:

The problem with:

awk '{ $1=$1 } !a[$0]++' file2 file1

is that, although it works well for this simple example, it treats similar lines as identical: it not only removes leading and trailing whitespace, it also collapses extra whitespace between fields. For example, if file1 contains:

Hello
  how
are you
when  can i meet you
film??

Both the:

when can i meet you

and:

when  can i meet you

lines would be treated as the same line. This may be the desired result, but based on your question, I think it's best to strip only leading and trailing whitespace, as in the first command. HTH.
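The difference can be checked with a small sketch. The two-line `sample` below is a hypothetical input built just for this comparison, and the variable names are illustrative.

```shell
# Hypothetical sample: the lines differ only by a double space after "when".
sample='when can i meet you
when  can i meet you'

# Trimming the ends leaves the internal spacing alone, so both lines stay distinct.
trimmed=$(printf '%s\n' "$sample" | awk '{ gsub(/^[ \t]+|[ \t]+$/, "") } !a[$0]++')

# Assigning $1 = $1 rebuilds the record with single spaces, so the lines become equal.
rebuilt=$(printf '%s\n' "$sample" | awk '{ $1 = $1 } !a[$0]++')

printf 'trim only: %s line(s)\n' "$(printf '%s\n' "$trimmed" | awk 'END { print NR }')"
printf 'rebuilt:   %s line(s)\n' "$(printf '%s\n' "$rebuilt" | awk 'END { print NR }')"
```

The trim-only command keeps both lines, while the `$1 = $1` variant keeps a single normalized line.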

Upvotes: 1

Chris Seymour

Reputation: 85775

Easy job for awk:

$ awk '{$1=$1}!u[$0]++' file2 file1
Hello
how
are you
darling
when can i meet you
film??

Or if you don't care about the order of the output:

$ sed 's/^\s*//; s/\s*$//' file1 file2 | sort -u
are you
darling
film??
Hello
how
when can i meet you

Upvotes: 4

anumi

Reputation: 3947

You can apply several standard filters:

cat file1 file2 | perl -pe 's/^\s+//' | sort | uniq
  • cat is used to concatenate all the required files,
  • perl is used to remove all leading white space,
  • sort sorts all lines,
  • and uniq removes the duplicate lines.
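The pipeline above strips only leading white space, and sorting discards the original line order. As a hedged variant, the sketch below wraps the same idea in a function that also trims trailing white space using POSIX character classes. The name `merge_sorted` is hypothetical, and `LC_ALL=C` is an added assumption to make the sort order byte-wise and predictable across systems.

```shell
# Hypothetical helper: merge any number of files into sorted, unique, trimmed lines.
merge_sorted() {
    # sed reads all files in sequence and trims both ends of every line;
    # sort -u then sorts and drops duplicates in one step (same effect as sort | uniq).
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' "$@" \
        | LC_ALL=C sort -u
}
```

It would be called as `merge_sorted file1 file2 > combined`, where `combined` is an illustrative output name. In the C locale, uppercase letters sort before lowercase ones, so "Hello" comes first rather than after "film??".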

Upvotes: 1
