Reputation: 103
I need to compare two .txt files with the following formats, using R:
rows in file1:
1-11!AIVDM,1,1,,B,11b4N?@P?w<tSF0l4Q@>4?wp1`Oo,0*3D
1347204643
2-12$GPRMC,153102,A,6300.774,N,05238.627,W,12.9,186,090912,30,W*79
1347204664
(Here, for some reason, the time (e.g. 1347204643) is on a separate row.)
rows in file2:
#1:1347204643:11!AIVDM,1,1,,B,11b4N?@P?w<tSF0l4Q@>4?wp1`Oo,0*3D
#2:1347204664:12$GPRMC,153102,A,6300.774,N,05238.627,W,12.9,186,090912,30,W*79
I am only interested in verifying whether the same ID, which is at the beginning of each row (e.g. 1 and 2 here), exists in both files (i.e. whether an ID that exists in file1 also exists in file2).
Can someone help me with this? Thank you very much in advance!
Upvotes: 0
Views: 108
Reputation: 121568
You can do something like this:
First, read the two files using readLines (a textConnection stands in for the real files here):
ll1 <- readLines(textConnection('#1:1347204643:11!AIVDM,1,1,,B,11b4N?@P?w<tSF0l4Q@>4?wp1`Oo,0*3D
#2:1347204664:12$GPRMC,153102,A,6300.774,N,05238.627,W,12.9,186,090912,30,W*79'))
ll2 <- readLines(textConnection('1-11!AIVDM,1,1,,B,11b4N?@P?w<tSF0l4Q@>4?wp1`Oo,0*3D
1347204643
2-12$GPRMC,153102,A,6300.774,N,05238.627,W,12.9,186,090912,30,W*79
1347204664'))
Then do some preprocessing:
# Remove '#' from the lines of the first vector
ll1 <- gsub('#','',ll1)
#Take only the odd lines from the second file
ll2 <- ll2[c(TRUE,FALSE)]
Extract the ID of each line using substr (this keeps only the first character, so it works as long as the IDs have a single digit):
ll1 <- substr(ll1,1,1)
ll2 <- substr(ll2,1,1)
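As a side note, if the IDs can have more than one digit, the leading run of digits can be captured with a regular expression instead of a fixed-width substr. This is only a sketch: the sample lines and the names file1_lines, file2_lines, ids1 and ids2 are made up for illustration, and it assumes every line of interest starts with the ID (optionally preceded by '#').

```r
# Hypothetical sample lines with a two-digit ID (not taken from the real files)
file2_lines <- c("#1:1347204643:11!AIVDM,1,1", "#12:1347204664:12$GPRMC,153102")
file1_lines <- c("1-11!AIVDM,1,1", "12-12$GPRMC,153102")

# Keep only the leading run of digits, skipping an optional leading '#'
id_pattern <- "^#?([0-9]+).*$"
ids2 <- sub(id_pattern, "\\1", file2_lines)
ids1 <- sub(id_pattern, "\\1", file1_lines)

ids2  # "1" "12"
ids1  # "1" "12"
```

With this approach the gsub step that strips '#' is not needed, since the pattern already skips it.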
Now you have these two character vectors:
> ll1
[1] "1" "2"
> ll2
[1] "1" "2"
To compare, you can use match, which returns the position of each element of ll1 in ll2 (or NA if it is not found):
match(ll1,ll2)
[1] 1 2
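If only a yes/no answer per ID is needed, %in% and setdiff may be more convenient than match. A minimal sketch, assuming the IDs have already been extracted into character vectors; the names ids_file1 and ids_file2 and the extra ID "3" are made up for illustration:

```r
# Hypothetical ID vectors; "3" is an invented ID that is missing from file2
ids_file1 <- c("1", "2", "3")
ids_file2 <- c("1", "2")

# For each ID in file1: does it also exist in file2?
found <- ids_file1 %in% ids_file2
found                                      # TRUE TRUE FALSE

# IDs that are in file1 but not in file2
missing_ids <- setdiff(ids_file1, ids_file2)
missing_ids                                # "3"

# Single overall check
all_present <- all(found)
all_present                                # FALSE
```

Note that the direction matters: in the answer above, ll1 holds the IDs of the '#'-prefixed file (file2 in the question) and ll2 those of file1, so checking file1 against file2 would be ll2 %in% ll1.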
Upvotes: 1