Reputation:
I need to create a function that intersects two vectors: vectorA ∩ vectorB
and allocates the values found in this intersection in a vectorC
!
Rules:
(a) The complexity of the function/method must be:
O(n), if n> m
O(m), if m> n
(b) The complexity of the program (function / main method) must be:
O(n log n), if n > m
O(m log m), if m > n
c) The signing of the function of intersection shall be obligatorily:
void intersections (char a [], int n, char b [], int m, char c [], int * k)
I was thinking of using an algorithm of complexity O(n) which is called "Linear Time". However this algorithm makes it "linear" comparison obviously as the name says.
Example:
A = { 'E', 'C', 'B', 'J', 'S', 'F', 'C', 'V', 'G' }
B = { 'G', 'C', 'M', 'W', 'L', 'O' }
C = { 'G', 'C' }
How far have I come?
Currently I can fetch and compare the corresponding values for the intersection.
What is my difficulty?
1) The values after the comparison can not be added to the vector
char c[]
with repetitions. How to prevent the same?2) How to use the pointer
int *k
(which is the size of the vectorchar c[]
) and allocate the corresponding values without repetitions to it?
bool checkHasStringEqual(char vectorA, char vectorB) {
string stringA, stringB;
stringA = toupper(vectorA),
stringB = toupper(vectorB);
size_t found = stringA.find(stringB);
return (found != std::string::npos);
}
void intersection(char a[], char b[], char c[], int n, int m, int *k){
int indexA = 0, indexB = 0, counterIntersection = 0;
if(n > m) {
while(indexA < n) {
if(indexB != m) {
if(checkHasStringEqual(a[indexA], b[indexB])) {
cout << "STRING A: " << a[indexA] << " == " << "STRING B: " << b[indexB] << endl;
counterIntersection++;
}
cout << "( " << a[indexA] << "-->" << b[indexB] << ")" << " -- " << "( " << indexA << ", " << indexB << ")" << endl;
(indexA == n -1) ? (indexA = 0, indexB++) : indexA++;
} else {
cout << "CAME TO THE END OF THE ANALYSIS" << endl;
break;
}
}
}
if(m > n) {
//TODO
}
}
Upvotes: 0
Views: 195
Reputation: 66204
The stated task is doable in linear time using a single lookup table, even without sorting, due to the significantly restricted domain size being managed. char
has at-most 1 << CHAR_BIT
distinct representations. On almost all systems that is simply 8-bits, or 256 possible values. A char16_t would use 16 bits, or 65536 possible representations, and of course, a char32_t, would have over 2-billion representations.
I'm assuming we're in the domain of the first one, or even two, of these, and not the last. Given that, this is doable with a single markup table indexed by all possible values within the domain. The algorithm is simple:
1 << CHAR_BIT
, all initialized to false
.unsigned char
equivalent as an index into the markup table, setting the table value to true
unsigned char
equivalent value as an index into he markup table. If the value is true
, clear it to false
and add that character to the output sequence.When finished, the result will be a unique-set-intersection of the two input character sequences. An example is given below. As it (and many other things) was not specified, I took liberty to make k
an in/out argument. On input it refers to the maximum number of chars that can fit in C[]
, On output it contains the number of chars actually stored.
#include <iostream>
#include <climits>
void mk_intersection(char a[], int n, char b[], int m, char c[], int * k)
{
bool tbl[1 << CHAR_BIT] = { 0 };
int o = *k;
*k = 0;
// markup a[]
for (int i = 0; i < n; ++i)
tbl[static_cast<unsigned char>(a[i])] = true;
// filter b[] from markup
for (int i = 0; i < m && *k < o; ++i)
{
if (tbl[static_cast<unsigned char>(b[i])])
{
tbl[static_cast<unsigned char>(b[i])] = false;
c[(*k)++] = b[i];
}
}
}
int main(int argc, char *argv[])
{
char A[] = { 'E', 'C', 'B', 'J', 'S', 'F', 'C', 'V', 'G' };
char B[] = { 'G', 'C', 'M', 'W', 'L', 'O' };
char C[std::max(sizeof A, sizeof B)] = { 0 };
int c_len = sizeof C;
mk_intersection(A, sizeof A, B, sizeof B, C, &c_len);
for (int i = 0; i < c_len; ++i)
std::cout.put(C[i]);
std::cout.put('\n');
return 0;
}
Output
GC
That's it. Regarding your question of "how to make main
O(n log n), frankly, that's nonsense. It implies that you can, at your discretion, presort the input before invoking your intersection operation. In doing so you can implement your function using a simple one-pass merge, which would work, and not have the domain-size limit I've described earlier. But for this example it is neither needed, nor warranted. It isn't required. And frankly, both arguments a[]
and b[]
could (and should) be const
.
Upvotes: 1
Reputation: 2318
If you don't worry about using too much memory, you can initialize 2 256-element integer arrays a_set
, b_set
which will be used to keep track of the occurrence of each character from 0-255 in array a
and b
, respectively. By using a_set
, you will add any character chr
from b
to c
if and only if a_set[chr]
is equal to 1 and b_set[chr]
is equal to 0. After adding chr
to c
, you will set b_set[chr]=1;
in order to avoid repetitions of characters.
The time complexity for this approach is O(n + m) --> O(max(n,m))
.
I also added the main
method below so you can check if it is your desired output for C
and k
.
#include <iostream>
using namespace std;
bool checkHasStringEqual(char vectorA, char vectorB) {
return toupper(vectorA) == toupper(vectorB);
}
void intersection(char a[], char b[], char c[], int n, int m, int *k){
int indexA = 0, indexB = 0, counterIntersection = 0;
unsigned char chr;
// unsigned char can only go from 0-255
int a_set[256] = {0}; // initialize array to 0
int b_set[256] = {0}; // initialize array to 0
*k = 0; // initialize k value to 0
for (int i = 0; i < n; ++i) {
chr = a[i];
a_set[chr] = 1;
}
for (int i = 0; i < m; ++i) {
chr = b[i];
if (a_set[chr] && !b_set[chr]) {
c[*k] = b[i];
(*k)++; // increase k index
}
b_set[chr] = 1; // mark character as inserted
}
}
int main() {
int n = 9;
int m = 6;
int k;
char A[n] = { 'E', 'C', 'B', 'J', 'S', 'F', 'C', 'V', 'G' };
char B[m] = { 'G', 'C', 'M', 'W', 'L', 'O' };
char C[max(n, m)]; // C will contain at most maximum of n, m elements
intersection(A, B, C, n, m, &k);
cout << "Intersection: " << endl;
for (int i = 0; i < k; ++i) {
cout << C[i] << endl;
}
return 0;
}
Upvotes: 0