Reputation: 49
I have a function that finds and prints the longest common chain between two DNA chains. However, I want to add some checks so that my program ignores characters that are not bases ('A', 'T', 'C', 'G'). For example, CCAATTFFA and CCAATTKA have CCAATTA in common. Here is my code:
void CommonSubStr(char *X, char *Y, long int m, long int n) {
    long int maxCommonChain = 0;
    long int end = 0;
    for (long int i = 0; i < m; i++) {
        for (long int j = 0; j < n; j++) {
            long int currentLength = 0;
            long int x = i, y = j;
            while (x < m && y < n && X[x] == Y[y]) {
                currentLength++;
                x++;
                y++;
            }
            if (currentLength > maxCommonChain) {
                maxCommonChain = currentLength;
                end = i + maxCommonChain - 1;
            }
        }
    }
    if (maxCommonChain == 0) {
        printf("No common substring found.\n");
        return;
    }
    long int start = end - maxCommonChain + 1;
    for (long int i = start; i <= end; i++) {
        if (X[i] == 'A' || X[i] == 'C' || X[i] == 'G' || X[i] == 'T') {
            printf("%c", X[i]);
        }
    }
    printf("\n");
}
Can anyone help me with the checks I should add? I've tried a lot of checks in the while loop, but none of them work.
Upvotes: 1
Views: 76
Reputation: 311126
If you are dealing with strings, then the third and fourth parameters of the function should be removed, because the end of a string is already marked by its terminating zero character.
As the passed strings are not changed within the function, the corresponding parameters should be declared with the qualifier const.
The function should do only one thing: determine the common prefix. So it should return some result. The result could be, for example, pointers just past the last equal characters of the common prefix in both strings. As you just want to output the common prefix, the function could instead return a string that contains the common prefix.
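The pointer-returning design mentioned above is not implemented in this answer, so here is a minimal sketch of how it might look. The name common_prefix_ends and the output parameters end1 and end2 are assumptions made for illustration; accept plays the role of the "ATCG" set.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical variant (an illustration, not the answer's function):
   returns the length of the common prefix of s1 and s2 when characters
   outside accept are skipped. *end1 and *end2 receive pointers just past
   the last matching character in each string; if nothing matches they
   are left pointing at the starts of the strings. */
size_t common_prefix_ends(const char *s1, const char *s2, const char *accept,
                          const char **end1, const char **end2)
{
    size_t n = 0;
    const char *p1 = s1, *p2 = s2;

    *end1 = s1;
    *end2 = s2;

    /* strpbrk jumps to the next accepted character or yields NULL. */
    while ((p1 = strpbrk(p1, accept)) != NULL &&
           (p2 = strpbrk(p2, accept)) != NULL &&
           *p1 == *p2)
    {
        ++n;
        *end1 = ++p1;
        *end2 = ++p2;
    }
    return n;
}
```

For "CCAATTFFA" and "CCAATTKA" with accept set to "ATCG", this returns 7 and both end pointers land on the terminating zero characters, so the caller decides what to do with the result instead of the function printing it.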
Within your function, the three nested loops at the start do not make sense. In general, your code is unclear and, like any unclear code, it has logical errors.
I can suggest the following function declaration and its definition as shown in the demonstration program below.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char * CommonSubStr( const char *s1, const char *s2, const char *s3 )
{
    size_t n = 0;

    for (const char *p1 = s1, *p2 = s2;
         ( p1 = strpbrk( p1, s3 ) ) != NULL &&
         ( p2 = strpbrk( p2, s3 ) ) != NULL &&
         *p1 == *p2;
         ++p1, ++p2)
    {
        ++n;
    }

    char *common_prefix = calloc( n + 1, sizeof( char ) );

    if (common_prefix != NULL)
    {
        char *current = common_prefix;

        for (const char *p1 = s1; n--; )
        {
            p1 = strpbrk( p1, s3 );
            *current++ = *p1++;
        }
    }

    return common_prefix;
}

int main( void )
{
    char *common_prefix = CommonSubStr( "CCAATTFFA", "CCAATTKA", "ATCG" );

    if (common_prefix != NULL) printf( "\"%s\"\n", common_prefix );

    free( common_prefix );
}
The program output is
"CCAATTA"
If the common prefix is empty the function returns an empty string.
With the function shown above, it is easy to write a function that simply writes the common prefix of two strings to any stream.
Here you are.
#include <stdio.h>
#include <string.h>

FILE * CommonSubStr( const char *s1, const char *s2, const char *s3, FILE *fp )
{
    while( ( s1 = strpbrk( s1, s3 ) ) != NULL &&
           ( s2 = strpbrk( s2, s3 ) ) != NULL &&
           *s1 == *s2 )
    {
        fputc( *s1, fp );
        ++s1;
        ++s2;
    }

    return fp;
}

int main( void )
{
    fputc( '\n', CommonSubStr( "CCAATTFFA", "CCAATTKA", "ATCG", stdout ) );
}
The program output is
CCAATTA
As you can see, the function implementation is very simple. There is only one loop, thanks to the standard C string function strpbrk.
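To see what strpbrk contributes on its own, here is a small sketch that uses it only to filter a single string. The helper name filter_chars and its buffer parameters are assumptions for illustration and do not appear in the functions above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies into dst only those characters of src that
   occur in accept. strpbrk returns a pointer to the first character of
   src that is in accept, or NULL if there is none, so each call skips a
   whole run of unwanted characters. At most dst_size - 1 characters are
   written, followed by '\0'. Returns the number of characters copied. */
size_t filter_chars(const char *src, const char *accept,
                    char *dst, size_t dst_size)
{
    size_t n = 0;

    if (dst_size == 0) return 0;

    while (n + 1 < dst_size && (src = strpbrk(src, accept)) != NULL)
    {
        dst[n++] = *src++;   /* keep the accepted character, move past it */
    }

    dst[n] = '\0';
    return n;
}
```

With a 16-character buffer, filtering "CCAATTFFA" against "ATCG" leaves "CCAATTA" in the buffer and returns 7. The functions above do the same skipping on both strings at once, which is why no separate filtering pass is needed.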
If you want to output a message when a common substring is not found, just add an additional variable, for example:
FILE * CommonSubStr( const char *s1, const char *s2, const char *s3, FILE *fp )
{
    size_t n = 0;

    while( ( s1 = strpbrk( s1, s3 ) ) != NULL &&
           ( s2 = strpbrk( s2, s3 ) ) != NULL &&
           *s1 == *s2 )
    {
        ++n;
        fputc( *s1, fp );
        ++s1;
        ++s2;
    }

    if ( n == 0 ) fprintf( fp, "%s", "No common substring found." );

    return fp;
}
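The functions above compare the two strings starting from their first characters. If the common chain may start anywhere in either string, as the nested loops in the question suggest, the same strpbrk loop can be tried from every pair of starting positions. This is only a brute-force sketch under that assumption; the name longest_common_run and the start parameter are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the longest run shared by s1 and s2 when
   characters outside accept are skipped in both strings. *start receives
   a pointer to the first character of that run inside s1 (or s1 itself
   when no run is found). Roughly cubic time, so meant for short inputs. */
size_t longest_common_run(const char *s1, const char *s2,
                          const char *accept, const char **start)
{
    size_t best = 0;
    *start = s1;

    /* a and b visit every accepted character of s1 and s2 in turn. */
    for (const char *a = strpbrk(s1, accept); a != NULL;
         a = strpbrk(a + 1, accept))
    {
        for (const char *b = strpbrk(s2, accept); b != NULL;
             b = strpbrk(b + 1, accept))
        {
            size_t n = 0;
            const char *p = a, *q = b;

            /* Same matching loop as in the answer, started at a and b. */
            while ((p = strpbrk(p, accept)) != NULL &&
                   (q = strpbrk(q, accept)) != NULL &&
                   *p == *q)
            {
                ++n;
                ++p;
                ++q;
            }

            if (n > best)
            {
                best = n;
                *start = a;
            }
        }
    }
    return best;
}
```

For "GGFACGT" and "TTACKGTA" with accept set to "ATCG", the result is 4 and start points at the 'A' of the first string. The run can then be written out by calling strpbrk from start that many times, in the same way the first CommonSubStr fills its buffer.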
Upvotes: 1