NoSleepTonight

Reputation: 51

How do I convert a long string into a smaller abbreviation consisting of the first character, last character and number of chars in between?

I'm preparing for my exams by working through past questions, and I came across this one:

c. Sometimes some words like "structured programming language" or "computer science and engineering" are so long that writing them many times in one text is quite tiresome. Let's consider a word too long if its length is strictly more than 10 characters. All too long words should be replaced with a special abbreviation. This abbreviation is made like this:
(i) The first and the last letter of a word
(ii) The number of letters (including spaces) between the first and the last letters.
Thus "structured programming language" will be spelled as "s29e" and "computer science and engineering" will be spelled as "c30g", otherwise the actual word will be printed. Now, construct a C program to implement the above scenario. The word length should not be more than 100. You can't use the built-in string functions.

It says to take a long string of less than 100 characters and convert it so that the final abbreviated form has its first character as the first character of the original string, the number of characters (including spaces) in between the first and last character and then end it off with the last character. We aren't allowed to use the built-in string functions.

My skill is still pretty beginner unfortunately, so my code might be far from efficient at the moment, but here's what I have so far:

#include <stdio.h>

int main()
{
    int len, n, i;
    char str[100], abb[5];
    char ws, we;

    gets(str);
    for (i = 0; str[i] != '\0'; i++)
    {
        len = i + 1;
    }
   
    if (len > 10)
    {
        ws = str[0];
        abb[0] = ws;
        for (i = 1; str[i] != '\0'; i++)
        {
            n = i - 1;
            we = str[i];
        }
    }
    else
        puts(str);
}

I've managed to find the first and last characters as well as the number of characters between the two, but I'm not sure how to place them into another string. I just did a slapdash abb[0] = ws for the first character, but I don't know how to move forward from there.

Upvotes: 2

Views: 166

Answers (3)

chqrlie

Reputation: 145277

There are problems in your code:

  • the assignment says "The word length should not be more than 100", which means a length of 100 is accepted. You should make the array one byte longer for the null terminator.

  • you must never use gets(): this function is inherently unsafe because any sufficiently long input will cause a buffer overflow, which can often be exploited by attackers. It has been removed from the C Standard. You should use fgets() instead, make the array one more byte longer to accommodate the trailing newline, and remove that newline after input.

  • you are not allowed to use the string functions from the C library, but you can write your own function to compute the length of the string and use that in your program.

  • The way you compute the length of the string is not foolproof: you only modify len if the string is not empty. If the string is empty, len is left uninitialized, so the rest of the code will have undefined behavior.

  • to convert a number into its representation as a string in base 10, you should divide by 10 to get the digits and add '0' to convert each digit into the corresponding character in the character set. This is tricky in the general case, but a non-negative number below 10 needs only a single digit, and for a number num between 10 and 99, the first character is obtained from the quotient num / 10 and the second from the remainder num % 10. Given that the number to convert is in the range 9 to 98, you can use a test to determine which method to use.

  • the abbreviation should be constructed in the abb array, and a null terminator set at the end of the string.

Here is a modified version:

#include <stdio.h>

size_t my_strlen(const char *s) {
    size_t len = 0;
    while (s[len] != '\0')
        len++;
    return len;
}

int main(void)
{
    char str[102], abb[5];
    size_t len;

    if (!fgets(str, sizeof str, stdin)) {
        printf("missing input\n");
        return 1;
    }
    len = my_strlen(str);
    /* strip the trailing newline if present */
    if (len > 0 && str[len - 1] == '\n')
        str[--len] = '\0';
   
    if (len > 10) {
        size_t num = len - 2;
        size_t i = 0;
        abb[i++] = str[0];
        if (num < 10) {
            abb[i++] = '0' + num;
        } else {
            abb[i++] = '0' + num / 10;
            abb[i++] = '0' + num % 10;
        }
        abb[i++] = str[len - 1];
        abb[i] = '\0';
        puts(abb);
    } else {
        puts(str);
    }
    return 0;
}
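
For the general case that the bullet on number conversion calls tricky, one possible sketch is to peel off digits with % 10 into a temporary buffer (they come out least significant first) and then copy them out in reverse. The helper name my_utoa and the temporary buffer size are assumptions made for illustration; they are not part of the assignment or of the program above.

```c
#include <stddef.h>

/* Hypothetical helper (not from the assignment): writes the decimal
 * representation of num into dst, appends a null terminator, and
 * returns the number of digits written. dst must have room for the
 * digits plus the terminator. */
size_t my_utoa(size_t num, char *dst)
{
    char tmp[24];   /* assumption: enough for a 64-bit value (at most 20 digits) */
    size_t n = 0, i = 0;

    /* Collect digits least significant first; do/while also handles num == 0. */
    do {
        tmp[n++] = (char)('0' + num % 10);
        num /= 10;
    } while (num != 0);

    /* Copy them back in reverse so the most significant digit comes first. */
    while (n > 0)
        dst[i++] = tmp[--n];
    dst[i] = '\0';
    return i;
}
```

With such a helper, the two-branch test on num could be replaced by something like i += my_utoa(num, abb + i), provided abb is declared large enough for the digits.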

Upvotes: 3

Vlad from Moscow

Reputation: 311088

Your code does not make much sense. Moreover, it can invoke undefined behavior.

For example, the variable len is never initialized, so it has an indeterminate value when the input string is empty.

The function gets is unsafe and is no longer part of the C Standard.

Instead you could use the standard function scanf, for example:

char str[101];
//       ^^^
scanf( "%100[^\n]", str );

Also, the program does not even try to output an abbreviated string.

It seems you need to change the string in place. It is reasonable to write a separate function that processes the string passed to it.

Here is a demonstration program that shows how the function can be written without calling any standard C string function.

#include <stdio.h>

char * abbrevated( char *s )
{
    const size_t Base = 10;
    const size_t MIN = 10;

    size_t n = 0;

    while (s[n]) ++n;

    if (MIN < n)
    {
        size_t letter_count = n - 2;
        size_t width = 0;

        for (size_t tmp = letter_count; tmp != 0; tmp /= Base)
        {
            ++width;
        }

        s[width + 1] = s[n - 1];
        s[width + 2] = '\0';
            
        for ( ; width != 0; --width )
        {
            s[width] = letter_count % Base + '0';
            letter_count /= Base;
        }
    }

    return s;
}

int main( void )
{
    char s1[] = "structured programming language";
    char s2[] = "computer science and engineering";
    char s3[] = "a question";

    puts( s1 );
    puts( abbrevated( s1 ) );
    putchar( '\n' );

    puts( s2 );
    puts( abbrevated( s2 ) );
    putchar( '\n' );

    puts( s3 );
    puts( abbrevated( s3 ) );
    putchar( '\n' );
}

The program output is

structured programming language
s29e

computer science and engineering
c30g

a question
a question

Upvotes: 1

seb

Reputation: 147

First, as pointed out in the comment by @Some programmer dude, the function gets() should not be used. The reason is that gets() receives no information about the size of the buffer it writes to; if the input is longer than the buffer, it writes past the end of the array, which creates the risk of a buffer overflow attack. What fgets() does better is ask you for a bound on the number of characters it may store; in your case that would be 100.
To print things to the console in C, as in many other languages, a very handy tool is formatted output. You tell the computer what you want to print and in which format, instead of just handing it some memory to copy to the output.
In your case, you know in advance that the output will be a char, followed by an int, and then a char again.
The first char is, as you said, the first char of the input string: ws = str[0]
You also know that the last char of the input string is located len - 1 chars after the first one: we = str[len-1]. Finally, you know that the length of the part in between is len - 2.
To print it correctly, as you can find in the printf documentation, you can use

printf("%c%d%c", ws, len - 2, we);

Here %c prints a single char and %d prints an integer in base 10 (d is for decimal).
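
Putting these pieces together, a minimal sketch might look like the following. The function name print_abbreviation is hypothetical, the length is counted with a manual loop because the assignment forbids the string functions, and %zu is used instead of %d only because this sketch keeps the count in a size_t rather than an int.

```c
#include <stdio.h>

/* Hypothetical helper: prints s, or its abbreviation when s is longer
 * than 10 characters, followed by a newline. Returns what printf
 * returns (the number of characters printed, or a negative value on
 * error). */
int print_abbreviation(const char *s)
{
    size_t len = 0;

    /* Count the characters by hand instead of calling strlen(). */
    while (s[len] != '\0')
        len++;

    if (len > 10)
        return printf("%c%zu%c\n", s[0], len - 2, s[len - 1]);
    return printf("%s\n", s);
}
```

Calling print_abbreviation(str) after reading the input and stripping the trailing newline would cover both branches of the assignment without building a second string.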

Upvotes: 1
