fakis
fakis

Reputation: 11

Encryption with substitution cipher does not generate valid ASCII output

I cant figure out why it says "output not valid ASCII text"!

In order for you to understand the context of the problem i post what is described to do. IT STARTS HERE!

In a substitution cipher, we “encrypt” (i.e., conceal in a reversible way) a message by replacing every letter with another letter. To do so, we use a key: in this case, a mapping of each of the letters of the alphabet to the letter it should correspond to when we encrypt it. To “decrypt” the message, the receiver of the message would need to know the key, so that they can reverse the process: translating the encrypt text (generally called ciphertext) back into the original message (generally called plaintext).

A key, for example, might be the string NQXPOMAFTRHLZGECYJIUWSKDVB. This 26-character key means that A (the first letter of the alphabet) should be converted into N (the first character of the key), B (the second letter of the alphabet) should be converted into Q (the second character of the key), and so forth.

A message like HELLO, then, would be encrypted as FOLLE, replacing each of the letters according to the mapping determined by the key.

Let’s write a program called substitution that enables you to encrypt messages using a substitution cipher. At the time the user executes the program, they should decide, by providing a command-line argument, on what the key should be in the secret message they’ll provide at runtime.

Here are a few examples of how the program might work. For example, if the user inputs a key of YTNSHKVEFXRBAUQZCLWDMIPGJO and a plaintext of HELLO:

$ ./substitution YTNSHKVEFXRBAUQZCLWDMIPGJO
plaintext:  HELLO
ciphertext: EHBBQ

Here’s how the program might work if the user provides a key of VCHPRZGJNTLSKFBDQWAXEUYMOI and a plaintext of hello, world:

$ ./substitution VCHPRZGJNTLSKFBDQWAXEUYMOI
plaintext:  hello, world
ciphertext: jrssb, ybwsp

Notice that neither the comma nor the space were substituted by the cipher. Only substitute alphabetical characters! Notice, too, that the case of the original message has been preserved. Lowercase letters remain lowercase, and uppercase letters remain uppercase.

Whether the characters in the key itself are uppercase or lowercase doesn’t matter. A key of VCHPRZGJNTLSKFBDQWAXEUYMOI is functionally identical to a key of vchprzgjntlskfbdqwaxeuymoi (as is, for that matter, VcHpRzGjNtLsKfBdQwAxEuYmOi).

#include <cs50.h>
#include <stdio.h>
#include <string.h>

int main(int argc, string argv[])
{
    if (argc != 2)
    {
        printf("Usage: ./substitution KEY");
        return 1;               
    }
    else if (argc == 2)
    {
        string text = argv[1];
        string storing = text;
        int counter = 0;
        int i = 0;
        bool number = true;
        bool flag = true;
        while (flag == true && i < 26)
        {
            if ((int)text[i] >= 48 && (int)text[i] <= 57)
            {
                 number = false;
            }
            if (((int)text[i] >= 65 && (int)text[i] <= 90) || ((int)text[i] >= 97 && (int)text[i] <= 122))
            {
                counter++;
                for ( int j = 0; j < counter - 1; j++)
                {
                    if ((int)storing[j] == (int)text[i] || (int)storing[j] + 32 == (int)text[i])
                    {
                        flag = false;
                    }
                }
            }
            i++;
        }
        if (number == false)
        {
            printf("Key must only contain alphabetic characters.");
            return 1;
        }
        if (flag == false)
        {
            printf("Key must not contain repeated characters.");
            return 1;
        }
        if (counter < 26)
        {
            printf("Key must contain 26 characters.");
            return 1;
        }
    }
    string plaintext = get_string("plaintext:");
    string key = argv[1];
    int counter;
    bool not_letter;
    bool capital1;
    bool capital;
    int crypto[strlen(plaintext)];
    for (int i = 0; i < strlen(plaintext); i++)
    {
        capital1 = false;
        capital = false;
        not_letter = false;
        if ((int)plaintext[i] >=65 && (int)plaintext[i] <= 90)
        {           
            counter =  (int)plaintext[i] - 65;            
            capital = true;
        }
        else if ((int)plaintext[i] >=97 && (int)plaintext[i] <= 122)
        {
            counter =  (int)plaintext[i] - 97;                          
            capital1 = true;
        }
        else
        {
            not_letter = true;
        }
        if (not_letter == true)
        {
            crypto[i] = (int)plaintext[i];
        }
        else if (capital == true)
        {
            if ((int)key[i] >=65 && (int)key[i] <= 90)
            {
                crypto[i] = (int)key[counter];
            }
            else if ((int)key[i] >=97 && (int)key[i] <= 122)
            {
                crypto[i] = (int)key[counter] - 32;
            }
        }
        else if (capital1 == true)
        {
            if ((int)key[i] >=65 && (int)key[i] <= 90)
            {
                crypto[i] = (int)key[counter] + 32;
            }
            else if ((int)key[i] >=97 && (int)key[i] <= 122)
            {
                crypto[i] = (int)key[counter];
            }  
        }
    }
    printf("ciphertext: ");
    for (int i = 0; i < strlen(plaintext); i++)
    {
        printf("%c", (char)crypto[i]);
    }
    printf("\n");
    return 0;
}

When i test the programm with check50 cs50/problems/2020/x/substitution of CS50 it says:

:) substitution.c exists
:) substitution.c compiles
:) encrypts "A" as "Z" using ZYXWVUTSRQPONMLKJIHGFEDCBA as key
:) encrypts "a" as "z" using ZYXWVUTSRQPONMLKJIHGFEDCBA as key
:) encrypts "ABC" as "NJQ" using NJQSUYBRXMOPFTHZVAWCGILKED as key
:) encrypts "XyZ" as "KeD" using NJQSUYBRXMOPFTHZVAWCGILKED as key
:) encrypts "This is CS50" as "Cbah ah KH50" using YUKFRNLBAVMWZTEOGXHCIPJSQD as key
:) encrypts "This is CS50" as "Cbah ah KH50" using yukfrnlbavmwzteogxhcipjsqd as key
! :( encrypts "This is CS50" as "Cbah ah KH50" using YUKFRNLBAVMWZteogxhcipjsqd as key
    output not valid ASCII text
! :( encrypts all alphabetic characters using DWUSXNPQKEGCZFJBTLYROHIAVM as key
    output not valid ASCII text
:) handles lack of key
:) handles invalid key length
:) handles invalid characters in key
:) handles duplicate characters in key
:) handles multiple duplicate characters in key

Upvotes: 0

Views: 2075

Answers (1)

Maarten Bodewes
Maarten Bodewes

Reputation: 94058

Let's assume you've converted your key to uppercase.

Then let's simplify the heck out of your loop:

int keyIndex;
bool lowercase;
for (int i = 0; i < strlen(plaintext); i++) {
    if (plaintext[i] >= 'A' && plaintext[i] <= 'Z') {
        keyIndex = plaintext[i] - 'A';
        lowercase = false;
    } else if (plaintext[i] >= 'a' && plaintext[i] <= 'z') {
        keyIndex = plaintext[i] - 'a';
        lowercase = true;
    } else {
        // do not encrypt that character
        ciphertext[i] = plaintext[i];
        // and skip the rest of the loop
        continue;
    }

    ciphertext[i] = key[keyIndex];

    // revert back to lowercase, if necessary
    if (lowercase) {
        ciphertext[i] += 'a' - 'A';
    }
}

The problem you've run into is that you didn't split up your big problem into small problems. So if you e.g. create an all uppercase key in advance, then you don't need to take it into account later on.

It's also very important to keep symmetry in your application. So if you assign a value in one part of the if then also do it in the else. And if you're able to stop early (such as when a character needs to be kept), then please do so.

It would of course be equally valid to convert the key to all lowercase and then convert characters back to uppercase where necessary.

Never ever use variable names such as counter or uppercase1. It makes your code very hard to read; variable names should be as clear as possible.

The ciphertext array is just a string / character array in the code above. If you're using C rather than C++ then you'd normally use a char* or char[], I presume.

Of course, this is still without using any separation between functions. It also doesn't use any platform functionality. If you manage to do that then debugging / maintaining your application will become even easier.

Upvotes: 0

Related Questions