Gordian

Reputation: 31

Convert string to boolean array

I need to convert a string of about a million '0' or '1' characters (1039680 characters, to be specific) to a boolean array. My current approach takes a few seconds for a 300000-character string, which is too slow. I need to convert the whole million-character string in less than a second.

To test this, I read a file containing a single line of (in this trial case) 300000 zeros.

I know my code will act funky for strings that contain stuff other than zeros or ones, but I know that the string will only contain those.

I also looked at atoi, but I don't think it would suit my needs.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>

#define BUFFERSIZE 1039680

int main ()
{
    int i ;
    char buffer[BUFFERSIZE];
    bool boolList[BUFFERSIZE] ;

    // READ FILE WITH A LOT OF ZEROS
    FILE *fptr;
    if ((fptr=fopen("300000zeros.txt","r"))==NULL){
        printf("Error! opening file\n");
        exit(1);
    }
    fscanf(fptr,"%[^\n]",buffer);
    fclose(fptr);

    // CONVERT STRING TO BOOLEAN ARRAY
    for (i=0 ; i<strlen(buffer) ; i++) {
        if (buffer[i] == '1') boolList[i] = 1 ;
    }

    return 0;
}

Upvotes: 3

Views: 1645

Answers (2)

phuclv

Reputation: 41805

If the string length is always 1039680 characters, as you said, why call strlen(buffer) in your code? Why not just loop BUFFERSIZE times? And if the length can vary, cache it in a variable, as others said, instead of calling strlen again on every iteration of the loop.
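As a rough sketch of the cached-length idea, the conversion could look like the function below. The name string_to_bools is hypothetical and not from the question; the sketch also writes every element, so the zeros are stored explicitly instead of being left uninitialized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: converts a string of '0'/'1' characters into bools.
 * dst must have room for at least strlen(src) elements.
 * Returns the number of elements written. */
size_t string_to_bools(const char *src, bool *dst)
{
    assert(src != NULL && dst != NULL);

    const size_t len = strlen(src);   /* measured once, not in the loop condition */
    for (size_t i = 0; i < len; i++) {
        dst[i] = (src[i] == '1');     /* stores false for '0' and true for '1' */
    }
    return len;
}
```

In the question's program this would be called as string_to_bools(buffer, boolList), assuming buffer holds a properly terminated string.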

More importantly, you haven't left room for the null terminator in the buffer. When the line is exactly BUFFERSIZE characters long, fscanf stores the terminator one byte past the end of the array, so the read itself is a buffer overflow and the later strlen call has no valid string to measure. Both are undefined behavior.

If you want to read the file as text, you must add one more char to the buffer:

char buffer[BUFFERSIZE + 1];
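With the extra byte in place, a bounded read keeps the input from running past the array. The sketch below swaps fscanf for fgets, which is my substitution and not something the question uses; read_bit_line is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line of at most cap - 1 characters into buf
 * and strips a trailing newline if one was read.
 * Returns the resulting string length, or 0 if nothing could be read. */
size_t read_bit_line(FILE *fp, char *buf, int cap)
{
    assert(fp != NULL && buf != NULL && cap > 0);

    if (fgets(buf, cap, fp) == NULL) {
        buf[0] = '\0';                 /* leave a valid empty string behind */
        return 0;
    }
    size_t len = strcspn(buf, "\n");   /* index of the newline, or of the terminator */
    buf[len] = '\0';
    return len;
}
```

With char buffer[BUFFERSIZE + 1], the call would be read_bit_line(fptr, buffer, (int)sizeof buffer), and the returned length can be reused instead of calling strlen.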

Alternatively, open the file in binary mode and read the whole 1039680-byte block at once. That should be much faster:

fread(buffer, sizeof(buffer[0]), BUFFERSIZE, fptr);

Then loop over the BUFFERSIZE bytes and convert each character to 0 or 1 without a branch:

for (i = 0 ; i < BUFFERSIZE; i++)
{
    buffer[i] -= '0';
}

You don't need a separate boolList. Just use buffer as the boolean array, or rename it to boolList and drop the second array.
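Put together, the binary read and the in-place conversion might look like this minimal sketch. The function name load_bits and its parameters are hypothetical, and it assumes the first cap bytes of the file are all '0' or '1' with no newline among them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: reads up to cap bytes from path in binary mode and
 * turns each '0'/'1' character into the value 0/1 in place.
 * Assumes the bytes read are only '0' and '1'.
 * Returns the number of bytes read, or 0 if the file cannot be opened. */
size_t load_bits(const char *path, unsigned char *buf, size_t cap)
{
    assert(path != NULL && buf != NULL);

    FILE *fp = fopen(path, "rb");
    if (fp == NULL) {
        return 0;
    }
    size_t n = fread(buf, sizeof buf[0], cap, fp);   /* one block read */
    fclose(fp);

    for (size_t i = 0; i < n; i++) {
        buf[i] -= '0';                               /* '0' -> 0, '1' -> 1 */
    }
    return n;
}
```

Checking the return value against BUFFERSIZE tells the caller whether the whole block arrived, which the bare fread call above does not verify.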

Upvotes: 2

Darien Pardinas

Reputation: 6186

Try walking both arrays with pointers, which stops at the null terminator and never calls strlen:

char *sptr = buffer;
bool *bptr = boolList;
while (*sptr != '\0')
    *bptr++ = *sptr++ == '1'? 1:0;

Upvotes: 4
