Reputation: 981
I want to create a function in C that gets a substring from a string. This is what I have so far:
char* substr(char* src, int start, int len){
char* sub = malloc(sizeof(char)*(len+1));
memcpy(sub, &src[start], len);
sub[len] = '\0';
return sub;
}
int main(){
char* test = malloc(sizeof(char)*5); // the reason I don't use char* = "test"; is because I wouldn't be able to use free() on it then
strcpy(test, "test");
char* sub = substr(test, 1, 2); // save the substr in a new char*
free(test); // just wanted the substr from test
printf("%s\n", sub); // prints "es"
// ... free when done with sub
free(sub);
}
Is there any way I can save the substring into test without having to create a new char*? If I do test = substr(test, 1, 2), the old value of test no longer has a pointer pointing to it, so it's leaked memory (I think. I'm a noob when it comes to C languages.)
Upvotes: 3
Views: 14165
Reputation: 154218
In C, string functions quickly run into memory management. The space for the sub-string has to exist somewhere: either the caller provides it and passes it to the function, or the function allocates it.
const char source[] = "Test";
size_t start, length;
char sub1[sizeof source];
substring1(source, sub1, start, length);
// or
char *sub2 = substring2(source, start, length);
...
free(sub2);
Code needs to specify what happens when 1) the start index is greater than the original string's length and 2) the length similarly exceeds the original string. These are 2 important steps not done in OP's code.
void substring1(const char *source, char *dest, size_t start, size_t length) {
size_t source_len = strlen(source);
if (start > source_len) start = source_len;
if (length > source_len - start) length = source_len - start; /* avoids overflow in start + length */
memmove(dest, &source[start], length);
dest[length] = 0;
}
char *substring2(const char *source, size_t start, size_t length) {
size_t source_len = strlen(source);
if (start > source_len) start = source_len;
if (length > source_len - start) length = source_len - start; /* avoids overflow in start + length */
char *dest = malloc(length + 1);
if (dest == NULL) {
return NULL;
}
memcpy(dest, &source[start], length);
dest[length] = 0;
return dest;
}
By using memmove() vs. memcpy() in substring1(), code could use the same destination buffer as the source buffer. memmove() is well defined, even if buffers overlap.
substring1(source, source, start, length);
Upvotes: 0
Reputation: 81996
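Let's break down what is being talked about:
1. You allocate some memory and set test to point to it.
2. You allocate some more memory (the substring) and want test to point to it as well.
You have 2 pieces of information that you claim you'd like to store in the same pointer. You can't do this!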
Use two variables. I don't know why this isn't acceptable...
char *input = "hello";
char *output = substr(input, 2, 3);
Have your input parameter not be heap memory. There are a number of ways we could do this:
// Use a string literal
char *test = substr("test", 2, 2);
// Use a stack allocated string
char s[] = "test";
char *test = substr(s, 2, 2);
If you're already passing the length of the substring to the function, I'd personally rather see that function just get passed the piece of memory that it will push the data into. Something like:
char *substr(char *dst, char *src, size_t offset, size_t length) {
memcpy(dst, src + offset, length);
dst[length] = '\0';
return dst;
}
int main() {
char s[5] = "test";
char d[3] = "";
substr(d, s, 2, 2);
}
Upvotes: 0
Reputation: 84642
There are a number of ways to do this, and the way you approached it is a good one, but there are several areas where you seemed a bit confused. First, there is no need to allocate test. Simply using a pointer is fine. You could simply do char *test = "test"; in your example. No need to free it then either.
Next, when you are beginning to allocate memory dynamically, you need to always check the return to make sure your allocation succeeded. Otherwise, you can easily segfault if you attempt to write to a memory location when there has been no memory allocated.
In your substr, you should also validate the range of start and len you send to the function to ensure you are not attempting to read past the end of the string.
When dealing with only positive numbers, it is better to use type size_t or unsigned. There will never be a negative start or len in your code, so size_t fits the purpose nicely.
Lastly, it is good practice to make sure a pointer you are about to free is either NULL or a valid address, and to set it to NULL once freed, to prevent freeing a block of memory twice, etc. (e.g. if (sub) free (sub);)
Take a look at the following and let me know if you have questions. I changed the code to accept command line arguments for string, start and len, so the usage is:
./progname the_string_to_get_sub_from start len
I hope the following helps.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char* substr (char* src, size_t start, size_t len)
{
/* validate indexes */
if (start + len > strlen (src)) {
fprintf (stderr, "%s() error: invalid substring index (start+len > length).\n", __func__);
return NULL;
}
char* sub = calloc (1, len + 1);
/* validate allocation */
if (!sub) {
fprintf (stderr, "%s() error: memory allocation failed.\n", __func__);
return NULL;
}
memcpy (sub, src + start, len);
// sub[len] = '\0'; /* by using calloc, sub is filled with 0 (null) */
return sub;
}
int main (int argc, char **argv) {
if (argc < 4 ) {
fprintf (stderr, "error: insufficient input, usage: %s string ss_start ss_length\n", argv[0]);
return 1;
}
char* test = argv[1]; /* no need to allocate test, a pointer is fine */
size_t ss_start = (size_t)atoi (argv[2]); /* convert start & length from */
size_t ss_length = (size_t)atoi (argv[3]); /* the command line arguments */
char* sub = substr (test, ss_start, ss_length);
if (sub) /* validate sub before use */
printf("\n sub: %s\n\n", sub);
if (sub) /* validate sub before free */
free(sub);
return 0;
}
Output
$ ./bin/str_substr test 1 2
sub: es
If you choose an invalid start / len combination:
$ ./bin/str_substr test 1 4
substr() error: invalid substring index (start+len > length).
Verify All Memory Freed
$ valgrind ./bin/str_substr test 1 2
==13515== Memcheck, a memory error detector
==13515== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==13515== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==13515== Command: ./bin/str_substr test 1 2
==13515==
sub: es
==13515==
==13515== HEAP SUMMARY:
==13515== in use at exit: 0 bytes in 0 blocks
==13515== total heap usage: 1 allocs, 1 frees, 4 bytes allocated
==13515==
==13515== All heap blocks were freed -- no leaks are possible
==13515==
==13515== For counts of detected and suppressed errors, rerun with: -v
==13515== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
Upvotes: 0
Reputation: 1551
Well, you could always keep the address of the malloc'd memory in a separate pointer:
char* test = malloc(~~~);
char* toFree = test;
test = substr(test,1,2);
free(toFree);
But most of the features and capabilities for shuffling this sort of data around have already been implemented in string.h. One of those functions probably does the job you want done. memmove(), as others have pointed out, could move the substring to the start of your char pointer, voilà!
If you specifically want to make a new dynamic string to play with while keeping the original separate and safe, and also want to be able to overlap these pointers.... that's tricky. You could probably do it if you passed in the source and destination and then range-checked the affected memory, and free'd the source if there was overlap... but that seems a little over-complicated.
I'm also loath to malloc memory that I trust higher levels to free, but that's probably just me.
As an aside,
char* test = "test";
is one of those niche cases in C. When you initialize a pointer to a string literal (stuff in quotes), the data is typically placed in a special, often read-only, section of memory. Modifying it is undefined behavior (it may appear to work on some platforms, but you shouldn't do it), and it can't grow.
Upvotes: 0
Reputation: 11
void substr(char* str, char* sub , int start, int len){
memcpy(sub, &str[start], len);
sub[len] = '\0';
}
int main(void)
{
char *test = (char*)malloc(sizeof(char)*5);
char *sub = (char*)malloc(sizeof(char)*3);
strcpy(test, "test");
substr(test, sub, 1, 2);
printf("%s\n", sub); // prints "es"
free(test);
free(sub);
return 0;
}
Upvotes: 1