Beevik

Reputation: 570

substrings and the Go garbage collector

When taking a substring of a string in Go, no new memory is allocated. Instead, the substring's underlying representation contains a Data pointer that points into the original string's data, offset by the substring's starting index.

This means that if I have a large string and wish to keep track of a small substring, the garbage collector will be unable to free any of the large string until I release all references to the shorter substring.

Slices have a similar problem, but you can get around it by making a copy of the subslice using copy(). I am unaware of any similar copy operation for strings. What is the idiomatic and fastest way to make a "copy" of a substring?
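For reference, the slice workaround mentioned above might look like the sketch below. cloneBytes is a hypothetical name chosen for this example.

```go
package main

import "fmt"

// cloneBytes returns a copy of b backed by its own array, so holding the
// copy does not keep b's (possibly much larger) backing array reachable.
func cloneBytes(b []byte) []byte {
	out := make([]byte, len(b))
	copy(out, b)
	return out
}

func main() {
	big := make([]byte, 1<<20)
	small := cloneBytes(big[5:10])
	small[0] = 'x' // writes to the copy only; big is untouched
	fmt.Println(len(small), cap(small), big[5]) // prints: 5 5 0
}
```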

Upvotes: 7

Views: 1125

Answers (3)

Falco

Reputation: 3416

Since Go 1.18 (March 2022)

There is a function to create a copy of a string, which will copy the underlying data to a new location:

strings.Clone(s)

// Use this if you want to extract only a few substrings of a big string
subString := strings.Clone(bigString[5:10])

Upvotes: 1

user2437417

Reputation:

I know this is an old question, but there are a couple ways you can do this without creating two copies of the data you want.

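First is to create the []byte of the substring, then simply reinterpret it as a string using unsafe.Pointer. This works because the header for a []byte has the same layout as that for a string (a data pointer followed by a length), except that the []byte has an extra Cap field at the end, which the string view simply ignores. The only copy made is the one from the substring into the []byte. The resulting string shares that slice's memory, so the slice must not be modified afterwards.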

package main

import (
    "fmt"
    "unsafe"
)

func main() {
    str := "foobar"
    byt := []byte(str[3:])
    sub := *(*string)(unsafe.Pointer(&byt))
    fmt.Println(str, sub)
}

The second way is to use reflect.StringHeader and reflect.SliceHeader to do a more explicit header transfer. Note that both types are deprecated as of Go 1.20 in favor of unsafe.String and unsafe.SliceData.

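package main

import (
    "fmt"
    "reflect"
    "unsafe"
)

func main() {
    str := "foobar"
    byt := []byte(str[3:]) // the only copy of the substring's bytes

    // Write into the header of a real string variable rather than building
    // a standalone reflect.StringHeader: a uintptr Data field in a plain
    // struct does not keep byt's backing array alive for the garbage collector.
    var sub string
    subHdr := (*reflect.StringHeader)(unsafe.Pointer(&sub))
    subHdr.Data = (*reflect.SliceHeader)(unsafe.Pointer(&byt)).Data
    subHdr.Len = len(byt)

    fmt.Println(str, sub)
}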

Upvotes: 0

peterSO

Reputation: 166569

For example,

package main

import (
    "fmt"
    "unsafe"
)

type String struct {
    str *byte
    len int
}

func main() {
    str := "abc"
    substr := string([]byte(str[1:]))
    fmt.Println(str, substr)
    fmt.Println(*(*String)(unsafe.Pointer(&str)), *(*String)(unsafe.Pointer(&substr)))
}

Output:

abc bc
{0x4c0640 3} {0xc21000c940 2}

Upvotes: 1
