Louis

Reputation: 43

Smalltalk / Squeak string shallow equality

The following code prints "false":

a := 'aaa'.
b := a deepCopy.
Transcript show: (a == b).

I expect this behavior. My explanation is that deepCopy returns a new object "b" that is completely distinct from "a", and since the "==" operator compares by reference, the result is "false". Is that correct?

However, I do not understand why the following code produces "true":

a := 'aaa'.
b := 'aaa'.
Transcript show: (a == b).

Here we assigned two separate literals to two different variables, "a" and "b", and there shouldn't be any relation between them except that they contain the same value. But if the "==" operator compares by reference and not by value, why is the result of this comparison "true"?

Upvotes: 4

Views: 244

Answers (3)

Sean DeNigris

Reputation: 6390

The misconception is the same in both cases: the question is not "what happens?" but "what is guaranteed?". The key is that there is no guarantee that 'aaa' == 'aaa', but the compiler and VM are free to do things that way. The same seems true for copying; since strings are immutable, I guess there's nothing to say that copying a string couldn't return the same object!

In your first example, as usual, the best teacher is the image. #deepCopy delegates to #shallowCopy, which at some point evaluates class basicNew: index, and copies the characters into the new object. So, this particular implementation will always create a new object.
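As a minimal workspace sketch of the practical consequence (assuming a current Squeak image, evaluated as a single DoIt): compare strings with #=, which compares contents, and use #== only when you really mean "the very same object". The printString sends are only there to turn the Boolean results into text for the Transcript.

```smalltalk
| a b |
a := 'aaa'.
b := a copy.
"Equality: both strings hold the same characters, so this is true."
Transcript show: (a = b) printString; cr.
"Identity: depends on whether #copy returned a new object.
 With the implementation described above it does, so this is false,
 but nothing in the language promises that."
Transcript show: (a == b) printString; cr.
```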

Upvotes: 4

walid

Reputation: 21

This is what I know from one of the free Smalltalk books scattered online, but I can't find the reference:

As you would expect the instance of a class is a unique object in memory. deepCopy intentionally creates an object first and then stores a copy of the existing instance in it.

However numbers, characters and strings are treated as primitive data types by Smalltalk. When literal data, also referred to as literals, are assigned to variables they are first checked against a local scope dictionary which is invisible to the user and holds literals to check if they have been already added to it. If they haven't they will be added to the dictionary and the variable will point to the dictionary field. If identical literal data has been assigned before, the new variable will only point to the local scope dictionary field that contains the identical literal. This means that two or more variables assigned identical literals are pointing to the same dictionary field and therefore are identical objects. This is why the second comparison in your question is returning true.

Upvotes: 0

Tobias

Reputation: 3110

In addition to what Sean DeNigris said, the reason the comparison is true in the second case is that when you execute all three statements together, the compiler tries to be smart: it creates the object for 'aaa' only once and shares it between a and b.

The same happens if you put this into one method *:

Object subclass: #MyClassA
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'MyApp'


!MyClassA methodsFor: 'testing' stamp: nil prior: nil!
testStrings

    | a b |
    a := 'aaa'.
    b := 'aaa'.
    ^ a == b
! !
MyClassA new testStrings " ==> true"

But this does not happen if they are in different methods:

Object subclass: #MyClassB
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'MyApp'


!MyClassB methodsFor: 'testing' stamp: nil prior: nil!
a

    | a |
    a := 'aaa'.
    ^ a
! !
!MyClassB methodsFor: 'testing' stamp: nil prior: nil!
b

    | b |
    b := 'aaa'.
    ^ b
! !
!MyClassB methodsFor: 'testing' stamp: nil prior: nil!
testStrings

    ^ self a == self b
! !
MyClassB new testStrings " ==> false"

That is because in Squeak, literal objects like strings are stored in the method object of the method they are defined in.
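If you want to look at this in the image, here is a rough sketch, assuming MyClassB is compiled as defined above. Each compiled method answers its own literal frame via #literals, so the two methods can be inspected separately. Note that the frame usually holds some extra bookkeeping entries besides the string itself.

```smalltalk
"Each CompiledMethod carries its own literals."
(MyClassB >> #a) literals inspect.
(MyClassB >> #b) literals inspect.

"Both frames contain a string equal to 'aaa' (includes: compares with #=),
 but each method holds its own copy of it."
(MyClassB >> #a) literals includes: 'aaa'.
(MyClassB >> #b) literals includes: 'aaa'.
```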

*: Technically, every DoIt or PrintIt (that is, code you execute directly by keystroke) gets compiled to a single method in Squeak.

Upvotes: 2
