Mohib

Reputation: 43

Python: Why do I have to reassign to the variable?

I am cleaning data. In the code below, I use the str.title method to capitalize words. Then I check whether the value is empty, and if so I write a placeholder. But then I have to assign the value back to the row, and I am a bit confused about why.

for row in moma:
    gender = row[5]

    #capitalize gender column
    gender = gender.title()

    #check to see if column is empty
    if not gender:
        gender = 'Gender Unknown/Other'
    row[5] = gender 

for row in moma:
    Nationality = row[2]

    Nationality = Nationality.title()

    if not Nationality:
        Nationality = 'Nationality Unknown'
    row[2] = Nationality

Example data:

['Duplicate of plate from folio 11 verso (supplementary suite, plate 4) from ARDICIA', 'Pablo Palazuelo', 'Spanish', '(1916)', '(2007)', 'Male', '1978', 'Prints & Illustrated Books']
['Tailpiece (page 55) from SAGESSE', 'Maurice Denis', 'French', '(1870)', '(1943)', 'Male', '1889-1911', 'Prints & Illustrated Books']

Upvotes: 0

Views: 97

Answers (3)

Raphael

Reputation: 1801

Python has mutable objects, like lists and dicts, and immutable objects, like strings and ints. Mutable objects are always assigned by reference, which means that changes made through the copy affect the original value. Immutable objects, on the other hand, are deep-copied when assigned to another variable, so changes only affect the copied version.

EDIT: I was wrong about the copying. Python never copies on assignment, whether the object is mutable or immutable. From the documentation of the copy module:

Assignment statements in Python do not copy objects, they create bindings between a target and an object. https://docs.python.org/3/library/copy.html
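
A minimal sketch of what "binding, not copying" means in practice. The sample row is shortened from the question's data, and the names alias and clone are made up for illustration:

```python
import copy

row = ['Tailpiece (page 55) from SAGESSE', 'Maurice Denis', 'french']
alias = row              # a second name for the same list, not a copy
clone = copy.copy(row)   # an explicit shallow copy: a new list object

alias[2] = 'French'      # mutates the list that both `row` and `alias` name

print(alias is row)      # True
print(row[2])            # French
print(clone[2])          # french
```

Only the explicit copy.copy call produced a second list; the plain assignment did not.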

Upvotes: 0

sleblanc

Reputation: 3921

There are two parts to your question:

I am cleaning data. In the code below, I use the str.title method to capitalize words. Then I check whether the value is empty, and if so I write a placeholder. But then I have to assign the value back to the row, and I am a bit confused about why.

Why do I have to reassign to the variable?

In your code, you write the following:

gender = gender.title()

Also the following:

if not gender:
    gender = 'Gender Unknown/Other'

The reason behind the pattern a = f(a) is that strings in Python are immutable: you cannot modify them in place. A method such as str.title() returns a new string and leaves the original untouched. When you write gender = f(gender), you bind the name "gender" to the result of f(gender), replacing whatever that name referred to before.
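
As a rough illustration of that immutability (the names unchanged and error exist only for this example):

```python
gender = 'male'

gender.title()            # builds 'Male', but the result is thrown away
unchanged = gender        # the name still refers to 'male'

gender = gender.title()   # rebind the name to the new string

# Strings do not support in-place modification at all.
try:
    gender[0] = 'm'
except TypeError as exc:
    error = str(exc)

print(unchanged, gender)
print(error)
```

Calling the method without assigning its result has no visible effect, which is why the assignment is needed.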

By the way, in Python, assignment binds a name to an object. The language reference describes this as "naming and binding" rather than as variables that hold values, and thinking in terms of names makes this behavior easier to follow.

And so, later on in the code, you write row[5] = gender. You need this for the same reason: the string stored in row[5] cannot be changed in place, and rebinding the local name gender does not touch the list. The only way to get the new string into the row is to assign it to row[5].

Now, if the element were a mutable object rather than a string, you could change it in place, for instance with something like row[5].content = 'blah'.
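
To make the difference concrete, here is a small sketch. The Cell class is hypothetical and only stands in for "some mutable object"; the row is based on the question's sample data with the values lowercased:

```python
row = ['Tailpiece (page 55) from SAGESSE', 'Maurice Denis', 'french',
       '(1870)', '(1943)', 'male']

gender = row[5]
gender = gender.title()   # rebinds the local name only
before = row[5]           # still 'male': the list was not touched

row[5] = gender           # store the new string in the list
after = row[5]            # now 'Male'


class Cell:
    """Hypothetical mutable wrapper around a string."""

    def __init__(self, content):
        self.content = content


cells = [Cell('male')]
cell = cells[0]
cell.content = cell.content.title()   # mutates the object the list holds

print(before, after, cells[0].content)
```

With the mutable Cell, no assignment back into the list is needed, because the list and the local name refer to the same object and that object itself was changed.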

Upvotes: 0

Mark Tolonen

Reputation: 177471

You must reassign to the row because gender.title() produces a separate string object, while row[5] still refers to the original one. The row only changes once you store the new object in it.

The code you've provided looks like it works, but could be simplified. There is no need to iterate over rows twice, for example:

moma = [['w','x','french','y','z',''],
        ['w','x','','y','z','male']]

for row in moma:
    # column 2 is nationality and column 5 is gender, as in the question
    row[2] = row[2].title() if row[2] else 'Nationality Unknown'
    row[5] = row[5].title() if row[5] else 'Gender Unknown/Other'
    print(row)

Output:

['w', 'x', 'French', 'y', 'z', 'Gender Unknown/Other']
['w', 'x', 'Nationality Unknown', 'y', 'z', 'Male']
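
If more columns need the same treatment, the single pass could be wrapped in a helper. This is only a sketch: clean_row and DEFAULTS are hypothetical names, the column indices follow the question's layout (2 for nationality, 5 for gender), and the strip() call is an added assumption that whitespace-only cells should count as empty.

```python
# Hypothetical mapping of column index -> placeholder for empty cells.
DEFAULTS = {2: 'Nationality Unknown', 5: 'Gender Unknown/Other'}


def clean_row(row, defaults=DEFAULTS):
    """Title-case the listed columns in place, filling blanks with a placeholder."""
    for index, placeholder in defaults.items():
        value = row[index].strip()
        # Assign back into the list: title() returns a new string.
        row[index] = value.title() if value else placeholder
    return row


moma = [
    ['Tailpiece (page 55) from SAGESSE', 'Maurice Denis', 'french',
     '(1870)', '(1943)', '', '1889-1911', 'Prints & Illustrated Books'],
]

for row in moma:
    clean_row(row)
    print(row[2], '|', row[5])
```

The helper mutates each row in place, so the loop does not need to assign its return value to anything.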

Upvotes: 1
