metersk

Reputation: 12509

Naming columns in pandas deletes data

The code below gives me this table:

import pandas as pd

raw = pd.read_clipboard()
raw.head()

+---+---------------------+-------------+---------+----------+-------------+
|   |     Afghanistan     | South Asia  | 652225  | 26000000 | Unnamed: 4  |
+---+---------------------+-------------+---------+----------+-------------+
| 0 | Albania             | Europe      |   28728 |  3200000 | 6656000000  |
| 1 | Algeria             | Middle East | 2400000 | 32900000 | 75012000000 |
| 2 | Andorra             | Europe      |     468 |    64000 | NaN         |
| 3 | Angola              | Africa      | 1250000 | 14500000 | 14935000000 |
| 4 | Antigua and Barbuda | Americas    |     442 |    77000 | 770000000   |
+---+---------------------+-------------+---------+----------+-------------+

But when I attempt to rename the columns and create a DataFrame, all of the data disappears:

df = pd.DataFrame(raw, columns = ['name', 'region', 'area', 'population', 'gdp'])
df.head()

+---+------+--------+------+------------+-----+
|   | name | region | area | population | gdp |
+---+------+--------+------+------------+-----+
| 0 | NaN  | NaN    | NaN  | NaN        | NaN |
| 1 | NaN  | NaN    | NaN  | NaN        | NaN |
| 2 | NaN  | NaN    | NaN  | NaN        | NaN |
| 3 | NaN  | NaN    | NaN  | NaN        | NaN |
| 4 | NaN  | NaN    | NaN  | NaN        | NaN |
+---+------+--------+------+------------+-----+

Any idea why?

Upvotes: 0

Views: 60

Answers (1)

Nir Friedman

Reputation: 17704

You should just write:

df.columns = ['name', 'region', ...]

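This is also more efficient, because it relabels the existing DataFrame instead of building a new one. The data disappears in your version because passing a DataFrame to the constructor together with columns= selects columns by label rather than renaming them. None of the new names exist in raw, so every column comes back as NaN.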
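
To make the difference concrete, here is a minimal sketch. It assumes a small hand-built stand-in for the clipboard data (three columns only, with the same kind of accidental header labels as in the question), so the values and the names selected and renamed are illustrative rather than the asker's actual frame.

```python
import pandas as pd

# Hypothetical stand-in for the clipboard data: the column labels mimic
# the first data row having been read as the header.
raw = pd.DataFrame(
    [["Albania", "Europe", 28728], ["Algeria", "Middle East", 2400000]],
    columns=["Afghanistan", "South Asia", "652225"],
)

new_names = ["name", "region", "area"]

# Passing columns= to the constructor selects columns by label.
# None of the new labels exist in raw, so every value is NaN.
selected = pd.DataFrame(raw, columns=new_names)
print(selected)

# Assigning to .columns relabels the existing columns and keeps the data.
renamed = raw.copy()  # copy only so that raw stays unchanged for comparison
renamed.columns = new_names
print(renamed)
```

The list assigned to .columns has to contain exactly one name per existing column. Note also that in the question's output the first country (Afghanistan) ended up as the header row; if that is not intended, reading the data with header=None and names=[...] is one way to avoid it, assuming read_clipboard forwards those arguments to read_csv as documented.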

Upvotes: 2
