Reputation: 9
I am trying to declare an empty DataFrame and then assign some values to it:
df = pd.DataFrame()
df["a"] = 1234
df["b"] = b # Already defined earlier
df["c"] = c # Already defined earlier
df["t"] = df["b"]/df["c"]
I am getting the below output:
Empty DataFrame
Columns: [a, b, c, t]
Index: []
Can anyone explain why I am getting this empty DataFrame even though I am assigning values? Sorry if my question is kind of basic.
Upvotes: 0
Views: 133
Reputation: 1340
The problem is that a cell in a table needs both a row index value and a column index value to insert the cell value. So you need to decide if "a", "b", "c" and "t" are columns or row indexes.
If they are column indexes, then you'd need a row index (0 in the example below) along with what you have written above:
import pandas as pd

df = pd.DataFrame()
# Each assignment names both a row label (0) and a column label
df.loc[0, "a"] = 1234
df.loc[0, "b"] = 2
df.loc[0, "c"] = 3
Result:
In : df
Out:
        a    b    c
0  1234.0  2.0  3.0
Now that you have data in the dataframe you can perform column operations (i.e., create a new column "t" and for each row assign the value of the corresponding item under "b" divided by the corresponding items under "c"):
df["t"] = df["b"]/df["c"]
Of course, you can also use different indexes for each item as follows:
df = pd.DataFrame()
df.loc[0, "a"] = 1234
df.loc[1, "b"] = 2
df.loc[2, "c"] = 3
Result:
In : df
Out:
        a    b    c
0  1234.0  NaN  NaN
1     NaN  2.0  NaN
2     NaN  NaN  3.0
But as you can see, the cells for which you have not specified a (row, column, value) tuple are now NaN. This means that if you try df["b"]/df["c"] you will get NaN values out, because every row involves an arithmetic operation with at least one NaN value.
In : df["b"]/df["c"]
Out:
0 NaN
1 NaN
2 NaN
dtype: float64
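As a minimal sketch of the same effect, the frame below is built directly with NaN in the same positions (the construction via a dictionary is an assumption for brevity, not the code used above). It only illustrates that a division yields NaN wherever either operand is NaN.

```python
import numpy as np
import pandas as pd

# Same layout as above: "b" and "c" never have a value on the same row
df = pd.DataFrame(
    {
        "a": [1234.0, np.nan, np.nan],
        "b": [np.nan, 2.0, np.nan],
        "c": [np.nan, np.nan, 3.0],
    }
)

# Element-wise division is aligned on the row index;
# any row with a NaN operand produces NaN
ratio = df["b"] / df["c"]
print(ratio)
```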
The converse is if you wanted to insert the items under one column. You'd now need a column header for this (0 in the below):
df = pd.DataFrame()
df.loc["a", 0] = 1234
df.loc["b", 0] = 2
df.loc["c", 0] = 3
Result:
In : df
Out:
        0
a  1234.0
b     2.0
c     3.0
Now, when inserting the value for "t", you'd need to specify exactly which cells you are referring to (note that pandas won't perform vectorised row operations in the same way that it performs vectorised column operations).
df.loc["t", 0] = df.loc["b", 0]/df.loc["c", 0]
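For a frame with more than one column, naming every cell gets tedious. A possible alternative, sketched here under the assumption that the row labels are "a", "b" and "c" as above, is to select whole rows with .loc, which returns a Series indexed by the column labels. The frame is built from a dictionary only to keep the example self-contained.

```python
import pandas as pd

# Items stored as rows under a single column labelled 0
df = pd.DataFrame({0: [1234.0, 2.0, 3.0]}, index=["a", "b", "c"])

# df.loc["b"] and df.loc["c"] are row Series indexed by column label,
# so the division is applied to every column at once
df.loc["t"] = df.loc["b"] / df.loc["c"]
print(df)
```

With a single column this gives the same value as the cell-by-cell assignment.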
Upvotes: 0
Reputation: 640
You have started by initialising an empty DataFrame:
# Initialising an empty dataframe
df = pd.DataFrame()
# Print the DataFrame
print(df)
Next, you created a column inside the empty DataFrame:
df["a"] = 1234
print(df)
But this adds no rows: the DataFrame has no index entries yet, so the scalar has nothing to fill. You can supply the values up front instead, e.g. by using a dictionary (key "a" and value the list [1, 2, 3, 4]):
df = pd.DataFrame({"a":[1, 2, 3, 4]})
print(df)
When a list of values is added, each value gets its own index entry.
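The difference can be checked with a short sketch. It assumes the question's b and c are scalars (the values 2 and 3 are placeholders) and uses the index argument of the DataFrame constructor, which is one possible way to give the frame a row before assigning columns.

```python
import pandas as pd

# Scalar assigned to a frame with no rows: the column exists, but is empty
empty = pd.DataFrame()
empty["a"] = 1234
print(empty.shape)  # (0, 1)

# Frame created with one row label: the scalar fills that row
df = pd.DataFrame(index=[0])
df["a"] = 1234
df["b"] = 2  # placeholder for the question's b
df["c"] = 3  # placeholder for the question's c
df["t"] = df["b"] / df["c"]
print(df)
```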
Upvotes: 1
Reputation: 484
I think you have to initialize the DataFrame like this:
df = pd.DataFrame(data=[[1234, b, c, b/c]], columns=list("abct"))
When you create a DataFrame with no initial data, it has no rows and no columns, so I think a scalar assigned to a new column has no rows to fill.
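A runnable version of the one-liner above might look like the sketch below. The values of b and c are assumptions, since the question only says they were defined earlier.

```python
import pandas as pd

# Placeholder values; the question does not show how b and c are defined
b = 2
c = 3

# One row of data supplied up front, with one column label per value
df = pd.DataFrame(data=[[1234, b, c, b / c]], columns=list("abct"))
print(df)
```

Because the row exists from the start, later column operations such as df["b"] / df["c"] have a row to operate on.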
Upvotes: 3