Reputation: 352
DataFrame df1 has a column 'Column headers' that holds a list of column names. I want to create another DataFrame, df2, whose columns are the values from the 'Column headers' column of df1.
print(df1['Column headers'])
>>
0 % Female
1 % Below poverty line
2 % Rural population
3 Decadal Population Growth (in %)
4 Availability of Drinking Water Source Within P...
5 Concrete Roofs (in %)
6 Houses With Electricity (in %)
7 Houses With Televisions (in %)
8 With Computer/Laptop (in %)
9 Houses With Phones (Telephone + Mobile) (in %)
10 Houses With 2 wheelers (in %)
11 Houses With cars (in %)
12 Households With Banking Services (in %)
13 Literacy Rate (in %)
14 Literacy Rate (Rural) (in %)
15 Literacy Rate (Urban) (in %)
16 Decadal Difference In Literacy Rate
17 Student: Teacher Ratio - All Schools
18 Student: Teacher Ratio - Primary
19 Student: Teacher Ratio - Upper Primary
20 Under-five Mortality Rate (Per 1000 live Births)
21 No of Dispensaries per 1,00,000 population
22 No of Doctors per 1,00,000 population
23 Total patients registered for tuberculosis tre...
24 Sex Ratio (Females Per 1000 Males)
25 Agri GSDP (%)
26 Industry GSDP (%)
27 Service GSDP (%)
28 Unemployment Rate (2011-12)
29 Rural Unemployment Rate (2011-12)
30 Urban Unemployment Rate (2011-12)
31 Per Capita Public Expenditure (in Rs)
32 Per Capita Private Expenditure (in Rs)
33 Infant Mortality Rate (IMR)
34 Maternal Mortality Rate
35 Coverage Of National Highways (Total in km)
36 Coverage Of State Highways (Total in km)
37 Coverage Of Rural Roads (Total in km)
38 Coverage Of Urban Roads (Total in km)
39 Railway Coverage (Total in km)
40 Tele-Density [Total Connections / Total Popul...
Name: Column headers, dtype: object
I want to create a DataFrame df2 that contains the 41 columns listed above. The rows in this DataFrame will be populated by a different function. I tried to create df2 as follows:
df2 = pd.DataFrame() #Creating an empty dataframe
df2.columns = df1['Column headers']
>>
ValueError: Length mismatch: Expected axis has 0 elements, new values have 41 elements
Is it possible to create a blank dataframe in Pandas and specify the column names afterwards?
Upvotes: 3
Views: 23022
Reputation: 23
This worked for me:
newDF = pd.DataFrame({}, columns=existingDF.columns)
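A minimal sketch of the line above, assuming a small hypothetical existingDF (its columns "a" and "b" are made up for illustration). It only shows that the new frame takes the column labels and has no rows. The column dtypes are not carried over.

```python
import pandas as pd

# Hypothetical existing frame, used only for illustration
existingDF = pd.DataFrame({"a": [1, 2], "b": [3.0, 4.0]})

# New frame with the same column labels and no rows
newDF = pd.DataFrame(columns=existingDF.columns)

print(list(newDF.columns))
print(len(newDF))
```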
Upvotes: 0
Reputation: 5708
Here is how to create an empty dataframe with custom columns:
import pandas as pd

# Example dataframe
df1 = pd.DataFrame({"Headers": ["Alpha", "Beta", "Gama", "Delta"]}, columns=["Headers"], index=range(4))
print(df1)
#   Headers
# 0   Alpha
# 1    Beta
# 2    Gama
# 3   Delta
print(df1['Headers'].values)
# ['Alpha' 'Beta' 'Gama' 'Delta']
# Make an empty dataframe; index=None is the default, so no rows are created
df2 = pd.DataFrame({}, columns=df1['Headers'].values, index=None)
print(df2)
# Empty DataFrame
# Columns: [Alpha, Beta, Gama, Delta]
# Index: []
Upvotes: 0
Reputation: 210852
Try this:
df2 = pd.DataFrame(columns=df1['Column headers'])
However, you shouldn't create an empty DataFrame and then fill it row by row, because that is very slow. Collect your data first, then create the DataFrame from the collected data.
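A rough sketch of the "collect first, then build" approach, assuming each row can be produced as a dict keyed by header. The two-entry headers Series and the make_row function are hypothetical stand-ins for the question's df1 column and its row-producing function.

```python
import pandas as pd

# Hypothetical stand-in for the header column of df1 in the question
headers = pd.Series(["% Female", "Literacy Rate (in %)"], name="Column headers")

def make_row(i):
    # Hypothetical placeholder for the function that produces one row
    return {"% Female": 48.0 + i, "Literacy Rate (in %)": 70.0 + i}

# Collect all rows in a plain Python list first
rows = [make_row(i) for i in range(3)]

# Build the DataFrame once, with the column order taken from the headers
df2 = pd.DataFrame(rows, columns=headers.tolist())
print(df2)
```

Appending to a list is cheap, and the DataFrame is built only once at the end.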
Upvotes: 7