Reputation: 91
I am having a hard time figuring this out. I am working on a program that keeps track of some data associated with a bunch of files, and I am using pandas to manage, load, and save that data. On the first run, the program finds the files with a given extension and creates a pandas DataFrame with some number of columns for the data associated with each file. The number of columns and the number of rows aren't known until runtime. I want to put all the file paths in one column and leave every other column of the DataFrame blank. Is there a good way to do this? So if the input is [val1, val2, val3, ...]
then I want the DataFrame to be
[col1, col2, col3, ..., coln]
[val1, NaN, NaN, ..., NaN]
[val2, NaN, NaN, ..., NaN]
[val3, NaN, NaN, ..., NaN]
Thanks for any help!
Upvotes: 1
Views: 1538
Reputation: 36545
If you create your DataFrame from a dict, any columns listed in the columns
keyword that are not keys of the dict will be filled with NaN:
In [3]: pd.DataFrame({'col1': ['val1', 'val2', 'val3']},
   ...:              columns=['col1', 'col2', 'col3'])
Out[3]:
   col1 col2 col3
0  val1  NaN  NaN
1  val2  NaN  NaN
2  val3  NaN  NaN
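Since the question says the column names and the file list are only known at runtime, the same dict-plus-columns idea can be wrapped in a small helper. This is only a sketch: `make_file_table` and its `path_column` parameter are hypothetical names, and the example assumes the file paths belong in the first column unless told otherwise.

```python
import pandas as pd


def make_file_table(paths, columns, path_column=None):
    """Build a DataFrame with one row per path; every other column is NaN.

    `make_file_table` and `path_column` are illustrative names, not pandas API.
    """
    columns = list(columns)
    if path_column is None:
        # Assumption: the paths go in the first column by default.
        path_column = columns[0]
    if path_column not in columns:
        # Otherwise the path data would be silently dropped by the constructor.
        raise ValueError(f"{path_column!r} is not one of the requested columns")
    return pd.DataFrame({path_column: list(paths)}, columns=columns)


# Placeholder values standing in for file paths and column names found at runtime.
paths = ["val1", "val2", "val3"]
columns = ["col1", "col2", "col3"]
df = make_file_table(paths, columns)
print(df)
```

The explicit check is there because a dict key that is missing from `columns` does not raise an error on its own; the data is simply left out of the result.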
Alternatively, if the values in your first column can serve as the index, you can pass just an index and the column names:
In [4]: pd.DataFrame(index=['val1', 'val2', 'val3'], columns=['col1', 'col2', 'col3'])
Out[4]:
     col1 col2 col3
val1  NaN  NaN  NaN
val2  NaN  NaN  NaN
val3  NaN  NaN  NaN
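With the index-based layout, each file's row can later be filled in by label. A minimal sketch, assuming the placeholder values below stand in for real file paths and that `42` is just an illustrative piece of per-file data:

```python
import pandas as pd

# Placeholder values standing in for file paths and column names found at runtime.
paths = ["val1", "val2", "val3"]
columns = ["col1", "col2", "col3"]

# Every cell starts out as NaN; the paths label the rows.
table = pd.DataFrame(index=paths, columns=columns)

# Fill in one cell by row label (the path) and column name.
table.loc["val2", "col1"] = 42
print(table)
```

Whether the paths work better as the index or as an ordinary column depends on how the rest of the program looks rows up; the index form makes `table.loc[path, column]` lookups direct.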
Upvotes: 2