9113303

Reputation: 872

How to combine lists of different lengths into a single DataFrame in Python?

My data has the following structure.

data1:

['https://www.fullstackpython.com/',
 ['https://www.fullstackpython.com/table-of-contents.html',
  'https://www.fullstackpython.com/blog.html'],
 [['Introduction',
   'Development Environments',
   'Web Development ',
   'Web App Deployment',
   'Data',
   ''],
  ['5 Years of Full Stack Python',
   'GitPython and New Git Tutorials ',
   'First Steps with GitPython']],
 [[0.0,
   0.0,
   0.25,
   0.29,
   0.25,
   0.25],
  [0.0, 1.0, 0.19]]]

How can I combine lists of different lengths into a single DataFrame in Python? I tried the pandas DataFrame constructor, but it doesn't work because each list in data1 has a different length. The zip function has the same problem. I am expecting this structure for the data, not necessarily as a table, but as a DataFrame. I am looking for a general approach, so that any similar data can be fed into this format.

[Image: expected structure for the list of data]

Upvotes: 0

Views: 618

Answers (1)

Garrett

Reputation: 49816

If the source data format is consistent with the example, the "rows" could be looped over in the following way, and then converted to a DataFrame.

In [1]: src = ['https://www.fullstackpython.com/',
   ...:  ['https://www.fullstackpython.com/table-of-contents.html',
   ...:   'https://www.fullstackpython.com/blog.html'],
   ...:  [['Introduction', 'Development Environments', 'Web Development ',
   ...:    'Web App Deployment', 'Data', ''],
   ...:   ['5 Years of Full Stack Python',
   ...:    'GitPython and New Git Tutorials ',
   ...:    'First Steps with GitPython']],
   ...:  [[0.0, 0.0, 0.25, 0.29, 0.25, 0.25],
   ...:   [0.0, 1.0, 0.19]]]

In [2]: site, pages, nested_titles, nested_values = src

In [3]: import pandas as pd
   ...: data = []
   ...: for page, titles, values in zip(pages, nested_titles, nested_values):
   ...:     for title, value in zip(titles, values):
   ...:         data.append((site, page, title, value))
   ...: df = pd.DataFrame(data, columns=['Site', 'Page', 'Title', 'Value'])

In [4]: df
Out[4]:
                               Site                                               Page                             Title  Value
0  https://www.fullstackpython.com/  https://www.fullstackpython.com/table-of-conte...                      Introduction   0.00
1  https://www.fullstackpython.com/  https://www.fullstackpython.com/table-of-conte...          Development Environments   0.00
2  https://www.fullstackpython.com/  https://www.fullstackpython.com/table-of-conte...                  Web Development    0.25
3  https://www.fullstackpython.com/  https://www.fullstackpython.com/table-of-conte...                Web App Deployment   0.29
4  https://www.fullstackpython.com/  https://www.fullstackpython.com/table-of-conte...                              Data   0.25
5  https://www.fullstackpython.com/  https://www.fullstackpython.com/table-of-conte...                                     0.25
6  https://www.fullstackpython.com/          https://www.fullstackpython.com/blog.html      5 Years of Full Stack Python   0.00
7  https://www.fullstackpython.com/          https://www.fullstackpython.com/blog.html  GitPython and New Git Tutorials    1.00
8  https://www.fullstackpython.com/          https://www.fullstackpython.com/blog.html        First Steps with GitPython   0.19
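Since the question asks for a general approach, the same loop could be wrapped in a small helper and applied to several structures of this shape. The following is only a sketch: the names `to_frame` and `sources` are made up for illustration, and it assumes every input follows the `[site, pages, nested_titles, nested_values]` layout from the example.

```python
import pandas as pd


def to_frame(src):
    """Flatten one [site, pages, nested_titles, nested_values] structure
    into a long-format DataFrame with one row per (page, title) pair.

    Hypothetical helper, not part of pandas.
    """
    site, pages, nested_titles, nested_values = src
    rows = [
        (site, page, title, value)
        # Outer zip pairs each page with its own list of titles and values.
        for page, titles, values in zip(pages, nested_titles, nested_values)
        # Inner zip pairs each title with its value on that page.
        for title, value in zip(titles, values)
    ]
    return pd.DataFrame(rows, columns=['Site', 'Page', 'Title', 'Value'])


src = ['https://www.fullstackpython.com/',
       ['https://www.fullstackpython.com/table-of-contents.html',
        'https://www.fullstackpython.com/blog.html'],
       [['Introduction', 'Development Environments', 'Web Development ',
         'Web App Deployment', 'Data', ''],
        ['5 Years of Full Stack Python',
         'GitPython and New Git Tutorials ',
         'First Steps with GitPython']],
       [[0.0, 0.0, 0.25, 0.29, 0.25, 0.25],
        [0.0, 1.0, 0.19]]]

df = to_frame(src)

# Several structures of the same shape can be stacked into one frame.
sources = [src]  # add more structures here
combined = pd.concat([to_frame(s) for s in sources], ignore_index=True)
```

Note that `zip` stops at the shortest input, so a page whose titles and values differ in length would be truncated silently. On Python 3.10 or later, passing `strict=True` to `zip` raises a `ValueError` in that case instead.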

Upvotes: 1
