Dennis Kageni

Reputation: 13

Handling "TypeError: Expected tuple, got str"

I've written the following code to scrape the tables from http://acuratings.conservative.org/acu-federal-legislative-ratings/?year1=1975&chamber=11&state1=0&sortable=1. The goal is to save all the tables into one dataframe.

import pandas as pd
import time
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager

acu_browser = webdriver.Chrome(ChromeDriverManager().install())

acu_browser.get('http://acuratings.conservative.org/acu-federal-legislative-ratings/?year1=1975&chamber=11&state1=0&sortable=1')

time.sleep(10)


acu_html = acu_browser.page_source
acu_tables = pd.read_html(acu_html)
acu_tables = pd.concat(acu_tables)

However, the last line is giving me the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-16e0df40412a> in <module>
     13 acu_html = acu_browser.page_source
     14 acu_tables = pd.read_html(acu_html)
---> 15 acu_tables = pd.concat(acu_tables)
     16 


/usr/local/anaconda3/lib/python3.7/site-packages/pandas/core/reshape/concat.py in concat(objs, axis, join, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    282     )
    283 
--> 284     return op.get_result()
    285 
    286 

/usr/local/anaconda3/lib/python3.7/site-packages/pandas/core/reshape/concat.py in get_result(self)
    490                     obj_labels = mgr.axes[ax]
    491                     if not new_labels.equals(obj_labels):
--> 492                         indexers[ax] = obj_labels.reindex(new_labels)[1]
    493 
    494                 mgrs_indexers.append((obj._data, indexers))

/usr/local/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/multi.py in reindex(self, target, method, level, limit, tolerance)
   2423             else:
   2424                 # hopefully?
-> 2425                 target = MultiIndex.from_tuples(target)
   2426 
   2427         if (

/usr/local/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/multi.py in from_tuples(cls, tuples, sortorder, names)
    487                 tuples = tuples._values
    488 
--> 489             arrays = list(lib.tuples_to_object_array(tuples).T)
    490         elif isinstance(tuples, list):
    491             arrays = list(lib.to_object_array_tuples(tuples).T)

pandas/_libs/lib.pyx in pandas._libs.lib.tuples_to_object_array()

TypeError: Expected tuple, got str

Any help will be really appreciated.

Upvotes: 0

Views: 2337

Answers (1)

pecey

Reputation: 681

I don't have a good answer to this as of now.

One hacky way around this would be to do something like the following:

accumulator_df = acu_tables[1]
for i in range(2, len(acu_tables)):
    accumulator_df = pd.concat((accumulator_df, acu_tables[i]), ignore_index = True)

However, this won't work directly. Since the column names are not the same across the tables, it's not able to concatenate them properly.
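
To see how the column labels differ from table to table, one option is to list each table's column count and whether its columns are a MultiIndex. The traceback passes through MultiIndex.reindex, which suggests (as an inference, not something checked against the page) that some tables have multi-level headers while others have plain string labels. A minimal sketch, using small made-up frames in place of acu_tables; summarize_columns is a hypothetical helper name:

```python
import pandas as pd


def summarize_columns(tables):
    """Return one row per table describing its column labels (hypothetical helper)."""
    rows = []
    for position, table in enumerate(tables):
        rows.append(
            {
                "table": position,
                "n_columns": table.shape[1],
                "multiindex": isinstance(table.columns, pd.MultiIndex),
                # repr() keeps tuple labels readable in the summary
                "first_label": repr(table.columns[0]),
            }
        )
    return pd.DataFrame(rows)


# Toy stand-ins for the scraped tables (assumed data, not from the site).
flat = pd.DataFrame([["A", 1, 2]], columns=["name", "x", "y"])
multi = pd.DataFrame(
    [["B", 3, 4]],
    columns=pd.MultiIndex.from_tuples([("h", "name"), ("h", "x"), ("h", "y")]),
)

summary = summarize_columns([flat, multi])
print(summary)
```

With the real data, this would be called as summarize_columns(acu_tables).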

Since all the tables have 35 columns, one way around this would be to simply rename the columns to some fixed values and then concat.
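
The renaming idea could look roughly like the sketch below. It assumes every table has the same number of columns in the same order, so that position alone identifies a column. The names col_0, col_1, ... and the helper concat_with_fixed_columns are made up for illustration, and the toy frames stand in for the scraped tables:

```python
import pandas as pd


def concat_with_fixed_columns(tables):
    """Rename every table's columns by position, then stack the rows (hypothetical helper)."""
    if not tables:
        raise ValueError("no tables to concatenate")
    n_cols = tables[0].shape[1]
    fixed_names = [f"col_{i}" for i in range(n_cols)]
    renamed = []
    for table in tables:
        if table.shape[1] != n_cols:
            raise ValueError("all tables must have the same number of columns")
        table = table.copy()          # leave the caller's frames untouched
        table.columns = fixed_names   # replaces flat or MultiIndex labels alike
        renamed.append(table)
    return pd.concat(renamed, ignore_index=True)


# Toy stand-ins: one table with flat labels, one with a two-level header.
flat = pd.DataFrame([["A", 1, 2]], columns=["name", "x", "y"])
multi = pd.DataFrame(
    [["B", 3, 4], ["C", 5, 6]],
    columns=pd.MultiIndex.from_tuples([("h", "name"), ("h", "x"), ("h", "y")]),
)

combined = concat_with_fixed_columns([flat, multi])
print(combined)
```

To mirror the loop above, which starts from the second table, the call on the real data would be concat_with_fixed_columns(acu_tables[1:]). Meaningful column names can be assigned to the result afterwards.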

Upvotes: 1
