C L

Reputation: 125

Creating dataframe from a list - TypeError: object of type 'int' has no len()

I am trying to create a data-frame from a list which has varying lengths for each row.

A sample of the list looks like this (which is how I would like it to look):

[(dwstweets gop, broadened, base people), 1]
[(bushs campaign video, features, kat), 2]
[3]
[4]
[5]
[(president obama, wants, york), 6]
[(jeb bush, talked, enforcement), (lets, see, plan), 7]

The code I am using to try to append each row's list and build the data-frame is:

count = 0;
df2 = pd.DataFrame();
for index, row in df1.iterrows():
  doc = nlp(unicode(row));
  text_ext = textacy.extract.subject_verb_object_triples(doc);
  mylist = list(text_ext) + [index]
  count+=1;
  df2 = df2.append(mylist, ignore_index=True)

However I get the error:

TypeError: object of type 'int' has no len()

I saw there are several questions with this error but as far as I can see they are not caused by the same thing.

How would I go about creating a data-frame with 7 columns that is unique on the index? (I know that many rows will be empty in at least 3 of the columns, and some will be empty in every column except the index.)

Thanks.

Upvotes: 1

Views: 3225

Answers (2)

jezrael

Reputation: 863451

I suggest first building a list by appending each row's tuples without [index], and then calling the DataFrame constructor:

L = []
for index, row in df1.iterrows():
  doc = nlp(unicode(row))
  text_ext = textacy.extract.subject_verb_object_triples(doc)
  # keep only the triples; the index is no longer appended to the list
  mylist = list(text_ext)
  # collect one list of triples per row
  L.append(mylist)

# rows of different lengths are padded with None
df2 = pd.DataFrame(L, index=df1.index)
print (df2)
                                         0                  1
1  (dwstweets gop, broadened, base people)               None
2    (bushs campaign video, features, kat)               None
3                                     None               None
4                                     None               None
5                                     None               None
6           (president obama, wants, york)               None
7          (jeb bush, talked, enforcement)  (lets, see, plan)
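To try the idea without spaCy or textacy installed, here is a minimal sketch that hard-codes the triples from the question. The names rows and index are hypothetical stand-ins for the per-row output of subject_verb_object_triples and for df1.index. It only illustrates how the DataFrame constructor handles rows of different lengths.

```python
import pandas as pd

# Hypothetical stand-in for the triples extracted from each row of df1;
# rows without any triple are represented by an empty list.
rows = [
    [("dwstweets gop", "broadened", "base people")],
    [("bushs campaign video", "features", "kat")],
    [],
    [],
    [],
    [("president obama", "wants", "york")],
    [("jeb bush", "talked", "enforcement"), ("lets", "see", "plan")],
]

# Assumed index 1..7, mirroring the index values shown in the question.
index = range(1, 8)

# The constructor creates as many columns as the longest row needs
# and fills the missing cells of shorter rows with a null value.
df2 = pd.DataFrame(rows, index=index)
print(df2)
```

The number of columns follows the longest row, so a frame with a fixed number of columns would need the rows padded (or the result reindexed) to that width.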

Upvotes: 2

Jigar Patel

Reputation: 1

I believe the error could be in your for loop line in the code:

for index, row in df1.iterrows():

DataFrame.iterrows() returns an iterator that yields (index, Series) pairs, so row here is a pandas Series rather than a plain string, which may not be what nlp(unicode(row)) expects.

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.iterrows.html

Upvotes: 0
