Reputation:
I have two CSV files. Both share the same columns, such as PassengerId, Name, Sex, Age, etc.
I am trying to draw a box plot of the passengers' age distribution per title (Mr, Mrs, etc.), but I get an error. How can I fix the error so that the plot can be drawn?
import csv as csv
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
csv_file_object = csv.reader(open('test.csv', 'r'))
header = next(csv_file_object)
data=[]
for row in csv_file_object:
    data.append(row)
data = np.array(data)
csv_file_object1 = csv.reader(open('train.csv', 'r'))
header1 = next(csv_file_object1)
data1=[]
for row in csv_file_object:
    data1.append(row)
data1 = np.array(data1)
Mergerd_file = header.merge(header1, on='PassengerId')
df = pd.DataFrame(Mergerd_file, index=['pAge', 'Tilte'])
df.T.boxplot(vert=False)
plt.subplots_adjust(left=0.25)
plt.show()
I get this error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-23-0d7fafc1fcf9> in <module>()
21
22
---> 23 Mergerd_file = header.merge(header1, on='PassengerId')
24
25 df = pd.DataFrame(Mergerd_file, index=['pAge', 'Tilte'])
AttributeError: 'list' object has no attribute 'merge'
Upvotes: 2
Views: 2337
Reputation: 863741
I think you need to use read_csv first, then concat both DataFrames, and finally create the boxplot:
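import pandas as pd
import matplotlib.pyplot as plt
# Read both files into DataFrames
df1 = pd.read_csv('el/test.csv')
print(df1.head())
df2 = pd.read_csv('el/train.csv')
print(df2.head())
# Stack the rows of both files into one DataFrame
df = pd.concat([df1, df2])
# Title is the text between ", " and the first "." in Name
df['Title'] = df.Name.str.extract(r', ([^.]*)\.', expand=False)
print(df.head())
# Draw one box of Age per Title
df[['Age', 'Title']].boxplot(vert=False, by='Title')
plt.subplots_adjust(left=0.25)
plt.show()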
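To see what the Title extraction step does on its own, here is a minimal sketch using only the standard library re module. The helper name extract_title and the sample list are assumptions for illustration, and the pattern stops at the first period after ", ", which may differ slightly from a greedy pattern on names that contain several periods.

```python
import re

# Assumed pattern: the title is the text between ", " and the first "."
TITLE_RE = re.compile(r", ([^.]*)\.")

def extract_title(name):
    """Return the title from a 'Surname, Title. Given names' string, or None."""
    match = TITLE_RE.search(name)
    return match.group(1) if match else None

# Sample names in the 'Surname, Title. Given names' format (illustrative only)
names = [
    "Braund, Mr. Owen Harris",
    "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
    "Heikkinen, Miss. Laina",
    "No title here",
]
titles = [extract_title(n) for n in names]
print(titles)  # ['Mr', 'Mrs', 'Miss', None]
```

Rows where no title is found come back as None here; with pandas str.extract the equivalent rows would hold NaN, so they simply drop out of the per-title grouping.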
Upvotes: 2
Reputation: 134066
The code you're using is for Python 2, yet you're running Python 3. In Python 3 (and recommended in Python 2.6+), the proper way to advance an iterator is to use the built-in next():
header = next(csv_file_object1)
Furthermore, the file should be opened in text mode 'r', not 'rb'.
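As a rough sketch of both points, the snippet below writes a tiny made-up CSV to a temporary file, then reads it back in text mode and takes the header with next(). The file contents are invented for the example, and passing newline='' is the csv module's documented recommendation rather than something stated above. It also shows that the header is a plain list, which is why calling .merge on it raises the AttributeError from the question.

```python
import csv
import os
import tempfile

# Write a small made-up CSV so the example is self-contained.
tmp = tempfile.NamedTemporaryFile("w", suffix=".csv", newline="", delete=False)
with tmp:
    tmp.write('PassengerId,Name,Age\n1,"Braund, Mr. Owen Harris",22\n')

# Python 3: open in text mode; newline="" lets the csv module handle line endings.
with open(tmp.name, "r", newline="") as f:
    reader = csv.reader(f)
    header = next(reader)  # built-in next(), not reader.next()
    rows = list(reader)    # remaining rows, each a list of strings

os.remove(tmp.name)

print(header)                    # ['PassengerId', 'Name', 'Age']
print(rows)                      # [['1', 'Braund, Mr. Owen Harris', '22']]
print(hasattr(header, "merge"))  # False: a list has no merge method
```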
Upvotes: 2