user7330540

Box-plot in Pandas

I have two CSV files. They share the same columns, such as PassengerId, Name, Sex, Age, etc.

I am trying to draw a box plot of the distribution of passenger ages per title (Mr, Mrs, etc.), but I get an error. How can I fix the error so that the plot can be drawn?

import csv as csv
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
csv_file_object = csv.reader(open('test.csv', 'r')) 

header = next(csv_file_object)
data=[] 

for row in csv_file_object:
    data.append(row)
data = np.array(data) 

csv_file_object1 = csv.reader(open('train.csv', 'r')) 
header1 = next(csv_file_object1) 
data1=[] 

for row in csv_file_object:
    data1.append(row)
data1 = np.array(data1)


Mergerd_file = header.merge(header1, on='PassengerId')

df = pd.DataFrame(Mergerd_file, index=['pAge', 'Tilte'])

df.T.boxplot(vert=False)
plt.subplots_adjust(left=0.25)
plt.show()

I get this error:

  ---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-23-0d7fafc1fcf9> in <module>()
     21 
     22 
---> 23 Mergerd_file = header.merge(header1, on='PassengerId')
     24 
     25 df = pd.DataFrame(Mergerd_file, index=['pAge', 'Tilte'])

AttributeError: 'list' object has no attribute 'merge'

Upvotes: 2

Views: 2337

Answers (2)

jezrael

Reputation: 863741

I think you need to use read_csv first, then concat both DataFrames, and finally create the boxplot:

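import pandas as pd
import matplotlib.pyplot as plt

# read both files into DataFrames
df1 = pd.read_csv('el/test.csv')
print(df1.head())

df2 = pd.read_csv('el/train.csv')
print(df2.head())

# stack the rows of both files
df = pd.concat([df1, df2])
# the title is the text between ', ' and the first '.' in Name
df['Title'] = df['Name'].str.extract(r', (.*?)\.', expand=False)
print(df.head())

# one box of ages per title
df[['Age', 'Title']].boxplot(vert=False, by='Title')
plt.subplots_adjust(left=0.25)
plt.show()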
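
To check the title extraction without the CSV files, here is a minimal sketch on in-memory data. The sample rows are made up, and it assumes names follow the "Surname, Title. Given names" pattern. The non-greedy `(.*?)` is my choice so that a later dot in the name (an initial, for example) does not end up in the title.

```python
import pandas as pd

# Hypothetical rows standing in for the two CSV files.
df1 = pd.DataFrame({
    "PassengerId": [1, 2],
    "Name": ["Braund, Mr. Owen Harris", "Cumings, Mrs. John Bradley"],
    "Age": [22.0, 38.0],
})
df2 = pd.DataFrame({
    "PassengerId": [3, 4],
    "Name": ["Heikkinen, Miss. Laina", "Allen, Mr. William H. Jr"],
    "Age": [26.0, 35.0],
})

# Stack the rows of both frames into one.
df = pd.concat([df1, df2], ignore_index=True)

# Capture the text between ', ' and the first '.' as the title.
df["Title"] = df["Name"].str.extract(r", (.*?)\.", expand=False)

# Median age per title, i.e. the centre line of each box in the plot.
ages_by_title = df.groupby("Title")["Age"].median()
print(ages_by_title)
```

If the medians look sensible, the same `df[['Age', 'Title']].boxplot(by='Title')` call should draw one box per title.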

Upvotes: 2

The code you're using is for Python 2, yet you're running Python 3. In Python 3 (and recommended in Python 2.6+), the proper way to advance an iterator is to use the built-in next():

header = next(csv_file_object1)

Furthermore, the file should be opened in text mode 'r', not 'rb'.
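
A minimal sketch of reading a CSV the Python 3 way, assuming a small made-up file written to a temporary path so the example is self-contained. The `newline=''` argument is what the csv module documentation recommends when opening files for it.

```python
import csv
import os
import tempfile

# Write a tiny hypothetical CSV file to read back.
with tempfile.NamedTemporaryFile("w", suffix=".csv", newline="", delete=False) as tmp:
    tmp.write('PassengerId,Name,Age\n1,"Braund, Mr. Owen Harris",22\n')
    path = tmp.name

# Open in text mode ('r'), not binary ('rb').
with open(path, "r", newline="") as f:
    reader = csv.reader(f)
    header = next(reader)  # built-in next() instead of Python 2's reader.next()
    rows = list(reader)    # remaining rows, each a list of strings

os.remove(path)
print(header)
print(rows)
```

Note that `header` here is a plain list of column names, so it has no DataFrame methods such as `merge`.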

Upvotes: 2
