kkumar

Reputation: 203

How to fix "AttributeError: 'RDD' object has no attribute 'rfind'"?

I am working on code that attaches a file from HDFS to an email and sends it. The code works when the attachment is a file in a local folder (Linux home directory), but when I change the attachment location to an HDFS path I get the error AttributeError: 'RDD' object has no attribute 'rfind'. Can someone please help?

I have changed the encoding to

part = MIMEApplication("".join(f.collect()).encode('utf-8').strip(), Name=basename(f))

and also tried

part = MIMEApplication(u"".join(f.collect()), Name=basename(f))

but I still get the same error.

Here is my code:

import smtplib
import traceback
from os.path import basename
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import COMMASPACE, formatdate

def success_mail():
    sender = "[email protected]"
    receivers = '[email protected]'
    msg = MIMEMultipart()
    msg.attach(MIMEText("Scoring completed. Attached is the latest report"))
    f=sc.textFile("/user/userid/folder/report_20190501.csv")
    part = MIMEApplication("".join(f.collect()).encode('utf-8', 'ignore'), Name=basename(f))
    part['Content-Disposition'] = 'attachment; filename="%s"' % basename(f)
    msg.attach(part)
    try:
        smtp = smtplib.SMTP('smtp.company.com')
        smtp.sendmail(sender, receivers, msg.as_string())  
        smtp.close()
        logMessage("INFO - Successfully sent email with Attachment")
    except:
        emsg =  traceback.format_exc()
        logMessage("ERROR -  Unable to send email because of :"+emsg)

Error:

AttributeError                            Traceback (most recent call last)
<ipython-input-6-5606e23c7cf8> in <module>()
     33         emsg =  traceback.format_exc()
     34         logMessage("ERROR -  Unable to send email because of :"+emsg)
---> 35 success_mail()

<ipython-input-6-5606e23c7cf8> in success_mail()
     22     msg.attach(MIMEText("Scoring completed. Attached is the latest report"))
     23     f=sc.textFile("/user/userid/folder/report_20190501.csv")
---> 24     part = MIMEApplication("".join(f.collect()).encode('utf-8', 'ignore'), Name=basename(f))
     25     part['Content-Disposition'] = 'attachment; filename="%s"' % basename(f)
     26     msg.attach(part)

/hadoop/ipython/userid/pyspark/lib64/python2.7/posixpath.pyc in basename(p)
    112 def basename(p):
    113     """Returns the final component of a pathname"""
--> 114     i = p.rfind('/') + 1
    115     return p[i:]
    116 

AttributeError: 'RDD' object has no attribute 'rfind'

Upvotes: 0

Views: 352

Answers (1)

Steven

Reputation: 15293

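basename expects a path string, but f is the RDD returned by sc.textFile, and an RDD has no rfind method (that is the call failing inside posixpath.basename in your traceback). Keep the path in its own variable and pass that variable to basename. Try replacing these three lines: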

f=sc.textFile("/user/userid/folder/report_20190501.csv")
part = MIMEApplication("".join(f.collect()).encode('utf-8', 'ignore'), Name=basename(f))
part['Content-Disposition'] = 'attachment; filename="%s"' % basename(f)

with:

file_path="/user/userid/folder/report_20190501.csv"
f=sc.textFile(file_path)
part = MIMEApplication("".join(f.collect()).encode('utf-8', 'ignore'), Name=basename(file_path))
part['Content-Disposition'] = 'attachment; filename="%s"' % basename(file_path)
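
To check the idea without a cluster, here is a minimal sketch that separates the path string from the file contents. The helper name build_attachment and the two sample rows are made up for illustration, and it assumes Python 3; in the real job the lines would come from sc.textFile(file_path).collect(). One more thing to be aware of: as far as I know, sc.textFile returns each line without its line terminator, so joining with "" would run all CSV rows together. The sketch joins with "\n" instead, which is a change from the code above.

```python
from email.mime.application import MIMEApplication
from os.path import basename


def build_attachment(lines, file_path):
    """Build a MIME attachment from collected text lines (hypothetical helper)."""
    # basename() needs the path string, not the RDD that was read from that path.
    name = basename(file_path)
    # Assumption: the lines carry no trailing newlines, so rejoin them with "\n".
    data = "\n".join(lines).encode("utf-8", "ignore")
    part = MIMEApplication(data, Name=name)
    part["Content-Disposition"] = 'attachment; filename="%s"' % name
    return part


# In the Spark job this would be: lines = sc.textFile(file_path).collect()
file_path = "/user/userid/folder/report_20190501.csv"
part = build_attachment(["id,score", "1,0.9"], file_path)  # made-up sample rows
print(part.get_filename())
```

The returned part can be passed to msg.attach(part) exactly as in the question.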

Upvotes: 1
