Reputation: 123
Seemingly simple question, but can't find the answer.
Problem: I'm creating a function that I will pass into map(). It takes a single field and creates three fields out of it. I want the output of map() to give me a new RDD that includes both the fields from the input RDD and the new output fields. How do I do this?
Do I need to add the key of my data to the output of the function so that I can join the output RDD back to my original RDD? Is that the proper/best practice?
from pyspark.sql import Row

def extract_fund_code_from_iv_id(holding):
    # Must include the key of the data for later joining
    iv_id = Row(iv_id_fund_code=holding.iv_id[:2], iv_id_last_code=holding.iv_id[-2:])
    return iv_id
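Concretely, this is the pattern I have in mind (just a sketch; sc is my SparkContext and holdings_rdd stands in for my actual data):

# Placeholder input: an RDD of Rows, each carrying an iv_id key plus other fields
holdings_rdd = sc.parallelize([Row(iv_id="AB1234CD", name="fund_a")])

# Key both the original rows and the extracted fields by iv_id...
keyed_holdings = holdings_rdd.map(lambda h: (h.iv_id, h))
keyed_extracted = holdings_rdd.map(lambda h: (h.iv_id, extract_fund_code_from_iv_id(h)))

# ...so the new fields can be joined back onto the original rows
combined = keyed_holdings.join(keyed_extracted)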
Even more basic: I can't seem to combine two Rows.
row1 = Row(name="joe", age="35")
row2 = Row(state="MA")
print row1, row2
This doesn't return a new Row() like I want it to.
Thanks
Upvotes: 0
Views: 3350
Reputation: 5433
I would really recommend using UserDefinedFunction.
Suppose you wanted to extract a number of features from a column int_col of type int of a DataFrame df. Let's say these features are simply modulo 3 and modulo 2 of said column's content.
We'll import udf and the data type our functions return:
from pyspark.sql.functions import udf
from pyspark.sql.types import IntegerType
Then we'll implement our feature extraction functions:
def modulo_three(col):
    return int(col) % 3

def modulo_two(col):
    return int(col) % 2
and turn them into udfs:
mod3 = udf(modulo_three, IntegerType())
mod2 = udf(modulo_two, IntegerType())
Now we'll compute all additional columns and give them nice names (via alias):
new_columns = [
    mod3(df['int_col']).alias('mod3'),
    mod2(df['int_col']).alias('mod2'),
]
Finally we select these columns plus all columns that already existed before:
new_df = df.select(*(df.columns + new_columns))
new_df will now have two additional columns, mod3 and mod2.
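For reference, here is how the pieces might fit together end to end (the SparkSession setup and the sample data are assumptions for illustration, not part of the original answer):

from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import IntegerType

spark = SparkSession.builder.getOrCreate()

# Hypothetical sample data with a single integer column
df = spark.createDataFrame([(7,), (8,), (9,)], ['int_col'])

def modulo_three(col):
    return int(col) % 3

def modulo_two(col):
    return int(col) % 2

mod3 = udf(modulo_three, IntegerType())
mod2 = udf(modulo_two, IntegerType())

new_columns = [
    mod3(df['int_col']).alias('mod3'),
    mod2(df['int_col']).alias('mod2'),
]

# Keep every existing column and append the new ones
new_df = df.select(*(df.columns + new_columns))
new_df.show()
# Expected output:
# +-------+----+----+
# |int_col|mod3|mod2|
# +-------+----+----+
# |      7|   1|   1|
# |      8|   2|   0|
# |      9|   0|   1|
# +-------+----+----+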
Upvotes: 7