Julia iterate over rows of dataframe

Question

I am trying to iterate over the rows of a DataFrame in Julia to generate a new column for the data frame. I haven't come across a clear example of how to do this. In R this kind of operation is vectorized, but as I understand it not all of Julia's operations are vectorized, so I need to loop over the rows. I know I can do this with indexing, but I believe there must be a better way. I want to be able to reference the column values by name. Here is what I have:

using DataFrames
test_df = DataFrame(A = [1, 2, 3, 4, 5], B = [2, 3, 4, 5, 6])
test_df["C"] = [test_df[i, "A"] * test_df[i, "B"] for i in 1:size(test_df, 1)]

Is this the Julia/DataFrames way of doing this? Is there a more Julia-esque way? Thanks for any feedback.

John Myles White · Accepted Answer

You'd be better off operating on whole columns: test_df["A"] .* test_df["B"]. In general, Julia uses a dot prefix to indicate operations that are elementwise. All of these elementwise operations are vectorized, so no explicit loop over the rows is needed.
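As a minimal sketch of the column-wise approach, assuming a recent DataFrames.jl release (1.x), where columns are read and assigned as test_df.A rather than with the test_df["A"] indexing used in this thread. The column name D and the eachrow variant are additions for illustration only, in case row-by-row access by name is still wanted:

```julia
using DataFrames

test_df = DataFrame(A = [1, 2, 3, 4, 5], B = [2, 3, 4, 5, 6])

# Elementwise product of two whole columns; the dot broadcasts * over matching entries.
test_df.C = test_df.A .* test_df.B

# Assumption for illustration: if explicit row iteration is preferred,
# eachrow yields rows whose values can be read by column name.
test_df.D = [row.A * row.B for row in eachrow(test_df)]
```

Both columns should hold the same values here. The broadcast form is usually the shorter of the two, while the eachrow form may read more naturally when the per-row logic grows beyond a single expression.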

You also don't want to use an Array comprehension since you probably want a DataArray as your output. There are no DataArray comprehensions for now since comprehensions are built into the Julia parser, which makes them hard to override in libraries like DataArrays.jl.
