Anthony

Reputation: 113

Iterating over Dataframe to add one

This was done in Julia 1.1.1 on a Windows 10 machine.

I am working with a dataframe, df, with a pmid column identifying the rows and unique(features) as the column names. In addition, I have another variable, pmids, where pmids[i] corresponds to features[i].

The feature columns of this dataframe are initialized to 0. I am trying to iterate over it, adding 1 to a cell each time a feature shows up, in order to count the number of mentions of each feature for each pmid. To do this, I used the following for loop.

feature_ids = unique(features)
df = hcat(df, initialize_df(feature_ids, nrow(df), 0))
for i in 1:length(features)
  pmid = pmids[i]
  feature = features[i]
  df[df[:,:pmid] .== pmid, Symbol(feature)] .+= 1
end

In Julia v0.6.2 this worked. However, in Julia v1.1.1 the dataframe is still populated by zeros after the for loop. Any ideas as to what I am doing wrong?

Upvotes: 2

Views: 65

Answers (1)

Bogumił Kamiński

Reputation: 69819

This will most likely fix it:

for i in 1:length(features)
  pmid = pmids[i]
  feature = features[i]
  v = view(df, df[:,:pmid] .== pmid, Symbol(feature))
  v .+= 1
end
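
To illustrate why the view matters, here is a minimal sketch that uses a plain vector instead of a dataframe. The names `a`, `mask`, `copied`, `after_copy` and `after_view` are made up for the illustration and are not from the question. Indexing an array with a Boolean mask returns a copy, so a broadcasted update on that result never reaches the original, while a view writes through to it. Most likely something similar happens with `df[rows, col] .+= 1` in your setup, although I have not verified that against your exact package version.

```julia
a = [1, 2, 3, 4]
mask = a .> 2            # Bool mask selecting the last two elements

copied = a[mask]         # indexing with a mask allocates a new vector
copied .+= 1             # updates only the copy, `a` is untouched
after_copy = copy(a)     # snapshot of `a` after updating the copy

v = view(a, mask)        # a view shares memory with `a`
v .+= 1                  # updates the selected elements of `a` in place
after_view = copy(a)     # snapshot of `a` after updating the view
```

Here `after_copy` is still `[1, 2, 3, 4]`, while `after_view` is `[1, 2, 4, 5]`.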

Your code is not fully reproducible, so I cannot test it. In several hours (hopefully) I will release a new version of the DataFrames.jl package under which your old code will work as expected.

EDIT: under DataFrames.jl v0.19 your old code should just work.
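
For reference, a small reproducible sketch of the counting loop. It assumes DataFrames.jl v0.19 or later is installed; the sample `pmids`, `features` and the dataframe contents are hypothetical stand-ins for your data, and the zero-filled columns are written out by hand instead of using your `initialize_df` helper, which I do not have.

```julia
using DataFrames

# Hypothetical sample data standing in for the question's inputs
pmids    = [10, 10, 20, 10]
features = ["a", "b", "a", "a"]

# One row per pmid, one zero-initialized column per unique feature
df = DataFrame(pmid = [10, 20], a = [0, 0], b = [0, 0])

for i in 1:length(features)
    rows = df.pmid .== pmids[i]           # rows matching this pmid
    df[rows, Symbol(features[i])] .+= 1   # increment the feature's count in place
end
```

With this sample data, column `a` ends up as `[2, 1]` and column `b` as `[1, 0]`.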

Upvotes: 2
