Julia - DataFrame advanced merging

Question

I'm assembling data from multiple sources, specifically reactions and reaction formulas.

Some sources have both the reaction name and the formula, while other sources may have only the formula. As an example, see rows 2 and 3 in the table below.

If I have a DataFrame with the following:

│ Row │ reaction │ formula │
├─────┼──────────┼─────────┤
│ 1   │   "a"    │    1    │
│ 2   │   "b"    │    2    │
│ 3   │   ""     │    2    │
│ 4   │   "c"    │    3    │

As the table suggests, rows 2 and 3 have the same reaction formula, but only row 2 has the reaction name. What I'd like to do is remove rows that have a formula but no name whenever another row with the same formula already has the reaction name.

That is, among rows that are duplicates with respect to column 2 (formula), keep the row whose reaction name is not empty and remove the others, so as to get:

│ Row │ reaction │ formula │
├─────┼──────────┼─────────┤
│ 1   │   "a"    │    1    │
│ 2   │   "b"    │    2    │
│ 3   │   "c"    │    3    │
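
One possible way to express this filtering, as a minimal sketch rather than a definitive solution: collect the formulas that appear with a non-empty name, then drop unnamed rows whose formula is in that set. It assumes DataFrames.jl is installed, that a missing name is stored as the empty string `""` (as in the table above), and that the columns are called `reaction` and `formula`. The variable names `named` and `result` are only illustrative.

```julia
using DataFrames

# Example data matching the table in the question.
df = DataFrame(reaction = ["a", "b", "", "c"], formula = [1, 2, 2, 3])

# Formulas that occur with a non-empty reaction name somewhere in the table.
named = Set(df.formula[df.reaction .!= ""])

# Keep a row if it has a name, or if no named row shares its formula.
result = filter(row -> row.reaction != "" || !(row.formula in named), df)
```

With the example data, `result` contains the rows `("a", 1)`, `("b", 2)` and `("c", 3)`. Unnamed rows whose formula never appears with a name are kept, so no formula is lost. If the same unnamed formula can occur several times, a further `unique(result)` would collapse those exact duplicates.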

Answers (1)