bensw

Reputation: 3028

Negation of condition when subsetting in Julia

I want to subset a DataFrame in Julia. I have a DataFrame named "df" and a DataArrays.DataArray{String,1} named "brokenDf", which contains the serial IDs that I want to remove from "df".

The closest I have got is with findin:

df[findin(df[:serial],brokenDf),:];

but I don't know how to invert the selection afterwards, or whether Julia has a NOT IN operation, something that would work like a findNOTin().

Any suggestions will be appreciated.

Upvotes: 1

Views: 1469

Answers (3)

WebDev

Reputation: 1371

A solution using an array comprehension would be:

df = df[[!(i in brokenDf) for i in df.serial], :]

which will give you the filtered DataFrame where df.serial is not in brokenDf.
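
As a rough sketch of the same idea on plain vectors (assuming Julia 1.0 or later; `serial` and `broken` below are made-up stand-ins for df.serial and brokenDf), the mask can also be built by broadcasting `in`:

```julia
# Hypothetical sample data standing in for df.serial and brokenDf
serial = ["a1", "b2", "c3", "d4"]
broken = ["b2", "d4"]

# Comprehension form, as in the answer above
mask_comprehension = [!(s in broken) for s in serial]

# Broadcast form: Ref(broken) makes `in` treat `broken` as one collection
# instead of broadcasting over its elements
mask_broadcast = .!in.(serial, Ref(broken))

# Either mask can be used to index the vector (or the rows of a DataFrame)
kept = serial[mask_broadcast]
```

Both masks are ordinary Boolean vectors, so either one should be usable as the row selector in df[mask, :].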

Upvotes: 0

niczky12

Reputation: 5063

One solution would be to use map() to create a Bool array and use it to subset the rows of your DataFrame:

using DataFrames
df = DataFrame(serial = 1:6, B = ["M", "F", "F", "M", "N", "N"]);

broken = [1,2,5];

df[DataArray{Bool}(map(x -> !(x in broken), df[:serial])),:]

The output is:

3×2 DataFrames.DataFrame
│ Row │ serial │ B   │
├─────┼────────┼─────┤
│ 1   │ 3      │ "F" │
│ 2   │ 4      │ "M" │
│ 3   │ 6      │ "N" │

Note that ! negates your boolean condition, so !true == false.

Upvotes: 1

merch

Reputation: 945

The code below should do what you want:

using DataFrames
df = DataFrame(A = 1:6, B = ["M", "F", "F", "M", "N", "N"]);

# Rows where B .== "M"
f1 = find(df[:, 2] .== "M");

# Rows where B is not "M"
f2 = find(df[:, 2] .!= "M");

# Rows where B is not "M" and is not "F"
# (wrapped in find so that f3 holds row indices, like f1 and f2)
f3 = find(reduce(&, (df[:, 2] .!= "F", df[:, 2] .!= "M")));

The latter can be automated by writing a function:

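# Define function
# x: column vector; conditions: 1×n row matrix (e.g. ["M" "F"]), so that
# x .!= conditions broadcasts to a length(x)×n matrix
function find_is_not(x, conditions)
    temp = sum(x .!= conditions, 2);          # per row: how many conditions differ
    res  = find(temp .== length(conditions)); # rows that differ from every condition
    return res;
end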

# Rows where B is not "M" and is not "F" (with find_is_not)
f4 = find_is_not(df[:, 2], ["M" "F"]);
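
On newer Julia versions (an assumption here: 1.0 or later, where find was replaced by findall), a similar helper might look like the sketch below. The name find_is_not is carried over from the function above, and the vector B simply mirrors column B of the example DataFrame.

```julia
# Sketch for Julia >= 1.0: indices of the elements of x
# that are not contained in `excluded`
function find_is_not(x, excluded)
    return findall(v -> v ∉ excluded, x)
end

# Mirrors column B of the example DataFrame above
B = ["M", "F", "F", "M", "N", "N"]

# Indices of the entries that are neither "M" nor "F"
f4 = find_is_not(B, ["M", "F"])
kept = B[f4]
```

This variant takes an ordinary vector of excluded values rather than a row matrix, and the returned indices can be used to select rows, e.g. df[f4, :].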

Upvotes: 1
