Reputation: 3028
I want to subset a DataFrame in Julia. I have a DataFrame "df" and a DataArrays.DataArray{String,1} named "brokenDf" that contains the serial IDs I want to remove from it.
The closest I've got is "findin":
df[findin(df[:serial],brokenDf),:];
but I don't know how to invert the selection afterwards, or whether Julia has a NOT IN command, something that would work like findNOTin().
Any suggestions will be appreciated.
Upvotes: 1
Views: 1469
Reputation: 1371
A solution using list comprehension would be:
df = df[[!(i in brokenDf) for i in df.serial], :]
which gives you the filtered DataFrame, keeping only the rows where df.serial is not in brokenDf.
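As a rough illustration of the same idea on plain vectors, the sketch below builds the "not in" mask in three equivalent ways. The sample values for serial and broken are made up for the example, and it assumes Julia 1.x. The resulting mask could then be used as df[mask, :].

```julia
# Hypothetical sample data standing in for df.serial and brokenDf
serial = collect(1:6)
broken = [1, 2, 5]

# Comprehension: true for every serial that is NOT in broken
mask = [!(s in broken) for s in serial]

# Broadcast form: Ref(broken) keeps broken from being broadcast over
mask_bc = .!in.(serial, Ref(broken))

# Same thing with the "not in" operator
mask_notin = serial .∉ Ref(broken)

# Indexing with the mask keeps only the unmatched serials
kept = serial[mask]
```

For a long list of broken serials, wrapping it in Set(broken) before the membership test should make each lookup cheaper.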
Upvotes: 0
Reputation: 5063
One solution would be to use map() to create a Bool array and use it to subset the rows of your dataframe:
using DataFrames
df = DataFrame(serial = 1:6, B = ["M", "F", "F", "M", "N", "N"]);
broken = [1,2,5];
df[DataArray{Bool}(map(x -> !(x in broken), df[:serial])),:]
The output is:
3×2 DataFrames.DataFrame
│ Row │ serial │ B │
├─────┼────────┼─────┤
│ 1 │ 3 │ "F" │
│ 2 │ 4 │ "M" │
│ 3 │ 6 │ "N" │
Note that ! negates your boolean condition, so !true == false.
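The snippet above uses the older df[:serial] and DataArray syntax. A minimal sketch of the same subsetting, assuming Julia 1.x with DataFrames 1.x (the names kept_mask and kept_filter are only for illustration), might look like this:

```julia
using DataFrames

df = DataFrame(serial = 1:6, B = ["M", "F", "F", "M", "N", "N"])
broken = [1, 2, 5]

# Broadcast `in` over the column, negate it, and index rows with the mask
kept_mask = df[.!in.(df.serial, Ref(broken)), :]

# Same result with filter, which applies the predicate to each serial value
kept_filter = filter(:serial => s -> !(s in broken), df)
```

Both forms return a new DataFrame and leave df unchanged.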
Upvotes: 1
Reputation: 945
The below should do what you want:
using DataFrames
df = DataFrame(A = 1:6, B = ["M", "F", "F", "M", "N", "N"]);
# Rows where B .== "M"
f1 = find(df[:, 2] .== "M");
# Rows where B is not "M"
f2 = find(df[:, 2] .!= "M");
# Boolean mask of rows where B is neither "M" nor "F" (wrap in find() to get indices)
f3 = reduce(&, (df[:, 2] .!= "F", df[:, 2] .!= "M"));
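The latter can be automated by writing a function:
# Define function; conditions must be a 1×n row (e.g. ["M" "F"]) so that
# x .!= conditions broadcasts to a length(x)×n matrix
function find_is_not(x, conditions)
    # Count, for each element of x, how many conditions it differs from
    temp = sum(x .!= conditions, 2);
    # Keep the indices of elements that differ from every condition
    res = find(temp .== length(conditions));
    return res;
end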
# Rows where B is not "M" and is not "F" (with find_is_not)
f4 = find_is_not(df[:, 2], ["M" "F"]);
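In Julia 1.x, find was renamed to findall, so the code above will not run unchanged on current versions. A possible equivalent, written as a sketch on a plain vector B that stands in for df[:, 2], is:

```julia
# Stand-in for the B column of the example DataFrame
B = ["M", "F", "F", "M", "N", "N"]

# Indices where B is not "M"
f2 = findall(B .!= "M")

# Indices of elements of x that match none of the excluded values;
# here excluded can be any collection, not only a 1×n row
function find_is_not(x, excluded)
    return findall(v -> !(v in excluded), x)
end

f4 = find_is_not(B, ["M", "F"])
```

This version tests membership directly instead of summing comparisons, so it does not depend on the shape of the second argument.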
Upvotes: 1