Reputation: 1377
I have data like this in a text file:
CLASS col2 col3 ...
1 ... ... ...
1 ... ... ...
2 ... ... ...
2 ... ... ...
2 ... ... ...
I load them using the following code:
data = readdlm("file.txt")[2:end, :] # without header line
Now I would like to get an array containing only the rows from class 1.
(The data could be loaded with a different function if that would help.)
Upvotes: 7
Views: 8981
Reputation: 31342
Logical indexing is the straightforward way to filter an array:
data[data[:,1] .== 1, :]
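To spell out what that one-liner does, here is a minimal, self-contained sketch. The sample text is a hypothetical stand-in for file.txt (the real columns are elided in the question), and it assumes Julia 0.7 or later, where readdlm lives in the DelimitedFiles standard library:

```julia
using DelimitedFiles  # standard library; provides readdlm on Julia 0.7 and later

# Hypothetical sample standing in for file.txt
sample = """
CLASS col2 col3
1 10 a
1 20 b
2 30 c
2 40 d
2 50 e
"""

# header=true returns the data rows and the header row separately,
# so there is no need to slice off the first line by hand
data, header = readdlm(IOBuffer(sample); header=true)

mask = data[:, 1] .== 1   # elementwise comparison: one Bool per row
class1 = data[mask, :]    # keep only the rows where the mask is true
```

The mask is an ordinary Boolean vector, so conditions can be combined elementwise before indexing, for example `(data[:, 1] .== 1) .| (data[:, 1] .== 2)`.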
If, though, you read your file in as a data frame, you'll have more options available to you, and it'll keep track of your headers:
julia> using DataFrames
julia> df = readtable("file.txt", separator=' ')
5×4 DataFrames.DataFrame
│ Row │ CLASS │ col2 │ col3 │ _ │
├─────┼───────┼───────┼───────┼───────┤
│ 1 │ 1 │ "..." │ "..." │ "..." │
│ 2 │ 1 │ "..." │ "..." │ "..." │
│ 3 │ 2 │ "..." │ "..." │ "..." │
│ 4 │ 2 │ "..." │ "..." │ "..." │
│ 5 │ 2 │ "..." │ "..." │ "..." │
julia> df[df[:CLASS] .== 1, :] # Refer to the column by its header name
2×4 DataFrames.DataFrame
│ Row │ CLASS │ col2 │ col3 │ _ │
├─────┼───────┼───────┼───────┼───────┤
│ 1 │ 1 │ "..." │ "..." │ "..." │
│ 2 │ 1 │ "..." │ "..." │ "..." │
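Note that the session above reflects an older DataFrames release: readtable and the df[:CLASS] indexing syntax have since been removed. A rough equivalent for current versions might look like the sketch below. It assumes DataFrames 1.x and builds a small hypothetical table in place of the file; loading from disk is now typically done with the separate CSV package.

```julia
using DataFrames  # third-party package; this sketch assumes DataFrames 1.x

# Hypothetical data standing in for the contents of file.txt
df = DataFrame(CLASS = [1, 1, 2, 2, 2], col2 = [10, 20, 30, 40, 50])

a = df[df.CLASS .== 1, :]               # logical indexing; column accessed as a property
b = filter(:CLASS => ==(1), df)         # predicate is applied to each value in the column
c = subset(df, :CLASS => ByRow(==(1)))  # same filter, with row-wise application made explicit
```

All three forms return a new data frame with the two class 1 rows; which one to use is mostly a matter of taste.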
The DataFramesMeta package (along with other packages under active development) offers further tools that aim to make this simpler. You can use its @where macro to do SQL-style filtering:
julia> using DataFramesMeta
julia> @where(df, :CLASS .== 1)
2×4 DataFrames.DataFrame
│ Row │ CLASS │ col2 │ col3 │ _ │
├─────┼───────┼───────┼───────┼───────┤
│ 1 │ 1 │ "..." │ "..." │ "..." │
│ 2 │ 1 │ "..." │ "..." │ "..." │
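In more recent DataFramesMeta releases, @where has been renamed to @subset. A minimal sketch of the same filter, assuming a current DataFramesMeta and using a small hypothetical table in place of the file:

```julia
using DataFrames, DataFramesMeta  # third-party packages; assumes recent releases

# Hypothetical data standing in for the contents of file.txt
df = DataFrame(CLASS = [1, 1, 2, 2, 2], col2 = [10, 20, 30, 40, 50])

whole_column = @subset(df, :CLASS .== 1)  # condition written over the whole column, hence .==
row_wise = @rsubset(df, :CLASS == 1)      # row-wise variant: condition written for a single row
```

Both calls should give the same two-row result; @rsubset just saves the broadcasting dots when the condition is naturally expressed per row.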
Upvotes: 11