Paul James

Reputation: 530

DataArray case-insensitive match that returns the index value of the match

I have a DataFrame inside of a function:

using DataFrames

myservs = DataFrame(serverName = ["elmo", "bigBird", "Oscar", "gRover", "BERT"],
                    ipAddress = ["12.345.6.7", "12.345.6.8", "12.345.6.9", "12.345.6.10", "12.345.6.11"])
myservs
5x2 DataFrame
| Row | serverName | ipAddress     |
|-----|------------|---------------|
| 1   | "elmo"     | "12.345.6.7"  |
| 2   | "bigBird"  | "12.345.6.8"  |
| 3   | "Oscar"    | "12.345.6.9"  |
| 4   | "gRover"   | "12.345.6.10" |
| 5   | "BERT"     | "12.345.6.11" |

How can I write the function so that it takes a single parameter called server, matches it case-insensitively against the myservs[:serverName] DataArray, and returns the matching row's ipAddress?

In R this can be done by using

myservs$ipAddress[grep(server, myservs$serverName, ignore.case = TRUE)]

I don't want it to matter if someone uses ElMo or Elmo as the server, or if the serverName is saved as elmo or ELMO.

Upvotes: 2

Views: 128

Answers (2)

Paul James

Reputation: 530

I started from how I would accomplish the task in R and tried to reproduce it with the DataFrames pkg, but only because I'm coming from R and am just learning Julia. I asked my coworkers a lot of questions, and the following is what we came up with:

This task is much cleaner if I stop thinking in terms of R vectors. Julia runs plenty fast when iterating through a loop.
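To illustrate the loop approach, here is a minimal sketch, assuming current Julia (1.x) and two plain vectors in place of the DataFrame columns. The names find_ip, servernames and ips are hypothetical, made up for this example.

```julia
# Hypothetical helper: scan two parallel vectors and return the IP of the
# first name that matches case-insensitively, or nothing if none does.
function find_ip(servernames, ips, server)
    target = lowercase(server)
    for (name, ip) in zip(servernames, ips)
        if lowercase(name) == target
            return ip
        end
    end
    return nothing  # no match found
end

servernames = ["elmo", "bigBird", "Oscar", "gRover", "BERT"]
ips = ["12.345.6.7", "12.345.6.8", "12.345.6.9", "12.345.6.10", "12.345.6.11"]

find_ip(servernames, ips, "BigBird")  # returns "12.345.6.8"
```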

Even still, looping wouldn't be the best solution here. I was told to look into Dicts (check here for an example). Dict(), zip(), haskey(), and get() blew my mind. These have many applications.
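For anyone else new to these, a small sketch of how the four functions fit together, assuming current Julia (1.x). The variable names are hypothetical.

```julia
servernames = ["elmo", "bigBird", "Oscar"]
ips = ["12.345.6.7", "12.345.6.8", "12.345.6.9"]

# zip pairs each lowercased name with its IP; Dict turns the pairs
# into a lookup table keyed by the lowercase name.
lookup = Dict(zip(map(lowercase, servernames), ips))

haskey(lookup, "bigbird")                      # true
haskey(lookup, "bigBird")                      # false: keys are stored in lowercase
get(lookup, lowercase("OSCAR"), "not found")   # "12.345.6.9"
get(lookup, "slimey", "not found")             # "not found" (the default)
```

Lowercasing both the stored keys and the query is what makes the match case-insensitive; the Dict itself compares keys exactly.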

My solution doesn't even need the DataFrames pkg; it uses Julia's built-in Matrix and Array types instead. Wrapping everything in a let block keeps the global environment free of clutter, and the server name/IP list stays hidden from anyone who only runs the function.

In the sample code I'm recreating the server matrix every time, but in practice I'll have a permission-restricted delimited file that gets read every time. This is OK for now since the delimited files are small, but it may not be the most efficient way to do it.

# ONLY ALLOW THE FUNCTION TO BE SEEN IN THE GLOBAL ENVIRONMENT
let
  global myIP

  # SERVER MATRIX
  myservers = ["elmo" "12.345.6.7"; "bigBird" "12.345.6.8";
               "Oscar" "12.345.6.9"; "gRover" "12.345.6.10";
               "BERT" "12.345.6.11"]

  # SERVER DICT
  servDict = Dict(zip(map(lowercase, myservers[:, 1]), myservers[:, 2]))

  # GET SERVER IP FUNCTION: INPUT = SERVER NAME; OUTPUT = IP ADDRESS
  function myIP(servername)
    sn = lowercase(servername)
    get(servDict, sn, "That name isn't in the server list.")
  end
end

# Test it out
myIP("SLIMEY")
#> "That name isn't in the server list."

myIP("elMo")
#> "12.345.6.7"
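For the delimited-file case mentioned above, a rough sketch of building the same kind of Dict from a file, assuming current Julia (1.x). The "name,ip" comma-separated layout, the load_servers name, and the temporary file are all assumptions for illustration; the real file format isn't shown here.

```julia
# Hypothetical loader: read "name,ip" lines into a Dict keyed by lowercase name.
function load_servers(path)
    table = Dict{String,String}()
    for line in eachline(path)
        isempty(strip(line)) && continue   # skip blank lines
        name, ip = split(line, ',')
        table[lowercase(strip(name))] = String(strip(ip))
    end
    return table
end

# Demo with a temporary file standing in for the real, permission-restricted one.
path, io = mktemp()
write(io, "elmo,12.345.6.7\nbigBird,12.345.6.8\n")
close(io)

servers = load_servers(path)
get(servers, lowercase("BIGBIRD"), "That name isn't in the server list.")  # "12.345.6.8"
```

Calling load_servers once and keeping the resulting Dict around would avoid rereading the file on every lookup, if that ever becomes a concern.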

Upvotes: 3

rickhg12hs

Reputation: 11932

Here's one way:

julia> using DataFrames

julia> myservs = DataFrame(serverName = ["elmo", "bigBird", "Oscar", "gRover", "BERT"],
                           ipAddress = ["12.345.6.7", "12.345.6.8", "12.345.6.9", "12.345.6.10", "12.345.6.11"])
5x2 DataFrames.DataFrame
| Row | serverName | ipAddress     |
|-----|------------|---------------|
| 1   | "elmo"     | "12.345.6.7"  |
| 2   | "bigBird"  | "12.345.6.8"  |
| 3   | "Oscar"    | "12.345.6.9"  |
| 4   | "gRover"   | "12.345.6.10" |
| 5   | "BERT"     | "12.345.6.11" |

julia> grep{T <: String}(pat::String, dat::DataArray{T}, opts::String = "") = Bool[isna(d) ? false : ismatch(Regex(pat, opts), d) for d in dat]
grep (generic function with 2 methods)

julia> myservs[:ipAddress][grep("bigbird", myservs[:serverName], "i")]
1-element DataArrays.DataArray{ASCIIString,1}:
 "12.345.6.8"

EDIT

This version of grep runs faster on my platform, probably because it builds the Regex once instead of once per element.

julia> function grep{T <: String}(pat::String, dat::DataArray{T}, opts::String = "")
           myreg = Regex(pat, opts)
           return convert(Array{Bool}, map(d -> isna(d) ? false : ismatch(myreg, d), dat))
       end
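The code above uses the Julia syntax and DataArrays types of the time. On current Julia (1.x), ismatch has been replaced by occursin and NA values by missing, so a comparable helper might look like the sketch below. It is only an approximation using Base and plain vectors, not the DataFrames API, and the sample data (including the missing entry) is made up for illustration.

```julia
# Hypothetical modern counterpart of the grep helper above, using Base only.
# Returns a Vector{Bool} marking which elements match the pattern.
function grep(pat::AbstractString, dat::AbstractVector, opts::AbstractString = "")
    re = Regex(pat, opts)   # compile once; opts = "i" makes it case-insensitive
    return [!ismissing(d) && occursin(re, d) for d in dat]
end

servernames = ["elmo", "bigBird", missing, "gRover", "BERT"]
ips = ["12.345.6.7", "12.345.6.8", "12.345.6.9", "12.345.6.10", "12.345.6.11"]

ips[grep("bigbird", servernames, "i")]   # ["12.345.6.8"]
```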

Upvotes: 2
