Reputation: 530
I have a DataFrame inside of a function:
using DataFrames
myservs = DataFrame(serverName = ["elmo", "bigBird", "Oscar", "gRover", "BERT"],
ipAddress = ["12.345.6.7", "12.345.6.8", "12.345.6.9", "12.345.6.10", "12.345.6.11"])
myservs
5x2 DataFrame
| Row | serverName | ipAddress |
|-----|------------|---------------|
| 1 | "elmo" | "12.345.6.7" |
| 2 | "bigBird" | "12.345.6.8" |
| 3 | "Oscar" | "12.345.6.9" |
| 4 | "gRover" | "12.345.6.10" |
| 5 | "BERT" | "12.345.6.11" |
How can I write the function to take a single parameter called server, do a case-insensitive match of the server parameter against the myservs[:serverName] DataArray, and return the match's corresponding ipAddress?
In R this can be done by using
myservs$ipAddress[grep(server, myservs$serverName, ignore.case = TRUE)]
I don't want it to matter if someone uses ElMo or Elmo as the server, or if the serverName is saved as elmo or ELMO.
Upvotes: 2
Views: 128
Reputation: 530
I referenced how to accomplish the task in R and tried to do it using the DataFrames pkg, but I only did this because I'm coming from R and am just learning Julia. I asked my coworkers a lot of questions, and the following is what we came up with:
This task is much cleaner if I stop thinking in terms of vectors, as in R. Julia runs plenty fast iterating through a loop.
Even so, looping wouldn't be the best solution here. I was told to look into Dicts (check here for an example). Dict(), zip(), haskey(), and get() blew my mind. These have many applications.
My solution doesn't even need the DataFrames pkg; it uses Julia's Matrix and Array data representations instead. By using let, we keep the global environment clutter free, and the server name/IP list stays hidden from those who are only running the function.
In the sample code, I'm recreating the server matrix every time, but in practice I'll have a permission-restricted delimited file that gets read every time. This is OK for now since the delimited files are small, but it may not be the most efficient or best way to do it.
# ONLY ALLOW THE FUNCTION TO BE SEEN IN THE GLOBAL ENVIRONMENT
let
    global myIP
    # SERVER MATRIX
    myservers = ["elmo" "12.345.6.7"; "bigBird" "12.345.6.8";
                 "Oscar" "12.345.6.9"; "gRover" "12.345.6.10";
                 "BERT" "12.345.6.11"]
    # SERVER DICT (KEYS LOWERCASED; map IS ENOUGH HERE, pmap WOULD RUN IT IN PARALLEL)
    servDict = Dict(zip(map(lowercase, myservers[:, 1]), myservers[:, 2]))
    # GET SERVER IP FUNCTION: INPUT = SERVER NAME; OUTPUT = IP ADDRESS
    function myIP(servername)
        sn = lowercase(servername)
        get(servDict, sn, "That name isn't in the server list.")
    end
end
# Test it out
myIP("SLIMEY")
#>"That name isn't in the server list."
myIP("elMo")
#>"12.345.6.7"
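For the delimited-file case mentioned above, here is a minimal sketch of how the Dict could be built from a file instead of a hard-coded matrix. It assumes a comma-delimited file with one name,ip pair per line; that layout and the helper names load_servers and lookup_ip are hypothetical, not part of the original solution. A temporary file stands in for the real, permission-restricted one.

```julia
# Hypothetical helper: build a lowercase-name => IP lookup from a
# comma-delimited file with one "name,ip" pair per line (assumed layout).
function load_servers(path::AbstractString)
    servers = Dict{String,String}()
    for line in eachline(path)
        isempty(strip(line)) && continue          # skip blank lines
        name, ip = split(line, ',')
        servers[lowercase(strip(name))] = String(strip(ip))
    end
    return servers
end

# Hypothetical helper: case-insensitive lookup with a fallback message.
function lookup_ip(servers::Dict{String,String}, servername::AbstractString)
    return get(servers, lowercase(servername), "That name isn't in the server list.")
end

# Demo: a temporary file stands in for the real server list.
path, io = mktemp()
write(io, "elmo,12.345.6.7\nbigBird,12.345.6.8\nOscar,12.345.6.9\n")
close(io)

servers = load_servers(path)
haskey(servers, "bigbird")     # keys are stored lowercased
lookup_ip(servers, "ElMo")     # looks up "elmo"
lookup_ip(servers, "SLIMEY")   # falls back to the message
```

Reading the file once into a Dict and reusing it avoids re-reading on every call; whether that is acceptable depends on how often the file changes.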
Upvotes: 3
Reputation: 11932
Here's one way:
julia> using DataFrames
julia> myservs = DataFrame(serverName = ["elmo", "bigBird", "Oscar", "gRover", "BERT"],
ipAddress = ["12.345.6.7", "12.345.6.8", "12.345.6.9", "12.345.6.10", "12.345.6.11"])
5x2 DataFrames.DataFrame
| Row | serverName | ipAddress |
|-----|------------|---------------|
| 1 | "elmo" | "12.345.6.7" |
| 2 | "bigBird" | "12.345.6.8" |
| 3 | "Oscar" | "12.345.6.9" |
| 4 | "gRover" | "12.345.6.10" |
| 5 | "BERT" | "12.345.6.11" |
julia> grep{T <: String}(pat::String, dat::DataArray{T}, opts::String = "") = Bool[isna(d) ? false : ismatch(Regex(pat, opts), d) for d in dat]
grep (generic function with 2 methods)
julia> myservs[:ipAddress][grep("bigbird", myservs[:serverName], "i")]
1-element DataArrays.DataArray{ASCIIString,1}:
"12.345.6.8"
EDIT
This grep works faster on my platform.
julia> function grep{T <: String}(pat::String, dat::DataArray{T}, opts::String = "")
           myreg = Regex(pat, opts)
           return convert(Array{Bool}, map(d -> isna(d) ? false : ismatch(myreg, d), dat))
       end
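Both versions above are written for an older Julia and DataArrays. On Julia 1.x, ismatch has been replaced by occursin, and NA/isna by missing/ismissing. The following is a rough sketch of the same idea under those assumptions, using plain vectors in place of the DataFrame columns so that it runs without any packages. The names grepmask and server_ip are hypothetical.

```julia
# Assumes Julia 1.x. Plain vectors stand in for the DataFrame columns.
server_names = ["elmo", "bigBird", "Oscar", "gRover", "BERT"]
ip_addresses = ["12.345.6.7", "12.345.6.8", "12.345.6.9", "12.345.6.10", "12.345.6.11"]

# Hypothetical name: Bool mask of the entries that match the regex.
# Missing entries never match.
function grepmask(pat::AbstractString, dat::AbstractVector, opts::AbstractString = "")
    re = Regex(pat, opts)                         # compile the regex once
    return Bool[!ismissing(d) && occursin(re, d) for d in dat]
end

# Same call shape as the answer: "i" makes the match case-insensitive.
matches = ip_addresses[grepmask("bigbird", server_names, "i")]

# Hypothetical alternative: exact, case-insensitive comparison.
# A regex matches substrings, so this is stricter than grepmask.
function server_ip(server::AbstractString, names::AbstractVector, ips::AbstractVector)
    idx = findfirst(n -> !ismissing(n) && lowercase(n) == lowercase(server), names)
    return idx === nothing ? nothing : ips[idx]
end

server_ip("bert", server_names, ip_addresses)
```

The regex version returns every matching row, while server_ip returns only the first exact match or nothing; which is preferable depends on whether partial server names should be accepted.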
Upvotes: 2