Reputation: 17381
I'm developing a simple optimized JSON function. Lua uses tables to represent arrays but in JSON I need to recognize between them. The code below is used:
t={
a="hi",
b=100
}
function table2json(t,formatted)
if type(t)~="table" then return nil,"Parameter is not a table. It is: "..type(t) end
local ret=""--return value
local lvl=0 --indentation level
local INDENT=" " --OPTION: the characters put in front of every line for indentation
function addToRet(str) if formatted then ret=ret..string.rep(INDENT,lvl)..str.."\n" else ret=ret..str end end
addToRet("{")
lvl=1
for k,v in pairs(t) do
local typeof=type(v)
if typeof=="string" then
addToRet(k..":\""..v.."\"")
elseif typeof=="number" then
addToRet(k..":"..v)
end
end
lvl=0
addToRet("}")
return ret
end
print(table2json(t,true))
As you can see in JSON reference an object
is what is called a table
in Lua and it's different from an array
.
The question is how I can detect if a table is being used as an array?
Any simpler/smarter solution?
Upvotes: 24
Views: 28261
Reputation: 43
At least in luajit 2.1.0-beta3 (where I tested it), I reliably get sorted iteration with pairs() for numerical indices, therefore this should also work and could be a bit faster than https://stackoverflow.com/a/25709704/7787852
Even if the Lua reference manual says explicitly that the iteration order of pairs
cannot be relied upon.
local function is_array(t)
local prev = 0
for k in pairs(t) do
if k ~= prev + 1 then
return false
end
prev = prev + 1
end
return true
end
Upvotes: -1
Reputation: 611
You can simply test this (assuming t
is a table):
function isarray(t)
return #t > 0 and next(t, #t) == nil
end
print(isarray{}) --> false
print(isarray{1, 2, 3}) --> true
print(isarray{a = 1, b = 2, c = 3}) --> false
print(isarray{1, 2, 3, a = 1, b = 2, c = 3}) --> false
print(isarray{1, 2, 3, nil, 5}) --> true
It tests if there is any value in the "array part" of the table, then checks if there is any value after that part, by using next
with the last consecutive numeric index.
Note that Lua does some logic to decide when to use this "array part" and the "hash part" of the table. That's why in the last example the provided table is detected as an array: it is dense enough to be considered an array despite the nil
in the middle, or in other words, it is not sparse enough. Just as another answer here mentions, this is very useful in the context of data serialization, and you don't have to program it for yourself, you can use Lua underlying logic. If you would serialize that last example, you could use for i = 1, #t do ... end
instead of using ipairs
.
From my observation in Lua and LuaJIT implementation, the function next
always looks up the array part of the table first, so any non array index will be found after the whole array part, even though after that it doesn't follow any particular order. I'm not sure if this is a consistent behaviour across different Lua versions, though.
Also, it's up to you to decide if empty tables should be treated as arrays as well. In this implementation, they are not treated as arrays. You could change it to return next(t) == nil or (#t > 0 and next(t, #t) == nil)
to do the opposite.
Anyway, I guess this is the shortest you can get in terms of code lines and complexity, since it is lower bounded by next
(which I believe is either O(1) or O(logn)).
Upvotes: 4
Reputation: 3851
This is not pretty, and depending on how large and cleverly-deceptive the table is, it might be slow, but in my tests it works in each of these cases:
empty table
array of numbers
array with repeating numbers
letter keys with number values
mixed array/non-array
sparse array (gaps in index sequence)
table of doubles
table with doubles as keys
function isarray(tableT)
--has to be a table in the first place of course
if type(tableT) ~= "table" then return false end
--not sure exactly what this does but piFace wrote it and it catches most cases all by itself
local piFaceTest = #tableT > 0 and next(tableT, #tableT) == nil
if piFaceTest == false then return false end
--must have a value for 1 to be an array
if tableT[1] == nil then return false end
--all keys must be integers from 1 to #tableT for this to be an array
for k, v in pairs(tableT) do
if type(k) ~= "number" or (k > #tableT) or(k < 1) or math.floor(k) ~= k then return false end
end
--every numerical key except the last must have a key one greater
for k,v in ipairs(tableT) do
if tonumber(k) ~= nil and k ~= #tableT then
if tableT[k+1] == nil then
return false
end
end
end
--otherwise we probably got ourselves an array
return true
end
Much credit to PiFace and Houshalter, whose code I based most of this on.
Upvotes: 0
Reputation: 79
Here is more simplest check based on Lua specific #len function mechanism.
function is_array(table)
if type(table) ~= 'table' then
return false
end
-- objects always return empty size
if #table > 0 then
return true
end
-- only object can have empty length with elements inside
for k, v in pairs(table) do
return false
end
-- if no elements it can be array and not at same time
return true
end
local a = {} -- true
local b = { 1, 2, 3 } -- true
local c = { a = 1, b = 1, c = 1 } -- false
Upvotes: 7
Reputation: 52641
The simplest algorithm to differentiate between arrays/non-arrays is this one:
local function is_array(t)
local i = 0
for _ in pairs(t) do
i = i + 1
if t[i] == nil then return false end
end
return true
end
Explanation here: https://web.archive.org/web/20140227143701/http://ericjmritz.name/2014/02/26/lua-is_array/
That said, you will still have issues with empty tables - are they "arrays" or "hashes"?
For the particular case of serializing json, what I do is marking arrays with a field in their metatable.
-- use this when deserializing
local function mark_as_array(t)
setmetatable(t, {__isarray = true})
end
-- use this when serializing
local function is_array(t)
local mt = getmetatable(t)
return mt.__isarray
end
Upvotes: 8
Reputation: 2788
I wrote this function for pretty printing lua tables, and had to solve the same problem. None of the solutions here account for edge cases like some keys being numbers but others not. This tests every index to see if it's compatible with being an array.
function pp(thing)
if type(thing) == "table" then
local strTable = {}
local iTable = {}
local iterable = true
for k, v in pairs(thing) do
--if the key is a string, we don't need to do "[key]"
local key = (((not (type(k) == "string")) and "["..pp(k).."]") or k)
--this tests if the index is compatible with being an array
if (not (type(k) == "number")) or (k > #thing) or(k < 1) or not (math.floor(k) == k) then
iterable = false
end
local val = pp(v)
if iterable then iTable[k] = val end
table.insert(strTable, (key.."="..val))
end
if iterable then strTable = iTable end
return string.format("{%s}", table.concat(strTable,","))
elseif type(thing) == "string" then
return '"'..thing..'"'
else
return tostring(thing)
end
end
Upvotes: 0
Reputation: 6632
@AlexStack
if not t[i] and type(t[i])~="nil" then return false end
This code is wrong, if fails when one of the elemets is false
.
> return isArray({"one", "two"})
true
> return isArray({false, true})
false
I think the whole expression can be changed to type(t[i]) == nil
but it still will fail in some scenarios because it will not support nil values.
A good way to go, I think, is trying with ipairs
or checking whether #t
is equal to count
, but #t
returns 0 with objects and count
will be zero with empty arrays, so it may need an extra check at the beginning of the function, something like: if not next(t) then return true
.
As a sidenote, I'm pasting another implementation, found in lua-cjson (by Mark Pulford):
-- Determine with a Lua table can be treated as an array.
-- Explicitly returns "not an array" for very sparse arrays.
-- Returns:
-- -1 Not an array
-- 0 Empty table
-- >0 Highest index in the array
local function is_array(table)
local max = 0
local count = 0
for k, v in pairs(table) do
if type(k) == "number" then
if k > max then max = k end
count = count + 1
else
return -1
end
end
if max > count * 2 then
return -1
end
return max
end
Upvotes: 4
Reputation: 17381
Thanks. I developed the following code and it works:
---Checks if a table is used as an array. That is: the keys start with one and are sequential numbers
-- @param t table
-- @return nil,error string if t is not a table
-- @return true/false if t is an array/isn't an array
-- NOTE: it returns true for an empty table
function isArray(t)
if type(t)~="table" then return nil,"Argument is not a table! It is: "..type(t) end
--check if all the table keys are numerical and count their number
local count=0
for k,v in pairs(t) do
if type(k)~="number" then return false else count=count+1 end
end
--all keys are numerical. now let's see if they are sequential and start with 1
for i=1,count do
--Hint: the VALUE might be "nil", in that case "not t[i]" isn't enough, that's why we check the type
if not t[i] and type(t[i])~="nil" then return false end
end
return true
end
Upvotes: 2
Reputation: 9508
If you want fast, simple, non-intrusive solution that will work most of the times, then I'd say just check index 1 - if it exists, the table is an array. Sure, there's no guarantee, but in my experience, tables rarely have both numerical and other keys. Whether it's acceptable for you to mistake some objects for arrays and whether you expect this to happen often depend on your usage scenario - I guess it's not good for general JSON library.
Edit: For science, I went to see how Lua CJSON does things. It goes through all pairs and checks if all keys are integers while keeping the maximum key (the relevant function is lua_array_length
). Then it decides whether to serialize the table as an array or object depending on how sparse the table is (the ratio is user controlled) i.e. a table with indices 1,2,5,10 will probably be serialized as an array while a table with indices 1,2,1000000 will go as an object. I guess this is actually quite good solution.
Upvotes: 19
Reputation: 9549
No there is no built-in way to differentiate, because in Lua there is no difference.
There are already existing JSON libraries, which probably already do this (eg. Lua CJSON.
Other options are
__newindex
such that only new numerical and subsequent indices are allowed to be used.Upvotes: 4