MarcoS

Reputation: 17721

PHP: how to index multiple keys in associative array?

I am going to build a "simple" RESTful web service with PHP. I will provide APIs to access some data (via JSON) that I collect on my web server. The main data table will be read-only for the public API methods, and will be written by singleton private methods at regular intervals. Users will be able to write some data to private tables.

I want to avoid, if possible, adding the complication of handling a database (not even SQLite); so I am planning to serialize my data to file(s) on disk, and deserialize them into memory whenever the PHP script is called.
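
Something like the following is what I have in mind (just a sketch; data.ser is a placeholder file name):

<?php
// Hypothetical sketch of the load/save cycle I have in mind;
// "data.ser" is just a placeholder file name.
function loadData($file) {
    if (!is_file($file)) {
        return [];
    }
    return unserialize(file_get_contents($file));
}

function saveData($file, $data) {
    // Write to a temporary file first, then rename, so readers
    // never see a half-written file.
    $tmp = $file . ".tmp";
    file_put_contents($tmp, serialize($data), LOCK_EX);
    rename($tmp, $file);
}

$data = loadData("data.ser");
?>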

Loading all the data into memory for each PHP instance should not put too heavy a burden on the web server (I hope)... (The numbers are these: the main data table is planned to have a maximum of 100k records, each with a maximum record size of 1kB, so the data will have a maximum possible size of 100MB, with a usual size of 10MB; the maximum number of concurrent users will never be higher than 100; these numbers are by design, with no possibility to grow bigger.)

The question is: can I use a PHP associative array to perform queries on multiple keys?

An example: this is my simplified main data structure:

<?php
    $data = [
        "1" => [
            "name" => "Alice",
            "zip" => "12345",
            "many" => "A",
            "other" => "B",
            "fields" => "C",
        ],
        "2" => [
            "name" => "Bob",
            "zip" => "67890",
            "many" => "X",
            "other" => "Y",
            "fields" => "Z",
        ],
        // ...
    ];
?>

To access a record by primary key, of course, I should do:

$key = "12345";
$record = $data[$key];

But what if I want to access one or more records by a different key, say "zip", efficiently (i.e. avoiding a sequential scan)? Of course such keys could contain duplicate values. The only solution I came up with is to build a new array for each secondary key to "index", and serialize it alongside the main data table...

For example:

$zip_idx = [
    "12345" => [ "1", "355", "99999", ],
    "67890" => [ "2", "732", ],
    // ...
];

and then:

$zip = "67890";
$records = $zip_idx[$zip];
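
To be explicit, I would build such an index once, when serializing the data, roughly like this (buildIndex is just an illustrative name):

// Build a secondary index for one field, mapping field value => primary keys.
function buildIndex($data, $field) {
    $idx = [];
    foreach ($data as $id => $record) {
        $idx[$record[$field]][] = $id;
    }
    return $idx;
}

$zip_idx = buildIndex($data, "zip"); // with the sample $data: ["12345" => ["1"], "67890" => ["2"]]

// Resolve the primary keys back to the actual records.
$records = [];
foreach ($zip_idx["67890"] as $id) {
    $records[$id] = $data[$id];
}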

So:
Do you see any issues, inconsistencies or lack of flexibility with this design?
Can you propose any smarter or more compact solution?
Do you have any considerations or objections?

Upvotes: 0

Views: 897

Answers (1)

mkempf

Reputation: 51

I would not create any further arrays for other "indexes".

Just make a nice class for handling the queries. A query by zip could look like this:

class Data {

    protected $data;

    // Returns all records whose zip matches (array_filter preserves the keys).
    public function getByZip($zip) {
        return array_filter($this->getData(), function ($item) use ($zip) {
            return $item['zip'] == $zip;
        });
    }

    public function setData($data) {
        $this->data = $data;
    }

    public function getData() {
        return $this->data;
    }
}

$dataArray = [
    "1" => [
        "name" => "Alice",
        "zip" => "12345",
        "many" => "A",
        "other" => "B",
        "fields" => "C",
    ],
    "2" => [
        "name" => "Bob",
        "zip" => "67890",
        "many" => "X",
        "other" => "Y",
        "fields" => "Z",
    ],
    // ...
];

$data = new Data();

$data->setData($dataArray);

$result = $data->getByZip(12345);

You can also use the user id (the array key) and query for it in the same way.
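
For example, a lookup by id could use the array key directly (getById is just an illustrative name):

// Added to the Data class above:
public function getById($id) {
    $data = $this->getData();
    return isset($data[$id]) ? $data[$id] : null;
}

$result = $data->getById("1");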

greetings

Edit: regarding your performance question: normally you would use a database for data that can grow to 100MB. The reason is that, with your array-on-file approach, the whole 100MB file has to be read into memory. That is not an issue in itself, but most providers set a memory limit of around 128MB per application, and that could lead to problems.
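
If in doubt, you can compare the configured limit with what your script actually needs:

echo ini_get("memory_limit"), "\n";     // e.g. "128M"
echo memory_get_peak_usage(true), "\n"; // peak bytes actually allocated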

Upvotes: 1
