Tags: php, database, web-services, storage, data-storage

Reputation: 8111

When to hardcode data inside the source code, when to use the database, and when to use a web service?

Consider the class below where some data related to the product and its components is hardcoded into the source code.

class ProductCharacteristics
{
    private $model;
    private $length = array();
    private $weight = array();
    private $description = array();

    function __construct($model)
    {
        $this->model = $model;

        //Since there are several product models, 
        //we hardcode each model separately.
        //models are 50, 100, 200  

        //length
        $this->length[ 50] = array(5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5);
        $this->length[100] = array(5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5);
        $this->length[200] = array(5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5);

        //weights
        $this->weight[ 50] = array(20, 114, 50);
        $this->weight[100] = array(68, 192, 68);
        $this->weight[200] = array(68, 192, 68);    

        //descriptions
        $this->description[ 50] = array('3"', '3"', 6.50);
        $this->description[100] = array('6"', '6"', 6.50);
        $this->description[200] = array('6"', '6"', 6.50);

    }

    public function getLengths()
    {
        return $this->length[$this->model];
    }

    public function getWeights()
    {
        return $this->weight[$this->model];
    }

    public function getDescriptions()
    {
        return $this->description[$this->model];
    }
}

//instantiate:
$pc = new ProductCharacteristics(50);
$weight = $pc->getWeights();
print 'weight of component 1 is ' . $weight[0] . "\n";
print 'weight of component 2 is ' . $weight[1] . "\n";

Question 1:

Should data of this type (small, rarely changing) be placed in the database instead? Why or why not? I am looking for more than just a yes/no: a little explanation, history, or rationale.

Question 2:

The reason I chose to hardcode it instead of putting it into the database was that I have the impression that "a call to the database for such small set of data is expensive, and prohibitive". Had I had 2 MiB of such data, I would of course not have put it into the source code. But since the set was small, I put it into the source code, with the added benefit that if any of the datum changes, the change is tracked in my source control repository. I wouldn't know about the change if it happened at the database level.

I thereby see that hardcoding it into the code is "not a big deal". I already run code, so having an extra file with just data in it is readily accessible.

Question: is it a "big deal" or comparatively "not a big deal" if instead encode that data in the database? That is, if hardcoding data in the source code is O(1), what is the big oh of placing it into the database instead?

Is it similar in {access time, overhead} to hardcoding data in the source code? I at least see using database as O(2) because we have to engage an outside program, the database system to get the data.

I could make a case that I can also get the data using a web service, but I would put that at O(3) because it is an outside system: we have to make a call to it and also wait for network latency.

Upvotes: 3

Answers (5)

userlond

Reputation: 3828

Step 0.

Most of this has already been said; this is just to clarify.

Wikipedia says:

Database is an organized collection of data.

So a text file, a relational database, or even a plain old paper notebook is a database.

All kinds of databases have their pros and cons.

A paper notebook works autonomously for a long time, is more flexible (you can write text in different directions, draw pictures, etc.) and is easier to learn (only writing and reading skills are required). But it is hardly readable by computers.

Text config files provide a human-readable syntax; their primary goal is to separate configuration from implementation (the logic of your code).

A relational database is used for better concurrent access; it offers optimized write and read speed and helps organize the data structure in terms of tables and the relations between them.

Answers.

1. If you don't plan to change the data (i.e. replace values, add new settings, etc.) in this application or in future applications based on this class, just hardcode it. That's not bad if your team is rather small (or you are a solo developer). It's simpler.

If you decide to make a standalone config for your data, I suggest a plain PHP file. It's fast and easy to parse (no special class or caching needed) and adds no overhead to your app's performance. It lets you share settings across different classes, and your code becomes better structured.

PHP configs are used by Zend Framework and Yii. Symfony prefers to store configs in YAML, but also supports PHP, XML and annotations (special kinds of comments used to store configs).

To prevent warnings and to specify default values, I use this class.
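The class the answer refers to is not reproduced in this thread. As a rough illustration only, a helper of that kind might look like the sketch below; the `Config` name, the `get()` method and the dot-separated path syntax are all assumptions made for this example, not the answerer's actual code.

```php
<?php
// Hypothetical helper (not the class the answer links to): wraps a config
// array and returns a default instead of triggering an "undefined index"
// warning when a key is missing.
final class Config
{
    private $items;

    public function __construct(array $items)
    {
        $this->items = $items;
    }

    // Look up a value by a dot-separated path, e.g. "weight.50".
    public function get($path, $default = null)
    {
        $value = $this->items;
        foreach (explode('.', $path) as $key) {
            if (!is_array($value) || !array_key_exists($key, $value)) {
                return $default;
            }
            $value = $value[$key];
        }
        return $value;
    }
}

// In a real project the array would typically come from a plain PHP
// config file via `require`; it is inlined here to stay self-contained.
$config = new Config(array(
    'weight' => array(50 => array(20, 114, 50)),
));

$weights = $config->get('weight.50');            // values for model 50
$missing = $config->get('weight.999', array());  // default, no warning
```

The default value keeps callers from having to wrap every lookup in `isset()`.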

If you plan to build a frontend for editing settings (for example, an HTML form in the app's admin area), use a relational database. It's much better for concurrent writes than a plain file. A database config is also useful if you have a fat database layer (triggers, for example).

Premature optimization is the root of all evil. [Donald Knuth]

Upvotes: 3

Machavity

Reputation: 31624

For small, static sets of data, the difference between storing it in a database and hard-coding it is negligible. You're talking about one DB hit to fetch the data plus parsing time versus having it coded. The main performance gain is that hard-coded data is cached by opcache, whereas the DB is hit every time. Unless we're talking about an application getting hundreds of thousands of views, you're talking about less than 1 second of processing on a query that most RDBMSs (like MySQL) will cache for ready returns (writing > reading for system resource use).

I would say, given the small size, hard coding is perfectly acceptable here.
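If you want to check this on your own setup rather than take it on faith, a rough timing sketch follows. It assumes the `pdo_sqlite` extension is installed and uses an in-memory SQLite database as a stand-in for a real server, so it does not include network latency or server-side caching; the table name `component_weight` is made up for the example.

```php
<?php
// Rough timing sketch. Assumes pdo_sqlite; an in-memory SQLite database
// stands in for a real DB server, so network latency is NOT measured.
$hardcoded = array(50 => array(20, 114, 50));

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE component_weight (model INTEGER, position INTEGER, weight INTEGER)');

$insert = $pdo->prepare('INSERT INTO component_weight VALUES (?, ?, ?)');
foreach ($hardcoded[50] as $position => $weight) {
    $insert->execute(array(50, $position, $weight));
}
$select = $pdo->prepare(
    'SELECT weight FROM component_weight WHERE model = ? ORDER BY position'
);

$iterations = 1000;

// Time the hard-coded lookup.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $fromCode = $hardcoded[50];
}
$codeSeconds = microtime(true) - $start;

// Time the same lookup through a prepared statement.
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $select->execute(array(50));
    // intval() normalizes results across PHP versions that return strings.
    $fromDb = array_map('intval', $select->fetchAll(PDO::FETCH_COLUMN));
}
$dbSeconds = microtime(true) - $start;

printf(
    "hard-coded: %.6f s, database: %.6f s for %d lookups\n",
    $codeSeconds,
    $dbSeconds,
    $iterations
);
```

The absolute numbers will vary by machine; the point is only to compare the two against your actual page budget.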

Upvotes: 1

philipxy

Reputation: 15118

Re question 1:

Put everything you can that will be used by a query into the database/DBMS. Then the DBMS can use it for optimization, integrity and clarity.

The DBMS can optimize all queries.
E.g.: If you use ORM data structure code in combination with a database query, then the DBMS might have to loop through a cross product of two tables checking for weight $pc->getWeights(), whereas it might have avoided a cross product by joining with ProductCharacteristics earlier. E.g.: Some always-true things you can tell the DBMS that help it optimize queries are UNIQUE NOT NULL (including PRIMARY KEY) and FOREIGN KEY constraints.
You can query all the data directly via SQL.
Otherwise the DBMS has most of the data and a generic optimized interface, yet you can't run a query involving your ORM data structure without compiling application code.
You can simplify ORM code.
Since the ORM code is translated to SQL queries, when you use only the database there is ORM functionality available that otherwise wouldn't be. E.g.: Calculating cumulative functions of weights via SQL window functions.
You can simply query application relationships that differ from your ORM data structure.
E.g.: It's easy to find a certain component's weight with your ORM data structure, but not easy to find a certain weight's components. With a DBMS both are equally simple.
You can better maintain integrity.
E.g.: The DBMS table format and/or integrity constraints enforce the equivalent of keeping your arrays the same length.

The relational model was designed to solve these sorts of problems with data structures and hierarchical databases. (Read about that.) Use its power.
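To make a couple of these points concrete, here is a minimal sketch using PDO with an in-memory SQLite database. It assumes the `pdo_sqlite` extension; the table name `product_component` and its columns are hypothetical, and only the weight and description values for models 50 and 100 from the question are loaded.

```php
<?php
// Assumes the pdo_sqlite extension; table and column names are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// One row per (model, component). NOT NULL forces every component to carry
// both attributes, which is the table equivalent of keeping the PHP arrays
// the same length; the primary key rules out duplicate components.
$pdo->exec(
    'CREATE TABLE product_component (
        model       INTEGER NOT NULL,
        position    INTEGER NOT NULL,
        weight      INTEGER NOT NULL,
        description TEXT    NOT NULL,
        PRIMARY KEY (model, position)
    )'
);

// Values taken from the weight and description arrays in the question.
$rows = array(
    array(50, 0, 20, '3"'),
    array(50, 1, 114, '3"'),
    array(50, 2, 50, '6.50'),
    array(100, 0, 68, '6"'),
    array(100, 1, 192, '6"'),
    array(100, 2, 68, '6.50'),
);
$insert = $pdo->prepare('INSERT INTO product_component VALUES (?, ?, ?, ?)');
foreach ($rows as $row) {
    $insert->execute($row);
}

// Forward lookup: the weights of model 50, in component order.
$forward = $pdo->prepare(
    'SELECT weight FROM product_component WHERE model = ? ORDER BY position'
);
$forward->execute(array(50));
$weights = array_map('intval', $forward->fetchAll(PDO::FETCH_COLUMN));

// Reverse lookup: which components have a given weight?
$reverse = $pdo->prepare(
    'SELECT model, position FROM product_component
     WHERE weight = ? ORDER BY model, position'
);
$reverse->execute(array(68));
$matches = $reverse->fetchAll(PDO::FETCH_ASSOC);

// The primary key rejects a duplicate (model, position) pair.
$rejected = false;
try {
    $insert->execute(array(50, 0, 99, 'duplicate'));
} catch (PDOException $e) {
    $rejected = true;
}
```

The reverse lookup is a one-line query here, whereas with the hard-coded arrays it would need a hand-written loop over every model.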

Re question 2:

It's a big deal. (See Re question 1.)

is it a "big deal" or comparatively "not a big deal" if instead encode that data in the database?

Your benefits are limited and restrictive.
You are thinking of particular small queries in isolation, whereas the DBMS exists for arbitrary queries, implemented and optimized automatically.

I have the impression that "a call to the database for such small set of data is expensive, and prohibitive".

You are slowing down non-trivial queries.
You are saving a small (in DBMS terms) constant communication & evaluation cost on small queries for large evaluation costs on large queries due to impeded DBMS optimization. The DBMS knows the table is small via statistics. Given a small table and query, all the DBMS is doing is just looping through an array in memory. (And read about SARGability.)

with the added benefit that if any of the datum changes, the change is tracked in my source control repository

You are introducing an exception.
You are reusing code but, given that all the other data has to be logged/tracked, needlessly. Indeed your code and database should be tracked together. A good DBMS has both update logging and version tracking (including code). Use it. Anyway, you can always track a DBMS UPDATE script in your source control repository.

That is, if hardcoding data in the source code is O(1), what is the big oh of placing it into the database instead

I at least see using database as O(2) because we have to engage an outside program, the database system to get the data.

Learn about big-O.
O(1) is O(2) is O(3) is constant. You mean O(1) with different constant factors. Extra levels of implementation are generally at worst constant but at best far better because of optimization using information from a larger scope.
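For reference, the standard definition makes this a one-line check. The symbols f, g, c, n_0 and k below are the usual textbook notation, introduced here only for illustration:

```latex
f \in O(g) \iff \exists\, c > 0,\ \exists\, n_0 :\ |f(n)| \le c \cdot g(n) \quad \text{for all } n \ge n_0

% A constant cost k > 0 (say k = 2 for the database, k = 3 for the web service):
f(n) = k \le k \cdot 1 \quad \Rightarrow \quad f \in O(1) \quad (\text{take } c = k)

% Hence O(1) = O(2) = O(3); the options differ only in the constant c.
```

What the question calls O(2) and O(3) is therefore a statement about the size of c, which big-O deliberately ignores.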

Considering ORM data structure now is "premature optimization" ("the root of all evil"). This sort of engineering tradeoff follows empirical suspicion, investigation and demonstration followed by cost-benefit analysis (including opportunity cost).

Upvotes: 3

hofan41

Reputation: 1468

Answer 1

Data that never changes can be hardcoded, obviously.

Data that occasionally/rarely changes is data that still needs to be configurable at some point. Therefore, it should not be hardcoded because it is much easier to re-configure software than to update the source code/compile/re-deploy.

Answer 2

For 99% of cases, it is not a big deal to store data in a database. Otherwise, why would they exist? For database access it is about latency/overhead. If your database server resides on the same OS instance as your program, then there is no network latency, and the overhead will depend on a combination of your database design and the underlying storage architecture (RAM/HDD/SSD). For most projects that do not involve scale in the millions/billions, using any generic database deployment will be fine.

Upvotes: 2

Daksh Mehta

Reputation: 124

I strongly recommend keeping all of the data in some sort of config file or database. There is no rule against hard-coding small data, but here is my reasoning.

No matter how much data there is, small or big, you will end up editing it.

If the data is hard-coded into the code, it's quite likely that you will end up with poor code quality.

If you don't use a database, my best suggestion is to do something like this:

Create a data file named "data.lengths.php":

<?php
    return array(
        50 => array(
             5.5, 5.5, // can have as many as you want..
        )           
    );

You can prepare similar data files for the other attributes as well.

Next, you can simply use it wherever you like:

<?php

      // Assuming both files are in the same directory. Plain require is used
      // because require_once returns true, not the array, on a repeated include.
      $data['length'] = require __DIR__ . '/data.lengths.php';

This way you keep good code quality without forcing yourself to take the long route.
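As a sketch of how the class from the question could consume such files, the version below receives its data through the constructor instead of hard-coding it. The two-argument constructor and the private `lookup()` helper are assumptions made for this example, and the arrays are inlined so that the snippet runs on its own; in a real project each would come from a data file.

```php
<?php
// Sketch: the class receives its data instead of hard-coding it.
// In a real project each array would come from a data file, e.g.
//   $lengths = require __DIR__ . '/data.lengths.php';
class ProductCharacteristics
{
    private $model;
    private $data;

    public function __construct($model, array $data)
    {
        $this->model = $model;
        $this->data  = $data;
    }

    public function getLengths()
    {
        return $this->lookup('length');
    }

    public function getWeights()
    {
        return $this->lookup('weight');
    }

    // Return the values for the current model, or an empty array if the
    // model or attribute is unknown (hypothetical helper).
    private function lookup($kind)
    {
        return isset($this->data[$kind][$this->model])
            ? $this->data[$kind][$this->model]
            : array();
    }
}

// Inline stand-ins for the contents of the data files.
$data = array(
    'length' => array(50 => array(5.5, 5.5)),
    'weight' => array(50 => array(20, 114, 50)),
);

$pc = new ProductCharacteristics(50, $data);
$weights = $pc->getWeights();
print 'weight of component 1 is ' . $weights[0] . "\n";
```

Passing the data in also makes the class easy to test with small fixture arrays.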

My 2 cents, hope this helps.

Upvotes: 0
