Gazillion

Reputation: 4792

Why should I abstract my data layer?

OOP principles were difficult for me to grasp because, for some reason, I could never apply them to web development. As I developed more and more projects, I started to understand how some parts of my code could use certain design patterns to become easier to read, reuse, and maintain, so I started using them more and more.

The one thing I still can't quite comprehend is why I should abstract my data layer. Basically if I need to print a list of items stored in my DB to the browser I do something along the lines of:

$sql = 'SELECT * FROM table WHERE type = "type1"';
$result = mysql_query($sql);

while($row = mysql_fetch_assoc($result))
{
    echo '<li>'.$row['name'].'</li>';
}

I'm reading all these how-tos and articles preaching the greatness of PDO, but I don't understand why. I don't seem to be saving any lines of code, and I don't see how it would be more reusable, because all the functions I call above just seem to be encapsulated in a class but do the exact same thing. The only advantage I see in PDO is prepared statements.

I'm not saying data abstraction is a bad thing, I'm asking these questions because I'm trying to design my current classes correctly and they need to connect to a DB so I figured I'd do this the right way. Maybe I'm just reading bad articles on the subject :)

I would really appreciate any advice, links, or concrete real-life examples on the subject!

Upvotes: 3

Views: 1812

Answers (5)

Francesco Terenzani

Reputation: 1391

In my view, for just printing a list of items from a database table, your snippet is the most appropriate approach: fast, simple, and clear.

I think a bit more abstraction helps in other cases, where it avoids code repetition and brings all the advantages that come with that.

Consider a simple CMS with authors, articles, tags and a cross reference table for articles and tags.

On your home page, your simple query will become a more complex one. You will join articles and users, then fetch the related tags for each article by joining the tags table with the cross-reference table and filtering by article_id.

You will repeat this query with some small changes in the author profile and in the tag search results.

Using an abstraction tool, you can define your relations once and use a more concise syntax like:

// Home page
$articles = $db->getTable('Article')->join('Author a')
    ->addSelect('a.name AS author_name');
$first_article_tags = $articles[0]->getRelated('Tag');

// Author profile
$articles = $db->getTable('Article')->join('Author a')
    ->addSelect('a.name AS author_name')->where('a.id = ?', $_GET['id']);

// Tag search results
$articles = $db->getTable('Article')->join('Author a')
    ->addSelect('a.name AS author_name')
    ->join('Tag')->where('Tag.slug = ?', $_GET['slug']);

You can reduce the remaining code repetition by encapsulating it in models and refactoring the code above:

// Home page
$articles = Author::getArticles();
$first_article_tags = $articles[0]->getRelated('Tag');

// Author profile
$articles = Author::getArticles()->where('a.id = ?', $_GET['id']);

// Tag search results
$articles = Author::getArticles()
    ->join('Tag')->where('Tag.slug = ?', $_GET['slug']);

There are other good reasons to abstract more or less, each with its pros and cons. But in my opinion, for most web projects this is the main one :P

Upvotes: 0

Martyn

Reputation: 1476

In my opinion, data access is one of the most important aspects to separate / abstract out from the rest of your code.

Separating out various 'layers' has several advantages.

1) It neatly organises your code base. If you have to make a change, you'll know immediately where the change needs to be made and where to find the code. This might not be such a big deal if you're working on a project on your own, but with a larger team the benefits quickly become obvious. This point is actually pretty trivial, but I added it anyway. The real reason is number 2.

2) You should try to separate things that might need to change independently of each other. In your specific example, it is conceivable that you would want to change the DB / data access logic without impacting the user interface. Or you might want to change the user interface without impacting the data access. I'm sure you can see how this becomes impossible if the two kinds of code are mixed together.

When your data access layer has a tightly defined interface, you can change its inner workings however you want, and as long as it still adheres to the interface you can be pretty certain it won't have broken anything further up. Obviously this would still need verifying with testing.

3) Reuse. Writing data access code can get pretty repetitive. It's even more repetitive when you have to rewrite the data access code for each page you write. Whenever you notice something repetitive in code, alarm bells should be ringing. Repetition is prone to errors and causes a maintenance problem.

I'm sure you see the same queries popping up in various different pages? This can be resolved by putting those queries lower down in your data layer. Doing so helps to ease maintenance; whenever a table or column name changes, you only need to correct the one place in your data layer that references it instead of trawling through your entire user interface and potentially missing something.

4) Testing. If you want to use an automated tool to carry out unit testing, you will need everything nicely separated. How will you test your code to select all Customer records when this code is scattered throughout your interface? It is much easier when you have a specific SelectAllCustomers function on a data access object. You can test this once here and be sure that it will work for every page that uses it.
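As a rough illustration of point 4, here is a minimal sketch of such a data access object. The `CustomerDao` class, the `customers` table, and the in-memory SQLite database are all assumptions made up for this example (the question uses MySQL); the point is only that the query lives in one method that can be tested on its own.

```php
<?php
// Hypothetical data access object: the only place that knows the SQL for customers.
final class CustomerDao
{
    private PDO $db;

    public function __construct(PDO $db)
    {
        $this->db = $db;
    }

    /** @return array[] one associative array (id, name) per customer */
    public function selectAllCustomers(): array
    {
        $stmt = $this->db->query('SELECT id, name FROM customers ORDER BY id');
        return $stmt->fetchAll(PDO::FETCH_ASSOC);
    }
}

// Stand-in database for the sketch (assumes the pdo_sqlite extension is enabled).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
$db->exec("INSERT INTO customers (name) VALUES ('Alice'), ('Bob')");

// Any page can now list customers without containing SQL itself.
$dao = new CustomerDao($db);
$customers = $dao->selectAllCustomers();
foreach ($customers as $customer) {
    echo '<li>' . htmlspecialchars($customer['name']) . "</li>\n";
}
```

A unit test then only needs to build a `CustomerDao` against a throwaway database and check what `selectAllCustomers()` returns, instead of rendering whole pages.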

There are more reasons that I'll let other people add. The main thing to take away is that separating out layers allows one layer to change without letting the change ripple through to other layers. As the database and user interface are areas of an application / website that change the most frequently it is a very good idea to keep them separate and nicely isolated from everything else and each other.

Upvotes: 1

krtek

Reputation: 26607

Another advantage of abstracting the data layer is that you become less dependent on the underlying database.

With your method, the day you want to use something other than MySQL, or your column names change, or the PHP API for MySQL changes, you will have to rewrite a lot of code.

If all the database access is neatly abstracted, the needed changes will be minimal and restricted to a few files instead of the whole project.

It is also a lot easier to reuse code that protects against SQL injection, or other utility functions, if that code is centralized in one place.

Finally, it's easier to do unit testing if everything goes through a few classes than if it is spread over every page of your project.

For example, in a recent project of mine (sorry, no code sharing is possible), MySQL-related functions are only called in one class. Everything from query generation to object instantiation is done there. So it's very easy for me to change to another database or reuse this class somewhere else.
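Since that code can't be shared, here is a deliberately small sketch of the same idea, not the actual class. The `Database` wrapper, its method names, and the `items` table are hypothetical, and it uses PDO with an in-memory SQLite database (assuming the pdo_sqlite extension) so that it runs on its own.

```php
<?php
// Hypothetical wrapper: the only class in the project that talks to PDO.
final class Database
{
    private PDO $pdo;

    public function __construct(string $dsn)
    {
        $this->pdo = new PDO($dsn);
        $this->pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    }

    /** Runs a parameterized SELECT; parameter binding happens in one place. */
    public function fetchAll(string $sql, array $params = []): array
    {
        $stmt = $this->pdo->prepare($sql);
        $stmt->execute($params);
        return $stmt->fetchAll(PDO::FETCH_ASSOC);
    }

    /** Runs a parameterized write statement and returns the affected row count. */
    public function execute(string $sql, array $params = []): int
    {
        $stmt = $this->pdo->prepare($sql);
        $stmt->execute($params);
        return $stmt->rowCount();
    }
}

$db = new Database('sqlite::memory:');
$db->execute('CREATE TABLE items (name TEXT, type TEXT)');
$db->execute('INSERT INTO items (name, type) VALUES (?, ?)', ['Widget', 'type1']);
$db->execute('INSERT INTO items (name, type) VALUES (?, ?)', ["O'Brien", 'type2']);

// Normal lookup, then a hostile value that is treated as plain data.
$rows = $db->fetchAll('SELECT name FROM items WHERE type = ?', ['type1']);
$hostile = $db->fetchAll('SELECT name FROM items WHERE type = ?', ["type1' OR '1'='1"]);
```

Because every query goes through `fetchAll()` or `execute()`, the injection protection is written once, and moving to another database mostly means changing what the constructor does.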

Upvotes: 2

gideon

Reputation: 19465

I'm NOT a PHP person, but this is a more general question, so here goes.

You're probably building something small. Sometimes, though, even something small or medium-sized should have an abstracted data layer so it can grow better.

The point is to cope with CHANGE

Think about this: you have a small social networking website. Think about the data you'll store: profile details, pictures, friends, messages. For each of these you'll have pages like pictures.php?&uid=xxx.

You'll then have a little piece of SQL slapped in there with the mysql code. Now think of how easy or difficult it would be to change this. You would change 5-10 pages? When you do this, you'll probably get it wrong a few times before you test it thoroughly.

Now think of Facebook. Think of the number of pages there will be. Do you think it'll be easy to change a line of SQL in each page!?

When you abstract the data access correctly:

  1. It's in one place, so it's easier to change.
  2. Therefore it's easier to test.
  3. It's easier to replace. (Think about what you'd have to do if you had to switch to another database.)

Hope this helps.
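Here is a rough PHP sketch of those three points, written as an illustration only. The `PictureRepository` interface, both implementations, the `pictures` table, and `renderPictures()` are invented for this example, and the database is an in-memory SQLite one (assuming pdo_sqlite is available).

```php
<?php
// Hypothetical interface: pages depend on this, not on a specific database.
interface PictureRepository
{
    /** @return string[] file names of pictures owned by the given user */
    public function findByUser(int $uid): array;
}

// Database-backed implementation; the SQL lives here and nowhere else.
final class PdoPictureRepository implements PictureRepository
{
    private PDO $pdo;

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    public function findByUser(int $uid): array
    {
        $stmt = $this->pdo->prepare('SELECT file FROM pictures WHERE uid = ? ORDER BY file');
        $stmt->execute([$uid]);
        return $stmt->fetchAll(PDO::FETCH_COLUMN);
    }
}

// Replacement with no database at all, handy in unit tests.
final class InMemoryPictureRepository implements PictureRepository
{
    private array $filesByUser;

    public function __construct(array $filesByUser)
    {
        $this->filesByUser = $filesByUser;
    }

    public function findByUser(int $uid): array
    {
        return $this->filesByUser[$uid] ?? [];
    }
}

// What a page like pictures.php would call; it cannot tell which implementation it got.
function renderPictures(PictureRepository $repo, int $uid): string
{
    $html = '';
    foreach ($repo->findByUser($uid) as $file) {
        $html .= '<li>' . htmlspecialchars($file) . '</li>';
    }
    return $html;
}

$pdo = new PDO('sqlite::memory:');
$pdo->exec('CREATE TABLE pictures (uid INTEGER, file TEXT)');
$pdo->exec("INSERT INTO pictures VALUES (7, 'a.jpg'), (7, 'b.jpg'), (8, 'c.jpg')");

$fromDb = renderPictures(new PdoPictureRepository($pdo), 7);
$fromMemory = renderPictures(new InMemoryPictureRepository([7 => ['a.jpg', 'b.jpg']]), 7);
```

Both calls produce the same markup, which is the sense in which the data access becomes easy to replace: the page code stays the same whichever implementation is passed in.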

Upvotes: 3

edmz

Reputation: 3360

Think of abstracting the data layer as a way to save time in the future.

Using your example, let's say you changed the names of the tables. You would have to go to each file that has SQL using that table and edit it. In the best case, it would be a matter of search and replace across N files. You could save a lot of time and minimize errors if you only had to edit one file: the file that holds all your SQL methods.

The same applies to column names.

And this only covers the case where you rename things. It is also quite possible that you will change database systems completely. Your SQL might not be compatible between SQLite and MySQL, for example. You would have to go and edit, once again, a lot of files.

Abstraction allows you to decouple one part from the other. In this case, you can make changes to the database part without affecting the view part.
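A small sketch of what that "one file" might look like, under made-up names: `ItemGateway`, its constants, the `items` table, and `renderItems()` are hypothetical, and the in-memory SQLite database only stands in for your real one.

```php
<?php
// Hypothetical gateway: the table and column names appear only in this class.
final class ItemGateway
{
    private const TABLE = 'items';        // assumed name; rename here only
    private const NAME_COLUMN = 'name';   // assumed name; rename here only

    private PDO $pdo;

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    /** @return string[] item names, whatever the column is really called */
    public function namesByType(string $type): array
    {
        $sql = 'SELECT ' . self::NAME_COLUMN . ' FROM ' . self::TABLE
             . ' WHERE type = ? ORDER BY ' . self::NAME_COLUMN;
        $stmt = $this->pdo->prepare($sql);
        $stmt->execute([$type]);
        return $stmt->fetchAll(PDO::FETCH_COLUMN);
    }
}

// View code: it only sees a list of names, never SQL.
function renderItems(array $names): string
{
    $html = '';
    foreach ($names as $name) {
        $html .= '<li>' . htmlspecialchars($name) . '</li>';
    }
    return $html;
}

$pdo = new PDO('sqlite::memory:');
$pdo->exec('CREATE TABLE items (name TEXT, type TEXT)');
$pdo->exec("INSERT INTO items VALUES ('Apple', 'type1'), ('Pear', 'type1'), ('Nail', 'type2')");

$html = renderItems((new ItemGateway($pdo))->namesByType('type1'));
```

If the table or column is renamed, only the two constants change; `renderItems()` and every page that calls it stay untouched.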

For very small projects this might be more trouble than it is worth. And even then, you should still do it, at least to get used to it.

Upvotes: 3
