DMill
DMill

Reputation: 303

SQL Server Database File Groups on a SAN: Relevant or Not?

I am about to build out a new SQL Server and I was planning to make extensive use of file groups. I expect heavy growth, and heavy read/write to 5 different databases on this server. I was planning on creating 2 additional file groups (one for user data and one for indexes) on each database for a total of 3 file groups per database. I was planning on splitting up the file groups among different drives/spindles. This server is a virtual server (VMWare) on an EMC SAN. I am new to SAN architecture and am I not the SAN administrator. I read a brief blurb on file groups and SANs in the book "Microsoft SQL Server 2012 Unleashed" that file groups were likely irrelevant when using SANs. Unfortunately, there wasn't much more detail than that, and I haven't found much else out there on the topic.

Is there a point to using file groups when using a SAN for storage?

If not, then why not? If so, then why?

What questions might I ask my SAN Admin on this topic?

Upvotes: 1

Views: 1530

Answers (1)

Sobrique
Sobrique

Reputation: 53498

What you need to understand is what a SAN is.

A SAN is one or more storage arrays, interconnected via a fiber channel network. Your host has special network cards - called Host Bus Adapters (HBAs) - for talking to that network. The network protocols are geared up for storage traffic, and so are well suited to high performance, low latency traffic.

The array you're talking to... well, it varies a great deal as to it's capability. Even an EMC SAN as you refer to it - might be a variety of EMC products as the storage array. Their primary purpose is consolidating storage performance.

You get a better peak performance out of 100 spindles share with 10 servers, than you would if each server had 10 spindles each. So what your storage array is basically doing is carving up that chunk of 100 spindles into logical units - and then giving them back to your host, such that each host has about the same average performance, but it's peak is 10x the size. (Or more perhaps more realistically - they might go with 50 spindles, because then you get 5x the peak, but half the cost, in return for a lower average).

Now - Filegroups. As I understand it (being a storage engineer, rather than knowing a great deal of SQL). Filegroups allow you to manage placement of data, specifically to the underlying storage.

This is something of a ticklish point - because it depends. Typically your storage array will be doing some quite clever things of it's own in order to streamline data placement and throughput. Things like some quite aggressive caching - far more than you'd be getting on a normal host - meaning that a lot of your random access workload goes at 'RAM speed' rather than 'disk speed'. It may well be striping across a lot more spindles than you'd normally expect as well.

Which as far as I can tell - is essentially what a filegroup is aiming to achieve - you manually place a file on a disk, and let SQL handle parallel IO to those disks. Your storage array is doing this for you already, and at best you'll make yourself unnecessary admin headache, and at worst you'll actually make array side optimization worse.

You probably do still want to segregate your various content types, but I would suggest you do so via different LUNs being allocated from your SAN. That's more so you can't drain space from one database by having another fill - but it also allows for a bit more flexibility when it comes to taking snapshots or clones.

What I would suggest:

  • go talk to your storage guys about the expected IO profile of your database. (IO is what's expensive on a SAN, and typically databases use more of it than 'normal' applications)
  • put each instance on a different set of LUNs - separate out DB and logs and tempdb.
  • Within vmware, you may end up with 'logical' disks on the same datastore. If performance is critical, it may be worth passing through the SAN LUNs direct to the host.

And then don't worry about it overly - if you notice a particular problem, it should be possible to retune/move individual LUNs around to improve the situation.

Upvotes: 2

Related Questions