John
John

Reputation: 405

Databases that Handle Many Columns

I'm considering converting some excel files I regularly update to a database. The files have a large number of columns. Unfortunately, many of the databases I am looking at, such as Access and PostreSQL, have very low column limits. MySQL's is higher, but I'm worried that as my dataset expands I might break that limit as well.

Basically, I'm wondering what (open source) databases are effective at dealing with this type of problem.

For a description of the data, I have a number of excel files (less than 10) with each containing a particular piece of information on some firms over time. It totals about 100mb in excel files. The firms are in the columns (about 3500 currently), the dates are in the rows (about 270 currently, but switching to a higher frequency for some of the files could easily cause this to balloon).

The most important queries will likely be to get the data for each of the firms on a particular date and put it in a matrix. However, I may also run queries to get all the data for a particular firm for a particular piece of data over every date.

Changing dates to a higher frequency is also the reason that I'm not really interested in transposing the data (the 270 beats Access' limit anyway, but increasing the frequency would far exceed MySQL's column limits). Another alternative might be to change it so that each firm has its own excel file (that way I limit the columns to some amount less than 10), but is quite unwieldy for the purposes of updating the data.

Upvotes: 0

Views: 70

Answers (1)

Tyler Eaves
Tyler Eaves

Reputation: 13133

This seems to be begging to be split up!

How using a schema like:

Firms
  id
  name

Dates
  id
  date

Data_Points
  id
  firm_id
  date_id
  value

This sort of de-composed schema will make reporting quite a bit easier.

For reporting you can easily get a stream of all values with a query like

SELECT firms.name, dates.date, data_points.value from data_points left join firms on firms.id = data_points.firm_id left join dates on dates.id = data_points.date_id

Upvotes: 3

Related Questions