gromit1
gromit1

Reputation: 587

What should I use for a Database?

My vb.net code calculates the growth rate of a company's stock price for every quarter from 1901 to present and stores it in a datatable. This takes a while to perform (10-15 minutes). I would like to save the information in the datatable after it is calculated so that I don't have to recalculate past growth rates every time I run the program. When I open my program I want the datatable to contain any growth rates that have already been calculated so I only have to calculate growth rates for new quarters.

Should I store my datatable in a database of some kind or is there another way to do this? My datatable is quite large. It currently has 450 columns (one for each quarter from 1901 to present) and can have thousands of rows (one for each company). Is this too big for Microsoft Access? Would Microsoft Excel be an option?

Thanks!

Upvotes: 1

Views: 267

Answers (5)

RandomUs1r
RandomUs1r

Reputation: 4190

I would change the database design to:

  • ID
  • Quarter
  • Year
  • CompanyName
  • Value1
  • Value2
  • Value3

as your columns and start saving it as a vertical table.

Then, you don't have as much data as you think, so I'd recommend something free like mysql, or even nosql, since you're not doing anything but storing and retrieving the data. Any text based file: xml, csv, .xls that you use is going to be way slower because the entire file needs to get loaded into memory for you to be able to parse it.

Upvotes: 3

Branko Dimitrijevic
Branko Dimitrijevic

Reputation: 52107

First of all, it's unclear you actually need a database. If you don't need things such as concurrent access, client/server operation, ACID transactions etc... you might as well just implement your cache using the file system.

If you conclude you do need a DBMS, there are many good choices, including free such as: PostgreSQL, MS SQL Server Express, Oracle Express, MySQL, Firebird, SQLite etc... or commercial such as: Oracle, MS SQL Server, IBM DB2, Sybase etc...

I suggest you make your data model flexible, so you don't have to add new column for each new quarter:

enter image description here

This model is also well suited for clustering (if your DBMS of choice supports it), so the calculations belonging to the same company are stored physically close together in the database, potentially lowering the I/O during querying. Alternatively, you may choose to cluster on year/quarter.

Upvotes: 4

MonkeyDoug
MonkeyDoug

Reputation: 463

I have a great deal of experience with stock data. Having tested quite a few methods, I think for a simple free method you should try SQL Server. The amount of data you are working with is just too much for Access (I imagine this is not the only calculations you would like). You can use SQL Server Express for free.

For this design I would create a database within SQL Server named HistoricalGrowthRate. I would have a table for each stock symbol and store the data in there.

One way to accomplish this is to have a separate database with a table that contains all the symbols you wish to follow (if you don't have can use the CompanyList.csv from Nasdaq). Loop through each symbol in that table and run a create table in HistoricalGrowthRate. When you wish to populate the values, simply loop again and insert your values. You could also just export from Access, which ever is faster for you.

This will decrease the load when you call for the information and provide an easy way to access the info. So, if you want the historical growth rate for AAPL, you simply set the connection string to your HistoricalGrowthRate database, refrence table AAPL and the extract the values.

Upvotes: 1

Ricardo
Ricardo

Reputation: 11

I think that the main problem might be the way you designed the database. A column for each quarter doesn't sound very good practice, specially when you have to change your DB schema every new quarter.

You could start with a MS Access database and then if you have any performance problems with it, migrate to a SQL Server database or something.

Again, I think that you should take a carefull look at your database design.

Upvotes: 1

Gyhth
Gyhth

Reputation: 1165

Excel has a limit in regards to sizes of the sheets, and you shouldn't really ever use it as an explicit "database" for anything wish you wish to port over to different structures. It's good for things like spreadsheets and accounting in general, but you shouldn't use it for an absolute-truth database as is understood in computing. Also, Excel has a limit on the number of records that can be contained: Worksheet size 65,536 rows by 256 columns as of 2003

Access may work for this, but with the number of records you're looking at, you'll probably begin to experience issues with file sizes, slowdowns, and just general things like that. In situations when you start having more than 3,000 records at a time, it's probably better to use one of the big RDBMs or something like that; Oracle, MySQL, SQL Server, etc.

Upvotes: 2

Related Questions