Reputation: 1195
I am afraid that this question might be a little too broad, but I have little-to-none knowledge when it comes to database design, so I will explain what I have so far before I get to the question.
I have an excel file, which I exported into a csv file, which has the following (simplified) layout:
Region, License Number in 1st Year, License Number in 2nd Year, License Number in 3rd Year...
The idea is that the licenses are incremented and at the end of every year, I have a record of what is the highest license at the time for each region. Because there are almost as many regions as there are years, it makes sense the previous layout for an excel file because we have the data displayed in what is close to be a square, but for the database I believe that it could be easier to deal with it if I reduce the number of columns and I would have this:
Region, Year, License Number
Assuming that this is right (again, I am clueless when it comes to database design) my question is the following: how can I possibly convert all the data to the suggested format without having to manually create the (many) entries for each region and copy all the licenses into the corresponding years?
The goal is to be able to tell what year a license was issued by simply looking at the license number
Upvotes: 0
Views: 117
Reputation: 95572
Looks like you're on the right track. You'll want a primary key on {region, year}.
If you're relatively new to databases, the procedure below will be the easiest both for you to understand and for you to get right the first try.
One problem is that you don't appear to be storing the years. If 1st year means the same year for every value in that Excel column--that is, 1st year doesn't mean 2011 for one row, and 1999 for another--then you might be able to massage the data into shape easily just using a second worksheet.
If you think you're comfortable running queries, you can try this procedure.
Create a table for your data. I'm guessing at the data types.
create table your_table_name ( region varchar(35) not null, year integer not null check (year >= 1955), license_num integer not null check (license_num >= 0) );
Export your spreadsheet to CSV file. Consider changing the column names to the actual years.
Insert data into your_table_name.
insert into your_table_name (region, year, license_num)
select region_column_name, 1955, license_num_column_for_1955
from imported_table_name
Note carefully that you'll need to adjust the literal year and the name of the column for that year each time you run the query.
Upvotes: 1
Reputation: 16677
well.. i'm not real clear on License Number - i think you mean a total (int) count of the number of licenses for the region in that year.
you probably have more info about each region as well.
i assume also that you do not have each individual license, but instead you are just maintaining the count. (if you want to store each license then this answer would be different)
so you will need a region table - something like:
region
-------
region_id
name
other_info_possibly
then either a license table to store each license - like this
license
---------
license_id
region_id
license_issue_date
license_value
other_info_possibly
or a summary maybe like this:
license_summary
----------------
region_id
license_count
year
next, you need to decide where and when to manipulate your original data: before or after importing to the database.
it might be easier after... in this case, you need a place to dump it in 'raw' maybe :
raw_data
---------
region
year1_count
year2_count
-- etc
then you can write inserts from this raw_data table to populate the other tables.
Upvotes: 0