Reputation: 21
I am doing some work for a fantasy golf website.
Very brief background: in professional golf there is a big emphasis on course history when evaluating how a player should do - there are some courses that just suit a golfer's game and you see him consistently performing well there and outperforming finishes throughout the year at other courses. There is some evidence to suggest that many courses on the tour have "corollary courses" - that is, say player 1, 2, 3 all finish in the top 20 for 3 years straight at Course A. Now, those 3 players also finish in the top 30 at Course B and C 80% of the time over a six year stretch. That would then be evaluated further as a potential corollary course. Obviously this is a very basic example so don't read too much into the math behind that, just trying to give you an idea.
Here is what I want to do and have no idea how to go about doing, haha...
I want to build a database or spreadsheet or whatever it may be, that tracks the top 30 at every event on the PGA tour. I want it to factor in the year, course name, player name, player finishing position, and player's average score per round. I would go back and input this information every time unless I have some sort of scraper do it for me, but I'd rather do it myself as I don't think it's overly time consuming as a project over a few weeks and it actually helps me to go back and check these things anyways.
From there, let's say I have this database of information stored somewhere. It's March 2017 and Tournament A approaches. What I then want to do is be able to look through this database and see - ok, in 2016 these 5 players stand out as having played well at Tournament A - what other courses did those 5 players all make the top 30 at that year? Or what courses have a few of those players all done well at over the past 5 years?
Yes, this may be time consuming, but it's not necessarily something that needs to be done soon. I want to build it up over time and add to it.
Ultimately my questions as a tech noob (that learns fast) in this area:
1) How do I build this "database" and add information to it?
2) How do I go about referencing it and using the information?
Upvotes: 1
Views: 154
Reputation: 108
It looks like you'll need four tables - Players, Courses, Tournaments, and Results. Courses and Tournaments should be separate because every once in awhile, a course is used twice during the season (e.g. Pebble Beach when it hosts the U.S. Open).
The Players table will contain demographic variables for each player (name, age, etc). The Courses table will contain variables for course-specific information (name, distance, fairway grass, greens grass, designer, etc). The Tournaments table will contain variables for name, date, weather, etc AND will relate to the Courses table via a foreign key to the Courses table primary key.
The Results table will, in essence, be a linking table for the many-to-many relationship between the Players and Tournaments tables. It will also have variables for finish-position, final-score, fairways-hit, g-i-r, putts, etc.
How you add information to the database and then "reference" and organize the information will depend upon what database application (MSSQL, MySQL, etc) that you plan to use but will largely make use of INSERT and SELECT queries.
Upvotes: 1