Reputation: 616
I am working on a batch processing application that allows users to submit requests for information about particular vehicles. Users can submit a request either using a VIN or a License Plate/State combination. I proposed the following table structure:
VehiclesToBeProcessed
vehicle_id(fk)|user_id(fk)|status|start_time
Vehicles
vehicle_id|VIN|plate|state
vehicle_id(fk)|user_id(fk)|status|start_time
Vehicles
vehicle_id(pk)|field|value
where an entry in Vehicles would either consist of one row for a vin:
1|"vin"|"123
or two rows for a plate/state:
2|"plate"|"abc 123"
2|"state"|"NY"
I thought that the first solution would be much easier to query without having any significant downside. Which design should be preferred? Are guaranteed null fields really an indicator of bad design?
Upvotes: 2
Views: 87
Reputation: 48246
Nulls are fine. They are particularly useful for Single Table Inheritance, and if your system needs "Draft" entities.
If you use a quality database like Postgres, there is no storage penalty for nulls.
Anyways, if the problem is "we need A OR B, and A and B are pretty darn similar" then the answer is almost always Table Inheritance. If you want to move fast then use Single Table Inheritance. If NULLs make you sad, then use Class Table Inheritance.
--STI:
create table vehicle_identifiers (
id int primary key,
type text not null check (type in ( 'VIN', 'STATE_N_PLATE' ) )
vin null,
state char(2) null,
plate text null,
check ( ( type='VIN' and vin is not null ) or ( type='STATE_N_PLATE' and state is not null and plate is not null ) )
);
--CTI:
create table vehicle_identifiers (
id int primary key
);
create table vehicle_identifiers_vin (
id int primary key references vehicle_identifiers(id),
vin text not null
);
create table vehicle_identifiers_state_n_plate (
id int primary key references vehicle_identifiers(id),
state text not null,
plate text not null
);
Upvotes: 0
Reputation: 18408
What your colleague proposed is about the ultimate antipattern in database design.
Google for Bill Karwin's "antipatterns" book and for "EAV".
Ask your colleague how he proposes to enforce that "plate" and "state" values always appear in pairs in his database. If he points to the application code, ask him how he proposes to enforce that the database will only ever get updated through his application.
Your solution is a thousand times better than his. Still "better" (from the perspective of relational purity which involves avoiding all nulls) is to give each type of request its own table :
VehicleQueriesByVIN
user_id(fk)|status|start_time|VIN
VehicleQueriesByPlate
user_id(fk)|status|start_time|plate|state
If historical trace is to be kept of the statuses over time for each query, that stuff has to be singled out in its own table.
Upvotes: 2
Reputation: 15175
Think of it as multiple ways to identify a vehicle. You vehicle has one or many identities. The local police may identify your vehicle using LPN while the parking authority may use permit badges or active/passive transponders, furthermore the dmv probably relies on vrn numbers.
If you really want to build a flexible way to bind a vehicle to multiple identities I would use an Identity type table so a vehicle can have one or many identities.
VehicleIdentity
VehicleIdentity PK
VehicleID FK
IdentityValue
IdentityType (Type)
StateID??
Vehicle
VehicelID PK
I updated the answer by removing a table that I see would be of no use :)
Upvotes: 0
Reputation: 81684
In a word: no. This is a case of misplaced optimization. His schema will actually take up more space on average, due to storing the strings; and of course the more complex code and queries will have worse performance.
Upvotes: 0