Ralph

Reputation: 3029

Difference amongst database jargon

I just completed a database course in which we went deep into the internals of databases. I am trying to connect the dots between all the jargon we learned in the course, but I feel I am missing some of the pieces. Here is what I know:

I need clarification and guidance on my understanding so that I can connect the dots.

Upvotes: 1

Views: 80

Answers (1)

reaanb

Reputation: 10066

  1. Conceptual, logical and physical models are widely misunderstood. In my opinion:

    • Conceptual modeling is done in terms of the business domain, e.g. customers, employees, invoices, etc. We use this level to analyze business requirements and communicate with stakeholders. I use Chen-style ER diagrams for conceptual data modeling. Conceptual models are neutral with regard to implementation models.

    • Logical modeling is about formal logic, relational theory and consistency. Our focus is on domains, relations, dependencies, normalization, redundancy, etc. Relational diagrams are available, but I mostly just use plain text.

    • Physical modeling is about how to implement a logical model in a DBMS or physical system. If I'm targeting a SQL DBMS, my physical model will contain tables, data types, foreign key constraints, etc. Physical models can use any DBMS-specific features, such as certain special kinds of indexes or placement rules.

  2. Database management systems provide and manage all access to the data; they don't just optimize queries. They manage structure, storage, retrieval, integrity, optimization, authentication, authorization, and more. As you say, DBMS and database are often conflated. If you're comfortable with OOP, think of it this way: the DBMS is the object to which we pass messages. The database is its encapsulated state.

  3. Some DBMSs are built on top of a file system, and some bypass the file system completely. Some are even built on top of other DBMSs.

  4. Data isn't only stored on disks. In fact, the concept has nothing to do with storage. "Data" is the plural of "datum", a Latin word for "something given". Data first entered English in the context of statistical tables. Mathematically, data is association - a set of values in a domain, a variable containing a value, a relation between sets, etc. In computer systems, data generally refers to encoded values, which can be stored or communicated between functions and processes.

  5. Map/reduce isn't limited to distributed databases; think of it instead as distributable data processing. It can work with centralized or distributed databases. While some systems are based on or implement the map/reduce technique, it's not limited to those systems. Libraries are available in many languages, and the technique can be implemented by anyone with the required knowledge.
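
To illustrate point 5, here is a minimal sketch of map/reduce running in a single Python process, with no database or cluster involved. The `invoices` data and the function names (`map_phase`, `shuffle`, `reduce_phase`, `map_reduce`) are made up for illustration and don't come from any particular library; a real framework would spread the same three phases across workers.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical in-memory records: (customer, invoice amount)
invoices = [
    ("alice", 120),
    ("bob", 75),
    ("alice", 30),
    ("carol", 200),
    ("bob", 25),
]

def map_phase(record):
    """Map: turn one input record into a (key, value) pair."""
    customer, amount = record
    return (customer, amount)

def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(values):
    """Reduce: fold the values for one key into a single result."""
    return reduce(lambda a, b: a + b, values)

def map_reduce(records):
    mapped = map(map_phase, records)
    grouped = shuffle(mapped)
    return {key: reduce_phase(values) for key, values in grouped.items()}

totals = map_reduce(invoices)
print(totals)  # {'alice': 150, 'bob': 100, 'carol': 200}
```

Because each record is mapped independently and each key is reduced independently, the map and reduce steps could be handed to separate processes or machines without changing the logic. That is what I mean by "distributable".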

I hope this helps. Let me know if you need further clarification.

Upvotes: 1
