Data modeling in Azure CosmosDb - Part 2

In this article, we will begin with a relational perspective to design an ecommerce application and gradually transition to a document-oriented approach.

Developers have long been accustomed to modeling data with relational databases. This paradigm is easy to understand and, in fact, allows us to implement many problems simply. Initiated by Codd in the seventies, it has since widely diffused, becoming a cornerstone technology, often unquestionably associated with SQL derivatives. However, since the late 2000s and the necessity to manage vast amounts of data efficiently, new concepts have emerged, introducing novel ways of modeling data. Shifting one's mindset to this new approach can be challenging, and it is our aim here to clarify all the notions involved.

What are the conventional approaches to data modeling ?

Traditionally, data was modeled with a relational database and involved several key steps.

  • Identify the main entities in our system (objects or concepts about which we need to store information like customers, orders, products).
  • Define the relationships between entities and determine how entities are related to each other (e.g., a customer places an order, an order contains products, etc…).
  • Apply normalization techniques to eliminate redundancy and improve data integrity. Normalize tables ensure efficient storage and prevent unnecessary duplication.

We will now witness an example in action.

How to model an ecommerce application in a relational database ?

We will start with a simple example that everyone is familiar with: an online store where people can place orders, view products, and write reviews. From a relational point of view, we first need to define the entities and the relationships between the different concepts. Let's briefly review them.

Products

Products are items sold on the online store. They are primarily characterized by their ID, name, description, unit price, image, URL and a category.

FeatureType
Idunique identifier
Namestring
Descriptionstring
UnitPricedecimal
ImageUrlstring
Urlstring
CategoryIdunique identifier
......

Customers

Customers are individuals who purchase products on the online store. They are characterized by their ID, name, address, and other essential data.

FeatureType
Idunique identifier
Namestring
Countrystring
......

Orders

Orders are placed by customers to purchase products. They are characterized by their reference and a list of order lines.

FeatureType
Referenceunique identifier
CustomerIdunique identifier
LinesList
......

Reviews

A review on a product is typically a written evaluation or assessment of a product by a customer. It often includes feedback and ratings regarding the product's features, performance, quality, and overall satisfaction.

FeatureType
ReviewIdunique identifier
CustomerIdunique identifier
ProductIdunique identifier
Commentstring
Ratinginteger
......

Once implemented in a relational database (for example, MySQL or SQL Server), the entities and relationships can look like the following.

Here are some fundamental aspects of this schema:

  • It comprises 8 tables. Typically, in SQL, it is a standard practice to create a table for each new concept or relationship between concepts and some databases can contain hundreds or thousands tables.
  • Each table adheres to a predefined and rigid schema, containing structured data. This enforces referential integrity.
  • SQL provides the capability to query a wide range of data. While some queries will execute incredibly quickly, others may be unreasonably slow.
  • SQL is a transactional store, allowing the datastore to maintain consistency, albeit at the cost of occasional performance trade-offs.
  • Scalability may pose a bottleneck.

Our objective is to transition this SQL schema to a document-oriented database schema. It's important to note that we are not passing judgment on whether SQL is superior or inferior to its document-oriented counterpart. Our aim is to illustrate how one can migrate from one model to the other, highlighting the key aspects of each. This is the topic of the next article.