Which database examples should one know? What are the differences?
Which one should I choose?
These are the questions this article aims to clarify.
Let’s begin!
- 1 So many databases – Why?
- 2 8 examples of databases
- 2.1 Relational databases – The enduring classic
- 2.2 Graphs – displaying complex data in a simple way
- 2.3 Object-oriented – close to the software
- 2.4 Hierarchical – structure and order
- 2.5 Columns – The powerful one
- 2.6 Key-Value – simplicity and speed
- 2.7 Hadoop – Big Data King
- 2.8 Document oriented – More than thought
So many databases – Why?
There are hundreds of databases on the market. Their task is to store, change and read data. Depending on the application, budget and hardware, you need a different database. The old world only knew databases in the form of a table. New problems need new storage concepts.
Databases are divided into 2 basic types:
Relational – SQL
The word relation from mathematics is a set of ordered pairs – pragmatically: tables. Tables consist of rows, the database entry and the columns with the column names/attributes.
Pure relational databases can only query, add, manipulate and delete records in this grid.
Not ONLY relational – NoSQL
NoSQL is a collective term for database types that are not only based on tabular structures. Depending on the type, it offers schema freedom, good scalability and new programming paradigms such as MapReduce. NoSQL never completely displace SQL databases because relational databases are best suited for certain tasks.
Which database should I choose?
The choice of database can be crucial to the success or failure of your IT project. Take your time and analyse which database is right for your project. The database is the foundation, which is very difficult to replace in the ongoing development.
- How long should the system exist?
- What are the requirements for the project?
- Can you estimate how much data in 3 months, 1 year or 10 years the database needs to hold?
Consider the following aspects:
- Scale: NoSQL databases scale better than SQL databases because they can be spread across many small servers. SQL databases are a good choice if you know that the database will not grow 100-fold by the end of its lifetime (e.g. a human resources database at a university)
- Speed: Do you need a high read speed or a high write rate for your project? A materialised view or a read and write channel(CQRS) increases SQL database performance.
- Data type: The data types to be stored can determine the database type. If you need to manage different documents (JSON), Couchbase or MongoDB are suitable. A Big Data cluster with Hadoop processes different data types in the same system.
- Accuracy or speed: The strict behavioural rules such as ACID in relational databases or the softer rule set BASE describe database properties (behaviour) that determine whether this database type is suitable for your software.
You can create a pro-contra table for your implementation and rate it with the 4 points in each case.
8 examples of databases
Relational databases – The enduring classic
Relational databases use tables to store data. Relationships can exist between tables via foreign and primary keys. The primary goal is to ensure the consistency, integrity and permanence of the storage. If an error occurs in the storage process, the database rolls back the changes.
application: Finance/transactions, business data and accounting transactions
products: MySQL, PostgresSQL
pro
- High reliability and accuracy
- fixed schema
contra
- Scales poorly for large amounts of data
- Merging tables takes up a lot of main memory
Graphs – displaying complex data in a simple way
A graph consists of nodes (entities) and edges (relationships). Attributes describe the nodes in more detail (JSON documents). The relationships can be directed (Y is son of X) or non-directed (Z and A are friends).
application: graphs describe complex relationships e.g. permissions, genealogy, social media, networks in general
products: Neo4j
pro
- Query and recognise complex relationships
- Realistic modelling
contra
- certain queries are very cumbersome
- Migrations are time-consuming
Object-oriented – close to the software
Object-oriented programmes (e.g. Java) use object-oriented databases to persist (store) the generated objects in RAM. This type of database should relieve the developer of some work. To date, this type has not become established.
application: Every object-oriented programme (Java)
products: ObjectDB
pro
- less development effort
- no branch action level in between
contra
- inflexible / portable?
- limited applications
Hierarchical – structure and order
A hierarchical database describes a tree with a root, branches and leaves containing data. This structural form subdivides the data. This idea has not caught on and relational databases have become established.
Application: The file system of an operating system corresponds to a hierarchical database. A folder structure on a computer allows the same form of storage. Some file systems work similarly to a database and prevent simultaneous writing.
products: LDAP, Windows file system
pro:
- natural partitioning and storage
- uncomplicated / no installation necessary
contra:
- Updating cumbersome
- Reading cumbersome
Columns – The powerful one
relational databases use rows to retrieve data. If a program calls the last attribute in the row, the database iterates over all attributes. Column databases address the columns with an ID. This type supports the bundling of columns into column families, which your application retrieves in this constellation at recurring intervals.
application: NoSQL with schema brings better schedulability.
products: Cassandra
pro
- fast reading
- partitioning of tables
contra
- Addressing cumbersome (column, column family, timestamp)
Key-Value – simplicity and speed
This database stores a value for a key. This value can be a number, a string, a document, an image or any other form of data. You can easily store any value that you can quickly and easily access via the key.
application: Highscore for games, caching content, PHP variable storage
products: Redis
pro
- simplicity
- speed
contra
- No knowledge of value (type, size, etc.)
- Key design determines success
Hadoop – Big Data King
The Hadoop database can handle petabytes of data. Heterogeneous data types such as JSON, CSV etc. are handled by the database’s file system (HDFS). The programmer can use MapReduce to perform calculations with the enormous amounts of data. The developer sends programming instructions to each individual computer, which calculates (Map) and then summarises (Reduce) the results. The client receives the results (if they are not too large).
application: Hadoop is necessary wherever there is a heterogeneous data structure with huge amounts of data.
products: Hadoop, Google Big Table (not available)
pro
- large amounts of data are possible
- MapReduce offers new possibilities for data processing
contra
- Reuse of MapReduce is difficult
- Non-trivial condition and queries
Document oriented – More than thought
instead of storing rows in a row, a document-oriented database stores the document barely unchanged as a BSON with a document ID. This schema allows for more efficient queries and avoids impedeance mismatch (JOINS across multiple tables). The NoSQL design determines whether the queries are performant and how the application has to work with the data.
application: e-commerce, customer databases, (film) ratings,
products: MongoDB, Couchbase
pro
- efficient query
- no JOINs
contra
- updating because of schema freedom
- redundancies