Application developers can choose between two categories of database: SQL (Structured Query Language) and NoSQL (Not Only SQL). All databases store data, but there are some significant differences between SQL and NoSQL:
-
SQL databases, also known as relational databases, are like large, formalized Excel spreadsheets. They are organized into rows and columns, and the definition of tables and rows is called a schema. SQL databases have been used by developers for over 40 years.
-
NoSQL databases, or non-relational databases, are used as document stores, graph databases, key-value stores, and wide-column data stores. NoSQL databases tend to sacrifice some robustness to gain speed and scalability. They have gained popularity and widespread adoption mostly in the past decade, especially for handling large volumes of unstructured data.
Of the top 10 results on DB-Engines’ list of most popular database management systems in May 2024, seven were relational, or SQL-based.
So what are the differences between SQL and NoSQL, and which database should you use for your next application? In this article, we will explore the pros and cons of each.
SQL databases
The SQL language is an American National Standards Institute (ANSI) standard, with some dialects, like PL/SQL (in Oracle) and T-SQL (in Microsoft SQL Server). The advantage of writing ANSI-compatible SQL is that you can easily transfer your scripts to another SQL database.
Popular SQL databases include Oracle, MySQL, Microsoft SQL Server, and PostgreSQL. Below, we’ll look at some of the features that make SQL a popular choice among developers.
SQL is relational
You might think SQL databases are called “relational” because you can define relationships between records using foreign keys. However, the term actually comes from the mathematical concept of a “relation,” which is a collection of unique tuples. A tuple is just an ordered collection of values.
In an SQL database, a relation is represented as a table, with each tuple in the relation making up a row (commonly called a record) in the table. In relational databases, you can (and generally should) normalize your data by not storing it redundantly. In the past, storage was expensive, and normalizing your data saved storage.
SQL is robust
A database with a schema has pros and cons. On the one hand, you always know the entities and values your application expects. On the other hand, it’s not good at dealing with dynamic data.
Having a schema means you can validate your data. For example, the ID field must be unique and may not be NULL (empty). You can also force a foreign key relationship, meaning that a record can’t reference a record that doesn’t exist in your database.
A normalized database with validations in place keeps your data reliable. With other SQL perks, like transactions, SQL databases are generally fast, reliable, and robust.
NoSQL databases
A common misconception about NoSQL databases is that the “No” means no SQL is used in the database at all. As mentioned earlier, the “No” stands for “not only.” You may find some SQL in NoSQL databases.
Another source of confusion is that there is no single definition of a NoSQL database. In fact, there are four broad categories of NoSQL database:
- Document stores
- Graph databases
- Key-value stores
- Wide-column data stores
One of the most widely used NoSQL databases, MongoDB, ranked fifth on DB-Engines’ list and was the highest ranked of the four non-relational databases included in the top 10.
Some databases, such as Cosmos DB, span different categories, but NoSQL databases are rarely interchangeable and typically look nothing alike. One thing they generally have in common is that they sacrifice some robustness to gain speed and scalability.
Document stores
The most popular type of NoSQL database is the document store. Document stores look the most like traditional SQL databases, except there is no schema and no normalization. Instead of columns and rows, you simply have a collection of whatever it is you put in.
Adding a new field to an entity is easy, but it means some entities have this field defined while others don’t. You can store the same entity multiple times and with different values. You can also easily make a mess out of things!
Even so, these databases excel in environments with highly dynamic data and scale much better than SQL databases. Document stores can generally run on multiple servers, while SQL databases are usually tied to a single server. Since document stores don’t have all these pesky field validations, they’re lightning fast.
Popular document stores include MongoDB, DynamoDB, Couchbase, Firebase, and Cosmos DB.
Graph databases
The graph database is a specialized niche type of NoSQL database. The most common use case for this type of database is the “people you may know” example. Imagine some social website, like Facebook or LinkedIn, which shows you people that your friends know.
In a graph database, all these people are represented as nodes, and the relationships between them are represented as edges. To find all the friends of your friends, you would start with a node and simply “walk” the edges. You would first walk the edges to your friends, and then walk their friends’ edges. Assuming you had 200 or 300 friends, and each of those friends had 200 to 300 friends themselves (with some overlap), you would end up finding between 20,000 and 60,000 nodes. You could go deeper by simply checking all edges of those nodes.
With a big enough dataset, a graph database takes seconds to fetch all these friends of friends. An SQL database would quickly bog down doing this. It would need to match millions of users, each with millions of users, and all those with their own millions of users and ultimately filter billions of (double) users.
If you need a graph database, some popular ones are Neo4j, ArangoDB, and Cosmos DB.
Key-value stores
Probably the most straightforward NoSQL database is the key-value store. As its name suggests, the key-value store holds collections of key-value pairs. The value can be anything, from a numeric value to a complex object with sub-objects.
It’s not widely applicable, but the key-value store is perfect for use cases such as caching or storing session data.
Redis, Memcached, and Cosmos DB are popular key-value stores.
Wide-column data stores
The wide-column data store looks a bit like the key-value store. However, instead of having a single value, a key holds access to columns.
A value can consist of billions of columns and can be dynamic. Imagine a schema-less SQL database, or a document database, inside a key-value store.
Wide-column data stores are scalable and can hold up to petabytes of data. Their use cases vary, such as time-series data (like CPU use over time for multiple servers), financial data marketing, internet of things (IoT) data, and graph data.
Popular databases of this type include Cassandra, HBase, Bigtable, and Cosmos DB.
Other NoSQL databases
NoSQL includes other types of databases, like databases centered around flat text files. Also, keep in mind that we can classify everything we had before SQL as NoSQL. Even a search engine is a kind of NoSQL database.
Search engine databases specialize in finding data content. They typically support complex search queries, full-text search, result ranking and grouping, and distributed search for high scalability. Elasticsearch, Solr, and Splunk are popular search engines.
You may have noticed by now that Cosmos DB, a cloud database running in Azure, is a database that does just about everything. There are several multi-model databases that can store data in multiple ways. Amazon has its own DynamoDB, which is a multi-model database running in AWS.
There are some limitations to multi-model databases. You can’t use the different approaches in a single database, for example, but you can create multiple instances and use a different method on each.
NewSQL databases
Sometimes, NoSQL is your only option. However, SQL databases have caught up and now offer some NoSQL perks while still being SQL. For example, databases such as Oracle and SQL Server enable you to store dynamic JSON and even use indices and filter queries on those values.
Some databases take it a step further. Snowflake, for example, is a decentralized SQL database hosted in the cloud. It solves the challenge of SQL not being scalable while still remaining SQL based. These types of databases are often called NewSQL.
To give you an idea of how popular NewSQL databases are, Snowflake moved up a whopping 112 spots in the DB-Engines ranking between September 2020 and May 2024, climbing into the top 10 at position 9.
Other popular NewSQL databases include CockroachDB and Spark SQL.
SQL vs NoSQL: How to choose
It may be hard to choose the right database for your team. You may be told “pick the right tool for the right job.” The right tool may just be the tool your team already knows. A highly-rated yet unfamiliar database may hurt your project, while a lower-rated, more familiar tool may be enough to do the job.
If you decide to use any new database, make sure your team gets the training and guidance they need to implement it correctly.
SQL is usually a good choice and a strong all-arounder for most projects. For more specialized work though, a NoSQL database may be the better choice. For example, Redis has become a popular choice for caching. And if you’re looking for a fast and scalable database and have no problems sacrificing some robustness, MongoDB may be just what you need.
Avoid pursuing the latest and greatest just for the sake of newness. Programmers may like the idea of new tech, but what’s hot today may be discontinued five years from now. It’s challenging to find people or support for a discontinued product, and replacing a database mid-project is likely a costly and time-consuming task.
Ultimately, the answer to what database you should use for your next project is: it depends. Luckily, with modern architecture, the choice between SQL and NoSQL isn’t an either-or option. They can exist side-by-side in the same application landscape.
Conclusion
Both SQL and NoSQL are useful for software development teams. NoSQL databases can incorporate SQL elements, while SQL databases can offer some of the benefits of NoSQL through new features and full-fledged NewSQL databases. When choosing your database, consider your needs and what makes the most sense for your team, SQL, NoSQL, or even NewSQL.
Once you’ve chosen the right database, it’s crucial to integrate it effectively into your development workflow. This is where continuous integration and continuous deployment (CI/CD) come into play. CI/CD ensures that your database changes are automatically tested and deployed, leading to more reliable and efficient development processes.
You can sign up for a free plan and start experimenting with CI/CD to streamline your database management and improve your team’s productivity.