Created in the early 1970s, SQL was the unrivaled choice for applications both large and small that were in need of storing, managing, and retrieving data. Deciding to implement a SQL-driven database was a no-brainer for developers and system architects because it was unmatched by its alternatives not only in reliability but also in its reporting capabilities. In recent years, however, SQL, the once unrivaled query language, is being replaced slowly but surely by the new kid on the block: NoSQL.
Let’s take a moment to evaluate and discuss the differences and similarities of these database models with a few areas of focus. For the sake of this discussion, we will use MongoDB as the NoSQL example and MySQL for the SQL example, but the assertions for NoSQL are not limited or specific to Mongo but also apply to numerous other NoSQL campers like MapReduce, Bigtable, Cassandra, and others. Many of these NoSQL technologies have similar capabilities and advantages, as well as weaknesses. Similarly, on the SQL/RDBMS side, we will reference MySQL, but the same ideas carry over for other solutions like Oracle, SQL Server, and others.
The Back Story
We first need a bit of backstory. We’ll begin by asking ourselves what SQL vs. NoSQL really means. To answer this we have to understand what a database and a database system are. Now, bear with me here, without trying to bore you to death, we may agree that a database is a system for storing and managing data and a database system is a program for managing databases. Got that? So next, the data in a database has to be organized in some way. The way in which various database systems choose to organize their data is referred to as a database model, and this is where SQL and NoSQL come in. On the surface, think of SQL as the relational model and NoSQL as the non-relational model. Diving deeper, we find that the relational model uses set theory and predicate logic, in which the data looks like it is organized in tables and columns, while the NoSQL model stores data as key-value pairs without any strict relation to each other. End of backstory; you’ve uncovered the plot, so now let’s jump straight to the three-round fight scene.
Data scientists and architects assert that SQL is not out for the count, and I agree. Relational databases have a huge advantage compared to these wannabe newcomers primarily because they have excellent tooling, community, and support. The “why change what’s working?” question comes up pretty often when developers and system architects are faced with the decision of rolling out a data management tool. SQL offers two main advantages: first, it introduces the concept of accessing many records with one single command; and second, it eliminates the need to specify how to reach a record; e.g., with or without an index. Since most NoSQL databases lack the ability for joins in queries, the database schema generally needs to be designed differently and lends itself to some inefficiencies, giving SQL round one of this match by a small margin.
Popular NoSQL databases were pioneered by top Internet companies like Amazon, Google, LinkedIn, and Facebook to overcome the drawbacks of the relational model. The relational model is not always the best solution for all situations as it cannot meet the increasing growth of unstructured data.
You may have paused to ask yourself why some organizations have unstructured data. Well, social media posts and multimedia are only two examples of unstructured data, which should answer that question, but more importantly, since the world’s data is doubling every two years, companies now have a much greater need to not only store and accommodate unstructured data, but also to aggregate and use this data meaningfully. This is where the relational model takes a hit to the face giving round two to NoSQL.
Round three has three main focus; performance, planning, and price. Ideally, this is where the rubber meets the road in this decision. With NoSQL, instead of retrieving all the data
with one query, it is common to do several queries to get the desired data, but NoSQL queries are often faster than relational SQL queries so the cost of having to do additional queries may be acceptable. So if an excessive number of queries would be necessary and the company’s culture values performance, SQL or NoSQL may take the performance point for this round.
Planning is the second judge and it is very important, yet subjective. SQL databases have predefined schemas while NoSQL databases use a dynamic schema due to the nature of unstructured data. Therefore, SQL databases can be scaled vertically whereas NoSQL databases can be scaled horizontally. This means that to scale an SQL database, we simply increase the processing power of the hardware, while to scale a NoSQL database, we increase database servers in the pool of resources to reduce the load. Based on the organization’s planning culture, either SQL or NoSQL can take the planning point and this leave price as the final factor.
All companies make an effort to keep cost low so price an important factor. Most, if not all NoSQL databases are open source or very very low cost, which makes them very appealing. For the example, MongoDB, Couchbase, CloudDB, and Amazon’s Dynamo DB all allow very affordable implementations. On the other hand, among the major players in the SQL space, only MySQL (and its fork MariaDB) offers an open source solution. Both Oracle and SQL Server can be very expensive solutions due to the support they offer, which is usually readily available from their vendors or a fair number of independent consultants. NoSQL databases rely mainly on community support.
The Judge’s Decision
Depending on your vantage point or who you ask, some may agree that this match could have ended in a technical draw a few paragraphs prior, where unstructured data was a definite need and nothing else mattered. Another vantage point may have revealed that this match has ended in a TKO if no budget was allocated to database implementations and management couldn’t care less about database models, giving priority to an open-source-only solution. Ultimately, you decide who wins round three and therefore the match.
Despite your vantage point or decision, we all can agree that SQL and NoSQL serve very different purposes, with their own unique strengths and weaknesses; without them, developers and system architects would have been pulling their hair out trying to figure out how to utilize data locked away in filing cabinets to generate predictive analysis and graphs.
Learn more about what our developers choose as their everyday tools and what our thoughts on these are: