SQL vs NoSQL, the ultimate database match

SQL vs NoSQL, the ultimate database match

In the blue corner, SQL stands with an arched back and titanium walker. In the red corner, NoSQL maintains steady movement and a toddler’s impatience.

Created in the early 1970s, SQL was the unrivaled choice for applications both large and small that were in need of storing, managing, and retrieving data. Deciding to implement a SQL-driven database was a no-brainer for developers and system architects because it was unmatched by its alternatives not only in reliability but also in its reporting capabilities. In recent years, however, SQL, the once unrivaled query language, is being replaced slowly but surely by the new kid on the block: NoSQL.

Let’s take a moment to evaluate and discuss the differences and similarities of these database models with a few areas of focus. For the sake of this discussion, we will use MongoDB as the NoSQL example and MySQL for the SQL example, but the assertions for NoSQL are not limited or specific to Mongo but also apply to numerous other NoSQL campers like MapReduce, Bigtable, Cassandra, and others. Many of these NoSQL technologies have similar capabilities and advantages, as well as weaknesses. Similarly, on the SQL/RDBMS side, we will reference MySQL, but the same ideas carry over for other solutions like Oracle, SQL Server, and others.

The Back Story
We first need a bit of backstory. We’ll begin by asking ourselves what SQL vs. NoSQL really means. To answer this we have to understand what a database and a database system are. Now, bear with me here, without trying to bore you to death, we may agree that a database is a system for storing and managing data and a database system is a program for managing databases. Got that? So next, the data in a database has to be organized in some way. The way in which various database systems choose to organize their data is referred to as a database model, and this is where SQL and NoSQL come in. On the surface, think of SQL as the relational model and NoSQL as the non-relational model. Diving deeper, we find that the relational model uses set theory and predicate logic, in which the data looks like it is organized in tables and columns, while the NoSQL model stores data as key-value pairs without any strict relation to each other. End of backstory; you’ve uncovered the plot, so now let’s jump straight to the three-round fight scene.

Ding!

Data scientists and architects assert that SQL is not out for the count, and I agree. Relational databases have a huge advantage compared to these wannabe newcomers primarily because they have excellent tooling, community, and support. The “why change what’s working?” question comes up pretty often when developers and system architects are faced with the decision of rolling out a data management tool. SQL offers two main advantages: first, it introduces the concept of accessing many records with one single command; and second, it eliminates the need to specify how to reach a record; e.g., with or without an index. Since most NoSQL databases lack the ability for joins in queries, the database schema generally needs to be designed differently and lends itself to some inefficiencies, giving SQL round one of this match by a small margin.

Popular NoSQL databases were pioneered by top Internet companies like Amazon, Google, LinkedIn, and Facebook to overcome the drawbacks of the relational model. The relational model is not always the best solution for all situations as it cannot meet the increasing growth of unstructured data.

NoSQL is better for unstructured data

You may have paused to ask yourself why some organizations have unstructured data. Well, social media posts and multimedia are only two examples of unstructured data, which should answer that question, but more importantly, since the world’s data is doubling every two years, companies now have a much greater need to not only store and accommodate unstructured data, but also to aggregate and use this data meaningfully. This is where the relational model takes a hit to the face giving round two to NoSQL.

Round three has three main focus; performance, planning, and price. Ideally, this is where the rubber meets the road in this decision. With NoSQL, instead of retrieving all the data

with one query, it is common to do several queries to get the desired data, but NoSQL queries are often faster than relational SQL queries so the cost of having to do additional queries may be acceptable. So if an excessive number of queries would be necessary and the company’s culture values performance, SQL or NoSQL may take the performance point for this round.

Planning is the second judge and it is very important, yet subjective. SQL databases have predefined schemas while NoSQL databases use a dynamic schema due to the nature of unstructured data. Therefore, SQL databases can be scaled vertically whereas NoSQL databases can be scaled horizontally. This means that to scale an SQL database, we simply increase the processing power of the hardware, while to scale a NoSQL database, we increase database servers in the pool of resources to reduce the load. Based on the organization’s planning culture, either SQL or NoSQL can take the planning point and this leave price as the final factor.

All companies make an effort to keep cost low so price an important factor. Most, if not all NoSQL databases are open source or very very low cost, which makes them very appealing. For the example, MongoDB, Couchbase, CloudDB, and Amazon’s Dynamo DB all allow very affordable implementations. On the other hand, among the major players in the SQL space, only MySQL (and its fork MariaDB) offers an open source solution. Both Oracle and SQL Server can be very expensive solutions due to the support they offer, which is usually readily available from their vendors or a fair number of independent consultants. NoSQL databases rely mainly on community support.

The Judge’s Decision
Depending on your vantage point or who you ask, some may agree that this match could have ended in a technical draw a few paragraphs prior, where unstructured data was a definite need and nothing else mattered. Another vantage point may have revealed that this match has ended in a TKO if no budget was allocated to database implementations and management couldn’t care less about database models, giving priority to an open-source-only solution. Ultimately, you decide who wins round three and therefore the match.

Despite your vantage point or decision, we all can agree that SQL and NoSQL serve very different purposes, with their own unique strengths and weaknesses; without them, developers and system architects would have been pulling their hair out trying to figure out how to utilize data locked away in filing cabinets to generate predictive analysis and graphs.

Ding!

 

Editor’s Note: 

Learn more about what our developers choose as their everyday tools and what our thoughts on these are:

Comparing React Native to Axway Titanium 

Kotlin: Three Reasons to Start Using it Today 

Node.js – Storing data with MongoDB

Node.js – Storing data with MongoDB

Node.js – Storing data with MongoDB

The last few weeks we have focused on web service design using Node.js with Express. So far we have covered getting started with express, using EJS to create web pages, and creating a push notification server. Today we will go over how to save data on your node project by using MongoDB.

This post will assume that you have MongoDB installed on your machine already. Check out the MongoDB Installation page if you do not have it setup.

First, let’s go over MongoDB a bit. MongoDB is a document oriented database. This allows you to just shove whatever JSON you want into it. You don’t have to declare a table structure like you do in SQL. You don’t even need to declare your collections at all. If you try to use a collection that doesn’t exist it just gets made for you immediately. Documents are stored in collections that you define, but documents within a collection don’t have to be similar at all. MongoDB also features an amazing query system. Imagine we want to store information about fruit. We could do the following:

We now have a fruit collection containing one fruit: an apple. Let’s add a few more:

Now we have a strawberry and an orange in our fruit collection. To get all fruits in our collection that come from trees we would do:

If we wanted to find all fruits that are red we would do:

And to find all fruits that are yellow or orange we would do:

Now apples can be green too. Let’s modify our apple entry to reflect that:

There are a few interesting things to be aware of when updating a document. By default, mongodb will only update the first entry it finds. You can set the multi option to force it to update every entry that matches the query. Also, note the $push operator I have included in the update object. That is telling mongo it will be adding an entry to an array on the document. If you left off push and just did:

You would be very disappointed. Setting no update operator will replace the document completely:

So what if you wanted to update a field that isn’t an array? You use the $set update operator:

MongoDB is very powerful and an excellent document store, but it lacks the join power that SQL brings to the table. You can join documents together by their _id field using MonoDB’s populate, but it’s not nearly as fast as what you can do with a few SQL joins. I encourage you to explore MongoDB as an option for your databases, but it’s important to approach your database design differently with MongoDB than with SQL. Be sure to check out the documentation for all you can do with MongoDB.

So how to we start using Mongo with Node? There are several libraries available on npm for connecting to Mongo, but I will be focusing on Mongoskin and Mongoose.

Let’s go ahead and get Mongoskin out of the way. Mongoskin is what you want if you just want to run basic MongoDB commands and nothing more. To get started add Mongoskin as a dependency:

And then in node:

Mongoskin is very easy to use and most MongoDB commands are supported. It has its issues, but if you want something simple to connect to a mongo database look no further. If you want the best MongoDB experience you can find with Node, keep reading!

On to Mongoose. As they say on their website: “Mongoose provides a straight-forward, schema-based solution to modeling your application data and includes built-in type casting, validation, query building, business logic hooks and more, out of the box.” Mongoose is a delight. To get started add Mongoose as a dependency:

It’s nice that MongoDB doesn’t need to have schemas, but it’s great that Mongoose gives them to you. Let’s put together a schema for our fruit collection:

Most of the schema above should be straight forward. Name must be a string or Mongoose will remove the property from the object before using it as a model. Color must be an array of strings. Source must be a string, and if it’s not present Mongoose will automatically set it to “tree”. Default values will not get stored in the actual database, but they will be applied when Mongoose pulls them out. The virtual “numberOfColors” will add a property automatically. Virtuals are great for properties that can be calculated based on the existing data on the model.

To use the fruits model:

The examples above barely scrape the surface of what you can do with MongoDB and Mongoose. Do yourself a favor and explore the possibilities of a document oriented database backed by MongoDB.