SQL (Structured Query Language) is designed for managing data held in a relational database management system (RDBMS). Its counterpart, NoSQL, refers to “non-SQL” for managing non-relational databases. NoSQL variants like MongoDB have evolved quickly as enterprises have begun to rely on a combination of structured and unstructured data for business intelligence.
As a database professional, can you make the transition to a NoSQL like MongoDB Query Language (MQL)? Naturally. Although the syntax and semantics may vary greatly, querying with NoSQL is based on the same concepts as querying with SQL.
In this post, I’ll cover some of the basics of working with MongoDB Query Language.
Structured and unstructured data examples
The difference between structured and unstructured data is a useful point of departure. Structured data follows a well-defined model or schema. Unstructured data is not organized according to any pre-defined model. Think of structured data in quantitative terms and think of unstructured data in qualitative terms.
The elements in a stream of structured data fit neatly into well-defined fields or columns; for example:
- Name
- Address
- Account Number
- Order Amount
- Price
- Inventory Count
You could have those labels in the top row of a spreadsheet and know that all of the entries in each column would conform to the labels. You can query and analyze the data easily because the stream is so well defined.
In a stream of unstructured data, on the other hand, the elements do not fit neatly into fields or columns. Examples of unstructured data include these:
- Video clips
- Mobile device activity
- Social media usage
- Textual documents
- Images
There is no definition for the first five seconds of any video clip or popular song, for instance. That’s why you could never be sure what you would see and hear if you used traditional tools to query for it. Similarly, there is no defined structure for social media feeds, where people write freely and use unpredictable language. That makes it difficult to use ordinary tools to search for favorable and unfavorable product reviews.
The origin of MongoDB Query Language
The creators of MongoDB developed a data storage engine with a dynamic, scalable and elastic platform suited to unstructured data. In 2009 they released the NoSQL product under the open-source license agreement together with commercial enterprise licenses.
MongoDB is a JSON-oriented (JavaScript Object Notation) document database, although internally it uses a binary-encoded variant of JSON called BSON. BSON is a serialized encoding format for JSON used mainly for storing and accessing documents. Compared to JSON, the BSON format offers lower parse overhead and richer support for data types such as dates and binary data.
Relational databases including SQL Server, Oracle and MySQL use SQL to access data. MongoDB developers use MongoDB Query Language.
SQL vs. MongoDB
The table below lists the main differences and similarities between SQL and MongoDB.
SQL | MongoDB |
---|---|
Relational database (SQL) | Non-relational database (NoSQL) |
SQL is for relational database management systems. The data store is in a structured format. | NoSQL is a document database in which the data is represented in the document. The database stores and retrieves documents, which can be XML, JSON, BSON and so on. These documents are self-describing, hierarchical, tree-data structures that consist of maps, collections and scalar values. The documents stored are similar to one another but not the same. |
For querying structured data | For querying unstructured data |
SQL (Structured Query Language) | MQL (MongoDB Query Language) |
Table | Collection |
Row or tuple | BSON or other documents |
Column (attribute) | Field |
Index | Index |
Primary key can be made of a unique key or a combination of columns | Primary key is Key (_Id) field of document |
Aggregation process is a grouped (group by) transaction. | Aggregation is a pipeline transaction. The process is good for applications with low data processing needs; beyond that, it becomes more tedious and the aggregation framework becomes difficult to debug. |
SQL database relationships are defined across different tables using foreign and primary keys. It is simple to establish a JOIN between the tables. | MongoDB Query Language does not support establishing a JOIN as in SQL; however, this can be achieved using embedded documents. |
The later or newer versions support querying of JSON values. For example, SQL Server 2016 and Oracle 11g onward support JSON queries. | Built-in support for JSON |
Supports triggers | Does not support triggers |
The query transactions adhere to ACID properties. | The query transactions adhere to the CAP theorem. |
Data operations are handled and managed by SQL. | Data ingress and egress are simple in MongoDB. You can use Java, Python, JavaScript, etc., to construct, manipulate and parse the query expressions. In fact, no parser is necessary. |
SQL is fully developed and feature-rich. | MongoDB query is still limited, especially compared to the rich features of SQL. |
No support for horizontal scaling | Horizontal scaling using partitioning key |
MongoDB Query Language (MQL)
MongoDB queries are based on JavaScript. The language is reasonably easy to learn and many tools are available to query MongoDB data using SQL syntax.
When querying data, you have an extraordinary range of options, operators, expressions and filters. The documentation compares MongoDB Query Language and SQL syntax for common database operations. The following table lists many commonly used commands and options you can run in the MongoDB shell to manipulate and query data.
Here are some code examples for queries, with MongoDB Query Language and SQL equivalents where applicable.
Type | Description | MQL | SQL |
---|---|---|---|
Field selection | The find () function in MongoDB Query Language provides the easiest way to retrieve data from multiple documents within one of your collections. This function is one that you will be using often. | >db.Employee.find() |
>select * from Employee; |
Condition clause: Where clause | Here, to return all the A-status employees | >db.Employee.find ( { status : "A" } ) |
>select * from Employee where status=’A’ |
Projection: Select specific fields | The insertion of the tag { NAME : 1 } specifies that only the information from the NAME field should be returned. The results are sorted and presented in ascending order. | >db.Employee.find ( {status : "A"}, {NAME: 1} ); |
>select name from Employee where status=’A’; |
Field exclusion | To query for the opposite, inserting { NAME : 0 } retrieves a list of all fields except for the NAME field. | >>db.Employee.find ( {status : "A"}, {NAME: 0} ); |
>select age, address, status from Employee; |
Complex operation (Dot notation) | When you work with more complex document structures such as documents containing arrays or embedded objects, you can use other methods for querying from them. | >db.Employee.find( { "Address.City" : "Irving" }).pretty() |
>SELECT E.* from Employee E inner join Dept D on E.EMPID=D.EMPID Where D.City=’Irving’ |
Insert multiple documents | To create an array of documents, define the variable by a name and assign the array of documents. | >>document = [ { "Name" : "Brian Lockwood","Age" : "45", status:"A"}, { "Name" : "Charles","Age" : "35", status:"A"}] >db.Employee.insert (document) |
>Insert into Employee values (‘Brian Lockwood’,45, ‘A’),(‘Charles’,35,’A’) |
Insert nested document | Here, the address document is embedded in the document. MongoDB Query Language accommodates that, but SQL strictly follows procedural constructs and does not allow insertion of values to non-existing fields. | >>document = ({ "Name" : "Robert Jordan","Age" : "37", status:"A", Address: { Street:"Polaris Way", City : "Aliso Viejo",State:"California" }}) >db.Employee.insert (document) |
Not applicable |
Get distinct status | To return unique values only | >>db.Employee.distinct( "status"); |
Select distinct status from Employee; |
Sort | This example sorts the results based on the Age key in ascending order. Sorting is in ascending order unless otherwise specified (-1 flag for descending order). | >>db.Employee.find().sort( { Age: 1 }) |
>Select * from Employee order by Age ASC; |
Limit | Use the limit () function in MongoDB Query Language to specify the maximum desired number of results. | >db.Employee.find().limit(5); |
>select top(5) * from Employee; |
Skip | To skip the first n documents in a collection. Here, n = 5. | >>db.Employee.skip().limit(5); |
>SELECT * FROM Employee OFFSET 5 ROWS; |
Aggregate | count () returns the number of documents in the specified collection. | >>db.Employee.find().count(); |
>Select count(*) from Employee; |
Group | group() takes three parameters: key, initial, and reduce. The purpose of group () and SQL’s GROUP BY is to return an array of grouped items. | >db.Employee.aggregate([ {"$group" : {_id:"$status", count:{$sum:1}}} ]) |
>Select status,count(*) from Employee Group by status |
Comparison | Use special parameters $gt, $lt, $gte and $lte to perform greater-than and less-than comparisons in queries. | >>db.Employee.find ( { Age: {$gt : "30"} }); |
>Select * from Employee where Age>30 |
Multiple expressions in document | Use $or to search for multiple expressions in a single query. | >db.Employee.find({ $or : [ { "Name" : "Flynn" }, { "status" :"A"}]}) |
>select * from Employee where name=’Flynn’ or Status=’A’ |
Add fields to the document | Use $addToSet to add data to an array. | >db.Employee.update( { "Name" : "Charles" }, {$addToSet : { Address:"South Texas" } } ); |
Not applicable |
Sample Data
Below is sample output in MongoDB Query Language.
> db.Employee.find().pretty().limit(5)
{
"_id" : ObjectId("5e1503646c7b5d31e73b40c4"), "name" : "Ahmed", "age" : 1, "status" : "B"
}
{
"_id" : ObjectId("5e1503646c7b5d31e73b40c5"), "name" : "Bikram", "age" : 10, "status" : "C"
}
{
"_id" : ObjectId("5e1503646c7b5d31e73b40c6"), "name" : "Carlos", "age" : 33, "status" : "A"
}
{
"_id" : ObjectId("5e1503646c7b5d31e73b40c7"), "name" : "Davida", "age" : 32, "status" : "E"
}
{
"_id" : ObjectId("5f287f4007ab6ba4d8c601be"), "name" : "Etty", "age" : "35", "status" : "A"
}
>
Do MongoDB queries run faster than RDBMS queries?
NoSQL and RDBMS are designed to support different application requirements. Many enterprises use them side by side.
In the document model of MongoDB, related data is stored together. Whereas an RDBMS would require a complex JOIN to retrieve data across multiple tables, it is often faster to retrieve the data in a single document. Nevertheless, overall database performance — MongoDB or otherwise — depends on factors as diverse as database design, data velocity, transaction type, data volumes, query patterns and load time.