
2016 IEEE Bombay Section Symposium (IBSS)

Comparison of SQL, NoSQL and NewSQL Databases for Internet of Things
Haleemunnisa Fatima
Computer Science and Technology
UMIT, SNDT University
Mumbai, India.
fatima035@gmail.com

Prof. Kumud Wasnik
Computer Science and Technology
UMIT, SNDT University
Mumbai, India.
kumudwasnik@gmail.com

Abstract: The Internet of Things (IoT) is, in essence, a collection of tightly interconnected sensors. Present-day developments in the technology industry have increased the emphasis on large amounts of data. IoT is expected to generate and collect an enormous amount of data from varied locations very quickly. Concerns about storage and performance are pushing databases to outperform themselves. Traditional relational databases have time and again proved to be efficient. NoSQL and, more recently, NewSQL databases have gained impetus and are asserted to perform better than their classic SQL counterpart. This paper compares the performance of three databases, one from each of these technologies: SQL (MySQL), NoSQL (MongoDB) and NewSQL (VoltDB), for sensor readings. The sensor data handled ranges in size from MB to GB and is tested against single write, single read, single delete and multi write operations.

Keywords: Internet of Things; databases; SQL; NoSQL; NewSQL; data; performance

I. INTRODUCTION

Database Management System (DBMS) is a broad term for a software tool that mediates between users and applications and lets the user store, alter and retrieve data. There are three classes of database systems. The first is the Relational Database System (RDBMS), based on the relational model by E. F. Codd [1] and primarily known as the traditional system. These database systems are also known as SQL systems, chiefly because they adopt SQL as their query language. With the arrival of the internet and a plethora of applications connected to it, the amount of data generated increased drastically. To handle this massive amount of data effectively, two new classes of databases were born: non-relational and, more recently, new-relational, popularly known as NoSQL and NewSQL databases respectively.

In the last couple of years there has been a remarkable rise in the amount of data being generated. Big Data is a term used to describe this enormous quantity of data, which is structured, semi-structured and unstructured. According to the Gartner group, Big Data can be defined by 3Vs: volume, velocity and variety [2]. Processing such vast amounts of data requires speed, flexible schemas and distributed databases [3].

With the advancement in technology, the use of smart applications has increased drastically. The Internet of Things touches every facet of our life through various sensor-empowered devices such as smart homes, smart tracking, smart watches, wearables and so on. The term Internet of Things (IoT), also known as Internet of Objects, refers to the networked interconnection of everyday objects, which is generally viewed as a self-configuring wireless network of sensors whose purpose would be to interconnect all things [4]. The IoT consists of an enormous, ever-changing and growing network involving billions of objects. These objects both generate data and communicate with each other at the same time. The data thus generated is of tremendous volume and arrives at the network in real time.

Due to these attributes of the data, processing, managing and storing it becomes a difficult task. Thus, a system that can overcome the challenge of accumulating and handling such a large amount of data is needed. For this reason, it is imperative to evaluate the different DBMS systems with respect to their performance. RDBMS or SQL databases have been used to store and retrieve immense amounts of data since their inception. They have a fixed schema and store data in the traditional row and column format in a table. The databases in this class handle concurrency in transactions by preserving the ACID properties, where a transaction must either complete in full or not at all. Although this model has its setbacks, it is an efficient and reliable standard model.

Owing to these characteristics of SQL databases, a new class of databases, namely NoSQL, emerged. NoSQL databases, otherwise known as Not-only-SQL databases, belong to the non-relational model. These databases are schema-free and have a flexible schema design that can handle a variety of data. They are scalable and provide high availability, which makes them more favourable for storing Big Data and IoT data. The NewSQL or modern relational model, on the other hand, is designed to retain the relational aspect of the traditional RDBMS while incorporating solutions provided by NoSQL databases. NewSQL systems are claimed to be the most promising database management systems, and far more adept than other databases at handling the ever-increasing IoT data.
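The all-or-nothing behaviour of an ACID transaction described in the introduction can be illustrated with a short sketch. It is not taken from the paper's test code: it uses Python's built-in sqlite3 module as a stand-in relational engine (the paper itself uses MySQL), and the table name `readings` and its columns are hypothetical.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in relational engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, sound TEXT NOT NULL)")
conn.commit()

# A transaction whose second statement violates the primary key.
try:
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("INSERT INTO readings (id, sound) VALUES (1, '52')")
        conn.execute("INSERT INTO readings (id, sound) VALUES (1, '57')")
except sqlite3.IntegrityError:
    pass  # the whole transaction, including the first insert, is undone

rows_after_failure = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]

# A transaction in which every statement succeeds is applied in full.
with conn:
    conn.execute("INSERT INTO readings (id, sound) VALUES (1, '52')")
    conn.execute("INSERT INTO readings (id, sound) VALUES (2, '57')")

rows_after_success = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
print(rows_after_failure, rows_after_success)
```

The failed transaction leaves no partial row behind, while the successful one stores both rows.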

978-1-5090-2730-9/16/$31.00 ©2016 IEEE



This paper investigates different types of databases. The focus is to evaluate and compare NewSQL and NoSQL databases against a traditional relational SQL database, in order to point out their differences in performance. Along with that, the paper distinguishes the typical types of IoT data and uses sensor data, the type most commonly found. Tests have been performed for three popular databases: MySQL belonging to SQL, MongoDB representing the NoSQL category and VoltDB representing NewSQL. All the databases were loaded on a physical server machine.

The rest of the paper is organized as follows. Section II focuses on related work on database performance with respect to IoT data. Section III covers the proposed system architecture. Section IV covers the experimental setup. Section V covers analysis and results. Section VI draws a conclusion and mentions future research.

II. RELATED WORK

A. SQL vs NoSQL vs NewSQL

This section reviews the existing data models and related surveys done in the domain of database management systems. Broadly, there are three main data models: the Traditional or Relational model, the Non-relational model and the Modern Relational model.

In the classical RDBMS model, the data is organized in the form of relations and is represented in a table consisting of rows and columns. Relational databases employ a parameter known as a key. There are several types of keys, of which the primary key is among the most important; it is used to identify each row of the table uniquely. Four main operations are used to access the data, known as CRUD: Create, Read, Update and Delete. These operations use the Structured Query Language (SQL). The ACID properties are among the most significant attributes of a SQL database. This is the key difference between SQL and NoSQL database systems. The NewSQL approach, on the other hand, conserves and supports the properties of the relational model while incorporating the features of the NoSQL model.

There are several database contributions that offer viable solutions and adaptable data models for both existing and future applications, depending on what results are to be yielded by the DBMS used [5]. Numerous survey papers have thus been published (Hecht, R., & Jablonski, S. [6], Tudorica, B. G., & Bucur, C. [7] or Moniruzzaman, A. B. M. [8]) to remark on and respond to the queries that arise.

B. Internet of Things

The term Internet of Things (IoT), also known as Internet of Objects, refers to the networked interconnection of everyday objects, which is generally viewed as a self-configuring wireless network of sensors whose purpose would be to interconnect all things [4]. IoT is an application domain that integrates different technological and social fields [9].

J. S. van der Veen et al. [10] compared SQL and NoSQL databases, namely PostgreSQL, MongoDB and Cassandra, for storing sensor data. The tests were run on both a physical server and a virtual machine for contrast. The outcome of this research yielded no definite winner, as PostgreSQL performed better for reads while MongoDB performed better for writes. Another study, by Phan Thi Anh Mai et al. [11], compared MySQL with MongoDB [12], CouchDB [13] and Redis [14] as storage for both sensor and multimedia data; the tests were carried out with emphasis on MySQL versus MongoDB. The research focused on the performance of these databases in the cloud environment. The final outcome was clear for CouchDB and Redis, which did not appear to be the best choice when IoT data is taken into consideration, while for MongoDB and MySQL the results followed a pattern similar to that of J. S. van der Veen [10], with no definite winner.

In this paper, we take into account the same type of data as in [7], but we include a new class of database, NewSQL, to store the sensor data and test the performance of these databases.

III. PROPOSED SYSTEM AND ARCHITECTURE

A. Architectural Overview

The system implemented for the tests simulated a sensor network; the architecture is shown in Figure 1. The local network comprises one central database server, located on a physical server machine, connected to multiple simulating devices or nodes. Each simulating device directly collects the sensor data (sound, in our case) and sends it over the network to the database server. Multiple database clients can upload and read these data from the server. In the implementation, we have used a microphone sensor to detect the incoming sound and send it to the database.

In practice, the writing thread is meant to run continuously without disconnection. However, in the tests, we only measured the time taken to execute a particular number of writes, and only on a local wireless network. The system is expected to serve clients in real time, which means that once the data is generated, it is sent immediately to the database and is ready for clients to query.

Figure 1: Proposed System
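The measurement described above, the time taken to execute a particular number of writes, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' test harness: it uses Python's built-in sqlite3 module as a local stand-in for the database server, so there is no network hop. The function name `time_writes`, the table name `readings` and the sample values are hypothetical. The column names and varchar types follow the record structure the paper gives in Table I, and averaging over five runs follows the procedure the paper describes in its experimental setup.

```python
import sqlite3
import statistics
import time


def time_writes(conn, n_writes):
    """Insert n_writes sensor readings one at a time; return elapsed seconds."""
    cur = conn.cursor()
    start = time.perf_counter()
    for i in range(n_writes):
        cur.execute(
            "INSERT INTO readings (sound, latitude, longitude) VALUES (?, ?, ?)",
            # Placeholder values: a made-up dB reading and fixed coordinates.
            (str(40 + i % 50), "19.0760", "72.8777"),
        )
        conn.commit()  # commit per row so each insert is a separate single write
    return time.perf_counter() - start


# In-memory SQLite stands in for the central database server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings ("
    "sound VARCHAR(16), latitude VARCHAR(16), longitude VARCHAR(16))"
)

# Repeat the same workload five times and report the mean elapsed time.
runs = [time_writes(conn, 1000) for _ in range(5)]
mean_seconds = statistics.mean(runs)
total_rows = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
print(f"{total_rows} rows written, mean {mean_seconds:.4f} s per 1000 writes")
```

Timing only the insert loop, and not the connection setup, matches the idea of measuring a fixed number of writes on an already-connected client.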


B. Problem Statement

The most pressing question is how to handle data, more precisely Internet of Things data, in an efficient manner. As mentioned above, there is an array of databases currently available, falling under the SQL, NoSQL and NewSQL database systems. However, which database provides the most viable solution and is best equipped to handle IoT data remains an open question. As far as our research goes, little work has included NewSQL databases, whereas a great deal of research has been done, and is ongoing, with respect to NoSQL databases. Hence, the paper addresses this issue and examines which database gives the best performance for a large amount of IoT data.

The focus of this paper is to investigate and compare the three database technologies with respect to performance as the amount of data increases. The type of IoT data used is sensor data. The three popular databases considered are MySQL, MongoDB and VoltDB.

C. Data Structures

The common data structure for all records is shown in Table I. When storing this data type in different databases, there was a slight difference in the database storage structures. They were grouped as:

Table I: Data Structure

Name         Type
sound        varchar
latitude     varchar
longitude    varchar

D. Operating Environment

Table II lists the hardware and software that was used for the tests. In addition, an Apache2 server was deployed on the physical machines running the databases. The machines were connected via a wireless network.

Table II: Hardware and Software

Parameters   Server                                   Client
OS           64-bit Ubuntu 14.04                      64-bit Ubuntu 14.04
Processor    Intel Core i7-3770 CPU @ 3.40 GHz × 8    Intel Core i5-2410M CPU @ 2.30 GHz × 4
MySQL        5.5.49                                   JDBC Connector Version 5.1.39
MongoDB      3.2.7                                    Java Driver 3.2.0
VoltDB       6.3                                      Java Driver 6.3
JDK          1.8.0                                    1.8.0
PHP          5.5                                      5.5

IV. EXPERIMENTAL SETUP

The main goal of the tests was to compare the performance of different databases as physical databases. Hence, the database servers were placed on the physical machines. An IoT device was implemented to play the role of the database clients. For huge amounts of data, however, client machines running a Java program were used to generate the data, and they communicated with the server machine running the database over a LAN via a wireless network connection, thus simulating an IoT system. Threads in Java were used to replicate multiple clients. The device can perform basic operations on the databases by gathering the sensor data and storing it in the database. The tests ran these operations with different workloads.

The tests evaluated the performance of the basic read, write and delete operations, for both single client and multi client. In order to increase reliability, each test with the same input was run multiple times (at least 5 times). The final result for a test was then the average of these individual runs.

A. Sound Meter

Sensors take measurements of their surroundings at certain intervals of time. Each measured value is then transmitted from the client via a network to the database stored on the server, which is in the same network as the client. The sensed data is the minimum amount of information a general sensor data system needs.

The sensor used for this research is a Sound Meter that collects the sound it senses (in dB) via the microphone sensor embedded in a smartphone. Thus, the smartphone acts as the IoT device used to collect data and send it across the server to the database.

V. ANALYSIS AND RESULTS

In this section, the results of all the tests, as well as their evaluation, are presented. Based on the operation performed on the incoming sensor data, the results are divided into four different categories:
• A single client with single read, write and delete operation.
• A single client with multi write operation.
• A multi client with single read and write operation.
• A multi client with multi write operation.

Since a simple insert against the DDL schema is employed for MySQL and MongoDB, the same is considered for VoltDB [15]. Although tests for VoltDB using stored procedures were also considered and performed, they are not included in the graph analysis. That analysis showed that VoltDB performs better with stored procedures than with simple inserts.

A. Single Client Single Operations

In the single client single write operation depicted in Figure 2, there is a vast difference between MySQL on the one hand and MongoDB and VoltDB on the other. In comparison, VoltDB performs very well, MongoDB performs moderately, while MySQL performs very poorly.


Figure 2: Single Client Single Write

Figure 3 shows the performance of a single client single read operation request to the database. As with the write operation, VoltDB again performs exceptionally well, followed closely by MySQL, whereas MongoDB lags behind. The difference between the three databases is, however, far smaller than for write operations.

Figure 3: Single Client Single Read

Figure 4 shows the performance of a single client single delete operation request to the database. The order is the same as with read, although the difference for the delete operation is much smaller than for reading.

Figure 4: Single Client Single Delete

B. Single Client Multi Operations

Figure 5 shows the performance of a single client issuing multiple write requests to the database. VoltDB is not used in this comparison because it uses a partitioned table concept in which the data is filled very quickly. Although multiple SQL statements can be batched together in a stored procedure by calling the predefined function voltQueueSQL(), it can only take 1000 at a time; hence VoltDB is excluded from this performance metric. MongoDB is a clear winner here, followed by MySQL. Contrary to single insert, MySQL is fast and efficient when writing bulk records to the database, which increases its performance considerably.

Figure 5: Single Client Multi Write

C. Multi Client Single Operations

Figure 6 shows the performance of multiple clients issuing single write requests to the database. The databases are very fast when a single client is used, but performance decreases slowly as clients are added. Initially, the performance of MongoDB and VoltDB differs only slightly, but as the number of records with respect to clients increases, the performance differs considerably. Overall, VoltDB outshines MongoDB marginally when the number of records is increased significantly. MySQL yet again falls out of the league, with very poor performance.

Figure 6: Multi Client Single Write

Figure 7 shows the performance of multiple clients issuing single read requests to the database. All the databases benefit from using multiple clients for read operations. For the first sets of records, MySQL outperforms VoltDB
slightly, but VoltDB steadily gains the upper hand as the number of records increases. At the start, MongoDB lags behind considerably compared to the other two databases. In spite of performing brilliantly at the onset, the performance of MySQL decreases suddenly as the number of records increases, irrespective of the number of clients. Hence, for a larger number of records either MySQL or VoltDB can give similar performance, but at large scale VoltDB followed by MongoDB is better.

Figure 7: Multi Client Single Read

D. Multi Client Multi Operations

Figure 8 shows the performance of multiple clients issuing multiple write requests to the database. VoltDB, as mentioned above, is not considered in bulk operations. Neither MongoDB nor MySQL benefits from multiple clients when issuing bulk inserts. The time taken for the number of operations stays roughly the same irrespective of the number of records or clients. At the onset, MongoDB performs well, but later on it falls in step with MySQL.

Figure 8: Multi Client Multi Write

VI. CONCLUSION AND FUTURE WORK

The purpose of this research project was to explore how different database systems can handle diverse and large amounts of Internet of Things data effectively, in terms of performance with increasing load. Three classes of databases were studied, namely SQL, NoSQL and NewSQL databases. SQL databases are relational and focus on the ACID properties. NoSQL databases, on the other hand, are schema-less, provide better performance and scalability, and do not adhere to the ACID properties. NewSQL databases retain SQL and the ACID properties while adding performance and scalability through a modern architecture.

Sensor data was chosen for the performance evaluation. For sensor data, in write-intensive operations the NewSQL database VoltDB showed good write performance. Its NoSQL counterpart MongoDB followed closely, while the relational MySQL lagged far behind. In read-intensive systems, VoltDB performs exceptionally well, followed by MySQL, whereas MongoDB comes last. But as the size of the data and the number of clients are increased, there is a considerable fall in the performance of MySQL and an improvement in MongoDB. The outcome of the delete query in terms of performance is roughly the same as that of read-intensive operations.

Although the outcome clearly suggests that VoltDB is faster and a clear winner in many cases, MongoDB is by no means behind, as its performance can be improved further by taking more advantage of the schema-less and flexible data model.

In conclusion, as of the current status of all the databases, VoltDB is a clear winner for IoT data. Every system has its own pros and cons, with performance being of supreme priority. VoltDB, though not purely OLAP oriented, has consistent performance compared to the other two databases considered. Lastly, which database to choose depends highly on the properties and requirements of the specific system.

There is further room for research in the future, one of the most pressing topics being the analysis of the databases with multimedia data and other complicated types of IoT data, considering the growing demand for IoT, which was not possible in this research project due to certain constraints.

REFERENCES

[1] Codd, Edgar F. "A relational model of data for large shared data banks." Communications of the ACM 13.6 (1970): 377-387.
[2] Beyer, Mark A., and Douglas Laney. "The importance of 'big data': a definition." Stamford, CT: Gartner (2012).
[3] Li, Yishan, and Sathiamoorthy Manoharan. "A performance comparison of sql and nosql databases." Communications, Computers and Signal Processing (PACRIM), 2013 IEEE Pacific Rim Conference on. IEEE, 2013.
[4] Conner, Margery (May 27 2010). Sensors empower the "Internet of Things" pp. 32–38. ISSN 0012-7515
[5] Haleemunnisa Fatima, Kumud Wasnik. "Comparision of SQL, NoSQL and NewSQL Databases in light of Internet of Things - A Survey" IJAECS, Vol 3, Issue-3, 2016, pp. 31-34.
[6] Hecht, Robin, and Stefan Jablonski. "NoSQL evaluation: A use case oriented survey." (2011): 336-341.
[7] Tudorica, Bogdan George, and Cristian Bucur. "A comparison between several NoSQL databases with comments and notes." Roedunet International Conference (RoEduNet), 2011 10th. IEEE, 2011.
[8] Moniruzzaman, A. B. M. "NewSQL: Towards Next-Generation Scalable RDBMS for Online Transaction Processing (OLTP) for Big Data Management." arXiv preprint arXiv:1411.7343 (2014).
[9] Minerva, Roberto, and Abiy Biru. "Towards a Definition of the Internet of Things." IEEE IoT Initiative white paper.
[10] J. S. van der Veen, B. van der Waaij, and R. J. Meijer, "Sensor data storage performance: Sql or nosql, physical or virtual." In Cloud Computing (CLOUD), 2012 IEEE 5th International Conference on (2012), IEEE, pp. 431-438.


[11] Phan, Thi Anh Mai, Jukka K. Nurminen, and Mario Di Francesco. "Cloud Databases for Internet-of-Things Data." Internet of Things (iThings), 2014 IEEE International Conference on, and Green Computing and Communications (GreenCom), IEEE and Cyber, Physical and Social Computing (CPSCom), IEEE. IEEE, 2014.
[12] The mongodb manual. http://docs.mongodb.org/manual
[13] Couchdb, a database for the web. http://couchdb.apache.org/
[14] Redis. http://redis.io/.
[15] Stonebraker, Michael. "NewSQL: An Alternative to NoSQL and Old SQL for New OLTP Apps." Communications of the ACM. Retrieved (2012): 07-06.

