You are on page 1of 8

ScalabilityBenchmarking:

MongoDBandNoSQL
Systems

Overview
Inrecentyearsalternativestorelationaldatabaseshaveemergedthatprovideadvantagesin
termsofperformance,scalability,andsuitabilityforcloudenvironments.Whiletheyvary
significantlyintermsofcapabilities,manyintheindustryhaveadoptedthetermNoSQLto
describetheseproducts.Inthispaperweevaluatethescalabilityofthethreeleadingproducts
Cassandra,Couchbase,andMongoDBusinganindustrystandardbenchmarkcreatedby
Yahoo!calledYCSB.

Selectingtheappropriatedatabasetechnologyforaprojectinvolvescarefulconsiderationof
manydifferentcriteria.Whileperformanceisimportant,itmustbeconsideredalongwith
functionality,operationaltooling,easeofuse,availabilityofskills,technologyecosystem,
securitycontrols,reliability,andotherfactors.Ourgoalinthispaperistotakeacloserlookat
oneofthesecriteriascalabilityandweencouragereaderstouseourfindingsasoneof
manyinputstotheirownevaluations.

Methodology
Thethreetechnologiesweevaluateinthisreportprovideverydifferentfeaturesets.
Comparisonsaredifficult,buteachproductprovidesasimple,coresetofcapabilitiesthatcan
beusedasthebasisofascalabilityevaluation.

In2010,Yahoo!publishedabenchmarkcalledtheYahoo!CloudServingBenchmark(YCSB).
ThisbenchmarktestsCRUDoperations.ManyorganizationshaveusedYCSBtoevaluate
databasesanddifferenthardwareenvironments.Ithasbecomepopularandwellunderstood.

Weincorporatedenhancementsprovidedbyeachvendorfortheirclientintoourteststo
preventtheforkofanysinglevendorfromdisadvantagingtheotherproducts.Wealsoused
thelatestversionofeachdatabaseanddriverstoensurethatthebestpossibleperformance
foreachproductwasobserved.

YCSBisaframeworkandcommonsetofworkloadsforevaluatingtheperformanceof
differentdatabases.YCSBconsistsof:
TheClientanextensibleworkloadgenerator
TheCoreWorkloadsasetofworkloadscenariostobeexecutedbythegenerator

ThefollowingdiagramillustrateshowYCSBworkswithadatabase:

WeusedYCSBasthebasisforallourtests.Weusedthesamenumberofrecords,record
size,numberoffields,andnumberofoperationsinallofourteststoprovideacommon
baseline.Foreachofthetests,weperformedmultipleindependentruns,increasingthe
numberofclientthreadsuntilthemaximumthroughputwasreachedwithoutsacrificing
latency.Forallthreeproductstheidealnumberofthreadswasaround150forworkloadA,
andaround350forworkloadB.Formeasuringresults,werecordedthroughputandlatencies
forreadsandupdates(average,95thpercentile,and99thpercentile).

OursetupconsistedofadatabaseclusterofthreeserversandonededicatedYCSBserverto
ensuretheYCSBclientwasnotcompetingwiththedatabaseforresources.Allserverswere
identical.

Weperformedseparaterunsforeachofthethreeconfigurationswetested,eachoftheYCSB
workloads(WorkloadA,WorkloadB),andeachofthethreadcountswetested.

Foreachdatabaseweperformedthefollowingtaskstoloadthedata:
Startwithanemptydatabaseinstance(datafileswiped,serverrestarted)
Load400MrecordsusingtheloadphaseofYCSB
Allowthedatabasetostabilizeaftertheload,includingcompaction

Wethenraneachworkloadandrecordedtheresults.Beforestartingthenextworkloadwe
allowedthedatabasetostabilize.
2

Deployment Architecture
Allthreedatabasesimplementdistributedarchitectures.Whiletheirdesignsfollowvery
differentapproaches,allthreedatabasesprovideautomaticpartitioningofdatatosupport
scaleout,andallthreedatabasesrelyonasynchronousreplicationtomaintainmultiple
copiesofthedataforhighavailability.Furthermore,allthreedatabasesrecommend
multiserverenvironmentsforproductiondeployments.

Inthesetestswedeployedthedatabasesacrossthreebaremetalservers.Datawas
partitionedacrossallthreeservers,andeachdatabasewasconfiguredtomaintaintwocopies
ofthedata.

Test Results
WeevaluatedtwoworkloadsusingYCSB:WorkloadA,whichconsistsofequalnumbersof
readsandupdates,andWorkloadB,whichconsistsof95%readsand5%updates.Alltests
wereperformedwith400Mrecords.Becauseeachsystemisalsomaintainingtwocopiesof
thedata,eachserverismanaging~267Mrecords,whichrepresentsadatasetlargerthan
RAM.Eachtestperforms100Moperationsandrecordsthroughputandlatenciesatthe95th
and99thpercentilesforreadsandupdatesseparately.

Workload A (50% reads, 50% updates)

WhentestingadatasetlargerthanRAM,the50/50workloadinthesetestsdemonstratesthat
MongoDBprovidesover1.8xgreaterthroughputthanCassandra,andnearly13xgreater
throughputthanCouchbase.

MongoDBprovidessingledigitmillisecondlatencyatthe99thpercentileforreadsand
updates.Cassandraslatencyisslightlybetterforupdates,butworseforreads,whichistobe
expectedasitisawriteoptimizeddatabase.Weobservedsurprisinglyhighlatencyfor
Couchbase,whichwasseveraltimeshigherthaneitherCassandraorMongoDB:

YCSB(Latencies)WorkloadA

99th(Read)

99th(Update)

Cassandra

14ms

3ms

Couchbase

39ms

63ms

MongoDB

5ms

9ms

Workload B (95% reads, 5% updates)

WhentestingadatasetlargerthanRAM,thereadheavyworkload(95%reads)shows
MongoDBagainprovidesover1.75xthethroughputofCassandra,andover6xthe
throughputofCouchbase.

InthisworkloadthelatenciesforCassandraandMongoDBarehigherthanthe50/50
workload,andCouchbaseexhibitsthehighestlatenciesoverall:

YCSB(Latencies)WorkloadB

99th(Read)

99th(Update)

Cassandra

16ms

8ms

Couchbase

25ms

53ms

MongoDB

12ms

12ms

Conclusions
Basedonthesetests,wehavereachedthefollowingconclusions:

MongoDBprovidesgreaterperformancethanCassandraandCouchbaseinallthe
testsweran,byasmuchas13x.
CassandraandMongoDBexhibitverylowlatencyatthe99thpercentileforreadsand
updatesinbothworkloads.Couchbaseexhibitssignificantlyhigherlatency.
ThetestsshowthatindeploymentswheredataexceedsthesizeofRAM,wheredata
ispartitionedacrossmultipleservers,andwheredataisreplicatedforhighavailability,
MongoDBprovidesbetterperformancethanCassandraandCouchbase.
Insofarasthesetestsonlyevaluatethemostbasicoperationsofthesedatabases,
usersshouldalsoconsidertheotherfeaturesthatareimportanttotheirapplication,
andtheyshoulddevelopteststhatevaluatethosecapabilitiesaswell.

Environment

Thesetestswereconductedonbaremetalservers.Eachserverhadthefollowing
specification:
CPU:2x
3.0GHzIntelXeon(R)CPUE52690v2DecaCoreIvyBridge
RAM:96GB:6x16GBKingston16GBDDR32Rx4
Disk:(2)960GBSanDiskCloudSpeed1000SSD
drivecontroller:Adaptec71605
OS:Ubuntu14.10
Network:10gigE

Thefollowingconfigurationsweremadetooptimizetheenvironmentforeachdatabase:
readaheadwasreducedto64blocks
transparenthugepages,defrag,numaweredisabled

ThefollowingYCSBcodewasusedforthesetests:
https://github.com/usaindev/YCSB

Versionsofeachdatabase:
Cassandra2.1.4
Couchbase3.0.2
MongoDB3.0.3

Driverversions:
6


Cassandra:CQLcassandradrivercore2.1.5
Couchbase:2.1.2
MongoDB:mongojavadriver3.0.0

Allsettingsweredefaultforallthreedatabases.Thefollowingconfigurationsweremade:
ForCassandra,thecommitlogwasplacedonaseparatedevice
ForMongoDBthejournalwasplacedonaseparatedevice
storageEngine=wiredTiger
zlibcompressionwasenabledforusertablecollection.

YCSBtestconfiguration
400milliondocuments
2copiesofthedata
100Millionoperations
Zipfianrequestdistribution
Insomeconfigurationswhererunning100millionoperationswasprohibitivelyslow,
theworkloadwasrunwithadditionalparametermaxexecutiontime=3600which
limitedtherunto60minutes.

You might also like