Architectures you
Always
Wondered
About
Lessons learnt from adopting Microservices
at eBay, Google, Gilt, Hailo and nearForm
eMag Issue 31 - August 2015
ARTICLE
INTERVIEW
PRESENTATION SUMMARY
Microservice
trade-offs
Eric Evans on
DDD at 10
Service Architectures
at Scale
Martin Fowler on
Microservice tradeoffs
Many development teams have found
the microservices architectural style to
be a superior approach to a monolithic
architecture. But other teams have found
them to be a productivity-sapping burden. Like any architectural style, microservices bring costs and benefits. To make
a sensible choice you have to understand
these and apply them to your specific
context.
Evolutionary Architecture
Randy Shoup talks about designing and building microservices based on his experience of working at large companies, such as Google and eBay. Topics covered include the
real impact of Conway's law, how to decide when to move
to a microservice-based architecture, organizing team
structure around microservices, and where to focus on the
standardization of technology and process.
After living with microservices for three years, Gilt can see advantages in team ownership, boundaries defined by APIs, and complex
problems broken down into small ones, Yoni Goldberg explained in
a presentation at the QCon London 2015 conference. Challenges
still exist in tooling, integration environments, and monitoring.
A LETTER FROM
THE EDITOR
This eMag has had an unusual history. When we
started to plan it the intent had been to look at
the different architectural styles of a number of
the well-known Silicon Valley firms. As we started to work on it, though, it became apparent that
nearly all of them had, at some level, converged
towards the same architectural style - one based
on microservices, with DevOps and some sort of
agile (in the broadest sense) management approach.
According to ThoughtWorks Chief Scientist
Martin Fowler the term microservice was discussed at a workshop of software architects near
Venice in May 2011, to describe what the participants saw as a common architectural style that
many of them had begun exploring recently. In
May 2012, the same group decided on microservices as the most appropriate name.
When we first started talking about the microservices architectural style at InfoQ in 2013, I
think many of us assumed that its inherent operational complexity would prevent the approach
being widely adopted particularly quickly. Yet a
mere three years on from the term being coined
it has become one of the most commonly cited
approaches for solving large-scale horizontal
scaling problems, and most large web sites including Amazon and eBay have evolved from a
Read on martinfowler.com
Martin Fowler is an author, speaker, and general loud-mouth on software development. He's
long been puzzled by the problem of how to componentize software systems, having heard
more vague claims than he's happy with. He hopes that microservices will live up to the early
promise its advocates have found.
Strong Module
Boundaries
The first big benefit of microservices is strong module boundaries. This is an important benefit
yet a strange one, because there
is no reason, in theory, why microservices should have stronger module boundaries than a
monolith.
So what do I mean by a
strong module boundary? I think
most people would agree that it's
good to divide up software into
modules: chunks of software that
are decoupled from each other.
You want your modules to work
so that if I need to change part
of a system, most of the time I
only need to understand a small
part of that system to make the
change, and I can find that small
part pretty easily. Good modular
structure is useful in any program, but becomes exponentially more important as the software grows in size. Perhaps more
importantly, it grows more in importance as the team developing
it grows in size.
Advocates of microservices
are quick to introduce Conway's
Law, the notion that the structure of a software system mirrors
the communication structure
of the organization that built it.
With larger teams, particularly if
these teams are based in different locations, it's important to
structure the software to recognize that inter-team communications will be less frequent and
more formal than those within a
team. Microservices allow each
team to look after relatively inde-
Operational Complexity: You need a mature operations team to manage lots of services, which are being redeployed regularly.
Distribution
So microservices use a distributed system to improve modularity. But distributed software has a
major disadvantage, the fact that
it's distributed. As soon as you
play the distribution card, you
incur a whole host of complexities. I don't think the microservice community is as naive about
these costs as the distributed
objects movement was, but the
complexities still remain.
The first of these is performance. You have to be in a really
unusual spot to see in-process
function calls turn into a performance hot spot these days, but
liths are truly self-contained, usually there are other systems, often legacy systems, to work with.
Interacting with them involves
going over the network and running into these same problems.
This is why many people are inclined to move more quickly to
microservices to handle the interaction with remote systems.
This issue is also one where experience helps: a more skillful team
will be better able to deal with
the problems of distribution.
But distribution is always a
cost. I'm always reluctant to play
the distribution card, and think
too many people go distributed
too quickly because they underestimate the problems.
Eventual
Consistency
I'm sure you know websites that
need a little patience. You make
an update to something, it refreshes your screen and the update is missing. You wait a minute
or two, hit refresh, and there it is.
This is a very irritating usability problem, and is almost
certainly due to the perils of
eventual consistency. Your update was received by the pink
node, but your get request was
handled by the green node. Until
the green node gets its update
from pink, you're stuck in an inconsistency window. Eventually it will be consistent, but until
then you're wondering if something has gone wrong.
Inconsistencies like this are
irritating enough, but they can
be much more serious. Business
logic can end up making decisions on inconsistent information; when this happens it can
be extremely hard to diagnose
what went wrong because any
investigation will occur long after the inconsistency window
has closed.
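The inconsistency window described above can be sketched in a few lines. This is only an illustration, assuming two in-memory replicas and a made-up `Node` class with hypothetical `write`, `read`, and `sync` methods; real systems replicate over a network and are far messier.

```python
class Node:
    """A replica holding its own copy of the data (hypothetical, in-memory)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.pending = []  # updates received from peers but not yet applied

    def write(self, key, value, peers):
        # Apply locally at once; peers only get the update queued.
        self.data[key] = value
        for peer in peers:
            peer.pending.append((key, value))

    def read(self, key):
        return self.data.get(key)

    def sync(self):
        # Replication catches up: apply everything that was queued.
        for key, value in self.pending:
            self.data[key] = value
        self.pending.clear()


pink = Node("pink")
green = Node("green")

pink.write("profile", "new address", peers=[green])

stale = green.read("profile")   # read served inside the inconsistency window
green.sync()                    # green finally gets its update from pink
fresh = green.read("profile")   # now consistent with pink
```

Any business logic that acts on `stale` is deciding on information that pink already knows to be out of date, and nothing in the read itself reveals that.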
Microservices introduce eventual consistency issues because of their laudable insistence
Microservices
are the first post
DevOps revolution
architecture
- Neal Ford
Independent
Deployment
The trade-offs between modular
boundaries and the complexities
of distributed systems have been
around for my whole career in
this business. But one thing that's
changed noticeably, just in the
last decade, is the role of releasing to production. In the twentieth century production releases
were almost universally a painful
and rare event, with day/night
weekend shifts to get some awkward piece of software to where
it could do something useful. But
these days, skillful teams release
frequently to production, many
Operational
Complexity
Being able to swiftly deploy
small independent units is a
great boon for development, but
it puts additional strain on operations as half-a-dozen applications now turn into hundreds of
little microservices. Many organizations will find the difficulty of
handling such a swarm of rapidly
changing tools to be prohibitive.
This reinforces the important role of continuous delivery.
While continuous delivery is a
valuable skill for monoliths, one
that's almost always worth the
effort to get, it becomes essential
for a serious microservices setup. There's just no way to handle
dozens of services without the
automation and collaboration
that continuous delivery fosters.
Operational complexity is also
increased due to the increased
demands on managing these
services and monitoring. Again
a level of maturity that is useful
for monolithic applications becomes necessary if microservices
are in the mix.
Microservice proponents
like to point out that since each
service is smaller it's easier to
understand. But the danger is
that complexity isn't eliminated, it's merely shifted around to
the interconnections between
services. This can then surface
as increased operational complexity, such as the difficulties in
debugging behavior that spans
services. Good choices of service
Technology
Diversity
Secondary Factors
It's important
to stress that it's
perfectly possible to
have firm module
boundaries with
a monolith, but it
requires discipline.
Similarly you can
get a Big Ball of
Microservice Mud,
but it requires more
effort to do the
wrong thing
Summing Up
Any general post on any architectural style suffers from
the Limitations Of General Advice. So reading a post like this
can't lay out the decision for
you, but such articles can help
ensure you consider the various
factors that you should take into
account. Each cost and benefit
here will have a different weight
Acknowledgements
Brian Mason, Chris Ford, Rebecca
Parsons, Rob Miles, Scott Robinson, Stefan Tilkov, Steven Lowe,
and Unmesh Joshi discussed
drafts of this article with me.
Listen on SE Radio
Eric Evans on
Domain-Driven Design at 10 Years
THE INTERVIEWEE
Eric Evans is the author of Domain-Driven Design: Tackling Complexity in Software. Eric now
leads Domain Language, a consulting group which coaches and trains teams applying domain-driven design, helping them to make their development work more productive and more valuable to their business.
THE INTERVIEWER
Eberhard Wolff works as a freelance consultant, architect and trainer in Germany. He is currently interested in Continuous Delivery and technologies such as NoSQL and Java. He is the
author of several books and articles and regularly speaks at national and international conferences.
Do you think there are any circumstances where a DDD approach would fail? And how
would you deal with them or is
it something that can be made
to work in any project?
So there are a few aspects to that.
That's an interesting question
because certainly DDD projects
fail all the time. It's not unusual.
Of course, some of that is just
anything difficult fails sometimes
so we needn't worry about that.
And I think DDD is hard. So what
would make a DDD project more
likely to fail than other times? I
think that some of the most common things are there is a tendency to slip into perfectionism:
whenever people are serious
about modeling and design, they
start slipping toward perfectionism. Other people start slipping
graph databases, since I did mention graphs, but there are things
that are really nicely modeled as
graphs. If you say, "How am I going to model this thing?" sometimes people think modeling
means OO modeling. "Oh, I have
to draw a UML diagram of it and
then implement it in C# or Java."
That's not what modeling means.
Modeling means to create abstractions that represent important aspects of your problem and
then put those to work.
So sometimes the natural
abstraction is a graph. You want
to say, well, how do these people
relate to each other? You know,
the graph databases, Neo4j
and things like that, allow us to
choose a tool that actually fits
the kind of problem we're trying
to solve. I don't now have to twist
it into objects and then figure
out how to do graph logic over
objects while, by the way, I'm also
stuffing the object data into a relational database. Instead, I use
a graph database and ask graph
questions using a graph query
language. This is the world of
NoSQL to me that we can choose
a tool that fits well with the problem we're trying to solve.
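As a rough illustration of asking a graph question of graph-shaped data, here is a small sketch. It assumes a plain Python mapping standing in for a graph database; the people, the `KNOWS` data, and the `how_related` function are all invented for the example. A real graph database such as Neo4j would express the same question in its own query language.

```python
from collections import deque

# Hypothetical "knows" relationships, stored as an adjacency mapping.
KNOWS = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice"],
    "dave": ["bob", "erin"],
    "erin": ["dave"],
}


def how_related(graph, start, goal):
    """Return the shortest chain of people linking start to goal, or None."""
    queue = deque([[start]])  # breadth-first search over partial paths
    seen = {start}
    while queue:
        path = queue.popleft()
        person = path[-1]
        if person == goal:
            return path
        for friend in graph.get(person, []):
            if friend not in seen:
                seen.add(friend)
                queue.append(path + [friend])
    return None


chain = how_related(KNOWS, "carol", "erin")
# chain is ["carol", "alice", "bob", "dave", "erin"]
```

The model here is the graph itself: there is no object hierarchy to design and no relational mapping to maintain before the question can be asked.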
Watch on InfoQ
Randy Shoup has experience with service architecture at scale at Google and eBay. In his talk,
Service Architectures at Scale: Lessons from Google and eBay, he presents the major lessons
learned from his experiences at those companies.
Evolution of service
architectures
Service architectures of large-scale systems, over time, seem to
evolve into systems with similar
characteristics.
In 1995, eBay was a monolithic Perl application. After five
rewrites, it is a set of microservices written in a polyglot of programming languages. Twitter,
on its third generation of architecture, went from a monolithic
Rails application to a set of polyglot microservices. Amazon.com
started out as a monolithic C++
application and moved to services written in Java and Scala.
Today, it is a set of polyglot microservices. In the case of Google
and eBay, there are hundreds to
Standardization without
central control
Standardizing the communication between IT services and the
infrastructure components is
very important.
At Google, there is a proprietary network protocol called
Stubby. Usually, eBay uses RESTful HTTP-style data formats. For
serialization formats, Google
uses protocol buffers; eBay tends
to use JSON. For a structured way
of expressing the interface, Google uses protocol buffers, eBay
usually uses a JSON schema.
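To make the idea of a structured interface concrete, here is a minimal sketch of checking a JSON message against an agreed contract. It is not Protocol Buffers and not the JSON Schema standard, and it does not reflect how Google or eBay implement this; the `ITEM_CONTRACT` fields and the `parse_item` function are hypothetical, and only the Python standard library is used.

```python
import json

# Hypothetical contract for a "get item" response: field name -> required type.
ITEM_CONTRACT = {"item_id": int, "title": str, "price_cents": int}


def parse_item(raw, contract=ITEM_CONTRACT):
    """Decode a JSON message and check it against the agreed contract."""
    message = json.loads(raw)
    for field, expected_type in contract.items():
        if field not in message:
            raise ValueError(f"missing field: {field}")
        if not isinstance(message[field], expected_type):
            raise ValueError(f"wrong type for field: {field}")
    return message


good = parse_item('{"item_id": 7, "title": "lamp", "price_cents": 1999}')

try:
    # item_id arrives as a string, which the contract does not allow.
    parse_item('{"item_id": "7", "title": "lamp", "price_cents": 1999}')
    rejected = False
except ValueError:
    rejected = True
```

When many services share one such format and one way of describing it, each service avoids the pain of supporting several, which is the pressure toward standardization described here.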
Standardization occurs naturally because it is painful for
a particular service to support
many different network protocols with many different formats.
Common pieces of infrastructure are standardized without central control. Source-code
control, configuration-management mechanisms, cluster management, monitoring systems,
alerting systems, diagnostic debugging tools all evolve out of
conventions.
Standards become standards not by fiat, but by being
better than the alternatives.
Standards are encouraged rather
than enforced by having teams
provide a library that does, for
example, the network protocol.
Service dependencies on particular protocols or formats also encourage it.
Code reviews also provide a means for standardization. At Google, every piece of
code checked into the common
source-control system is reviewed by at least one peer programmer. Searching through
the codebase also encourages
standardization. You discover if
somebody else has done what
you need. It becomes easy to do
the right thing and harder to do
the wrong thing.
Nonetheless, there is no standardization at Google
around the internals of a service.
There are conventions and common libraries, but no standardization. The four commonly used
programming languages are
C++, Java, Python, and Go. There
is no standardization around
frameworks or persistence mechanisms.
Proven capabilities that are
reusable are spun out as new
services, with a new team. The
Google File System was written
to support search and, because it is a distributed, reliable file system, others went on to use it. Bigtable was first used
by search, then more broadly.
Megastore was originally built
for Google application storage.
The Google App Engine came
from a small group of engineers
who saw the need to provide
a mechanism for building new
webpages. Gmail came out of an
internal side project. App Engine
and Gmail were later made available to the public.
When a service is no longer used or is a failure, its team
members are redeployed to other teams, not fired. Google Wave
was a failure, but the operational
transformation technology that
allowed real-time propagation
of typing events across the network ended up in Google Apps.
The idea of multiple people being able to concurrently edit a
document in Google Docs came
straight out of Google Wave.
More common than a service being a failure is a new generation, or version of a service
that leads to deprecating the
older versions.
Building a service as a
service owner
A well-performing service in
a large-scale ecosystem has a
single purpose, a simple and
well-defined interface, and is
very modular and independent.
Nowadays, people call these microservices. While the word is
relatively new, the concept is relatively old. What has happened
is that the industry has learned
from its past mistakes.
A service owner has a small
team, typically three to five people. The team's goals are to provide client functionality, quality
software, stable performance,
and reliability. Over time, these
metrics should improve. Given
a limited set of people and resources, it makes sense to use
common, proven tools and infrastructure, to build on top of other services, and to automate the
building, deploying, operating,
and monitoring of the service.
Using the DevOps philosophy, the same team owns the
service from creation to deprecation, from design to deployment
to maintenance and operation.
Teams have freedom to choose
their technologies, methodologies, and working environment.
They also have accountability for
the results.
As a service owner, you are
focused on your service, not the
hundreds to thousands of services in the broader infrastructure. You do not have to worry
about the complete ecosystem.
There is a bounded cognitive
load. You only need, as they say
at Amazon, a team large enough
to be fed by two large pizzas. This
both bounds the complexity and
makes for high-bandwidth com-
Anti-patterns
You can never have too much
monitoring. You can have too
much alerting, so you want to
avoid alert fatigue.
Service-oriented architecture has gotten a bad name, not
because the ideas were wrong
but because of the mistakes that
industry made along the way
from lack of experience.
One anti-pattern is a service
that does too much. Amazon.com,
eBay, Twitter, and Google
have ecosystems of tiny, clean
services. A service that is too
large or has too much responsibility ends up being a miniature
monolith. It becomes difficult
to understand and very scary to
change. It ends up increasing or
instilling lots more upstream and
downstream dependencies than
you would otherwise want.
Shared persistence is another anti-pattern. If you share a
persistence layer among services,
you break encapsulation. People
can accidentally, or on purpose, do
reads and writes into your service and disrupt it without going
through the public interface. You
end up unwittingly reintroducing coupled services.
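The alternative to shared persistence can be sketched as follows. This is an illustration only: `InventoryService`, `CheckoutService`, and their methods are hypothetical names, and an in-memory dictionary stands in for the service's private data store.

```python
class InventoryService:
    """Owns its data; other services may only use the public methods."""

    def __init__(self):
        self._stock = {}  # private store, never shared with other services

    def add_stock(self, sku, quantity):
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self._stock[sku] = self._stock.get(sku, 0) + quantity

    def reserve(self, sku, quantity):
        """Public interface: keeps the rule that stock never goes negative."""
        available = self._stock.get(sku, 0)
        if quantity > available:
            return False
        self._stock[sku] = available - quantity
        return True

    def available(self, sku):
        return self._stock.get(sku, 0)


class CheckoutService:
    """A client that depends on the interface, not on the inventory tables."""

    def __init__(self, inventory):
        self._inventory = inventory

    def place_order(self, sku, quantity):
        reserved = self._inventory.reserve(sku, quantity)
        return "confirmed" if reserved else "rejected"


inventory = InventoryService()
inventory.add_stock("sku-1", 2)
checkout = CheckoutService(inventory)

first = checkout.place_order("sku-1", 2)   # enough stock, so it is confirmed
second = checkout.place_order("sku-1", 1)  # nothing left, so it is rejected
```

If checkout wrote to the stock table directly, the never-negative rule would have to be duplicated in every writer, and the two services would be coupled through the schema rather than through the interface.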
The modern approach of
microservices has small, isolated
services independent of one another. The resulting ecosystems
are healthy and growing.
Nobody at Google
has the title of
architect. There is
no central approval
for technology
decisions. Most
technology
decisions are made
by individual teams
for their own
purposes.
Watch on InfoQ
In the early phase of a startup, we do not even have a business model, we don't have product-market fit, we do not have a
product. So, it is inappropriate, I
think, to think about any architecture or even any technology.
If a WordPress blog or buying
ads on Google is the right way
for you to test your hypothesis
about how to move forward, you
should totally do that and not
build anything. Then there is a
phase where we have a product
market fit and we think people
are willing to pay for it, and now
we are trying to grow that business and typically that is slower
than we would like to ramp up.
Again, that is a situation where
we started from minimal. It is not
about the technology and it is
certainly not about scaling that
technology or the organization.
We typically have a group
of people that can fit around the
conference table. This is not the
point at which to split the architecture up into small services,
divide into small teams, etc. That
comes later! Right now, we are
one team and we are building
one thing: the simplest thing
that could possibly work. Then,
one hopes that you will start to
hit the limits of the monolithic
Again, every successful company has evolved. I'll say it another way: no successful company
that we have ever heard of has
the same architecture today that
it had when it started. Don't get
too bitter and angry with yourself that the first thing you try is
not the thing that lasts forever.
In fact, if you had done the thing
that was going to live for five or
10 years when you started out,
we would have probably never
heard of you because you would
have spent all your time building
for some far future that never
came rather than building things
that met near-term customer
needs in the near term.
Read on InfoQ
Richard Rodger is a technology entrepreneur who has been involved in the Irish Internet
industry since its infancy. Richard founded the Internet startup Ricebridge.com in 2003. He
subsequently joined the Telecommunication Software and Systems Group (TSSG) and became
CTO of one of its successful spin-off companies, FeedHenry Ltd. More recently, he became
CTO and founder of nearForm.com. Richard holds degrees in computer science (WIT) and
mathematics and philosophy (Trinity College, Dublin). Richard is a regular conference speaker
and is a thought leader on system architectures using Node.js. Richard is the author of Mobile
Application Development in the Cloud, published by Wiley. He tweets at @rjrodger and blogs
here.
Feidhlim O'Neill has spent over 20 years working in a variety of tech companies in the UK and
US, from startups to NASDAQ 100 companies. He spent 10 years at Yahoo in a variety of senior
positions in service and infrastructure engineering. Feidhlim works at Hailo where he oversees
their new Go-language microservices platform built on AWS.
We interviewed representatives
from three companies (Gilt,
Hailo, and nearForm) who have
agreed to share their experiences
in either building a microservices
platform from scratch or in re-architecting a monolithic platform
by gradually introducing microservices. The interviewees are:
Adrian Trenaman, SVP of engineering at Gilt; Feidhlim O'Neill,
VP of platform and technical
operations for Hailo; and Richard
Rodger, CTO of nearForm.
Watch on InfoQ
Since 2010, Lead Software Engineer Yoni Goldberg has led the engineering behind several
critical projects at Gilt--including personalization, the Gilt Insider loyalty program, SEO/
optimization, and other customer-facing initiatives.
After living with microservices for three years, Gilt can see advantages
in team ownership, boundaries defined by APIs, and complex problems
broken down into small ones, Yoni Goldberg explained in a presentation
at the QCon London 2015 conference. Challenges still exist in tooling,
integration environments, and monitoring.
Goldberg, lead software engineer at Gilt, describes the company as a flash-sales business. A
typical sale offers a limited but
discounted inventory starting at
a specific time and running for a
specific period, usually 36 hours.
With tens of thousands of people
coming to the website at once
to buy items, Gilt experiences an
extreme and short spike in traffic
that generates about 80% of the
revenue. Every decision that may
affect website performance has
to take into consideration this
traffic spike of 50 to 100 times
the regular traffic.
As a traditional startup in
2007, Gilt used Ruby on Rails,
PostgreSQL, and Memcached.
Things went great but two years
later they had a 200,000-line
codebase and increasing traffic
overloaded the thousands of required Ruby processes running
with a database. With everyone
working on the same codebase,
deployment could take up to two
weeks due to all the integration tests
needed. The biggest hurdle was
that if something went wrong,
they had a really hard time finding the root cause.
Macro/microservices
era
At this point, besides moving to
the JVM, Gilt entered what Goldberg calls a macro/microservices
era. He distinguishes between
a macroservice that handles a
specific domain, e.g. sales or payments, and a microservice that
you get by breaking a macroservice down into smaller services.
Gilt created 10 macroservices for
the core business, services that
are still in use. With all other services depending on these, the
core services need to perform
with good SLAs to keep downtime to a minimum. Their checkout service is one example of a
core service: when that service is not
responding, users can't place
orders and the company then
doesn't make any money. A set
of less-critical supporting services, e.g. for user preferences,
use the core services; these are
good for the user experience but
the business will still function if
one goes down. On top of these
supporting services, another set
of services generates views for
all users.
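The difference between core and supporting services shows up in how a caller treats a failure. The sketch below is an assumption about how that might look, not a description of Gilt's code: `ServiceUnavailable`, `fetch_preferences`, `place_order`, and the default preference values are all hypothetical.

```python
DEFAULT_PREFERENCES = {"currency": "USD", "newsletter": False}


class ServiceUnavailable(Exception):
    """Raised by a client stub when a remote service does not respond."""


def fetch_preferences(user_id, preferences_client):
    # Supporting service: fall back to defaults rather than fail the page.
    try:
        return preferences_client(user_id)
    except ServiceUnavailable:
        return dict(DEFAULT_PREFERENCES)


def place_order(order, checkout_client):
    # Core service: there is no sensible fallback, so the failure surfaces.
    return checkout_client(order)


def broken_client(_request):
    # Stand-in for any service that is currently down.
    raise ServiceUnavailable("no response")


prefs = fetch_preferences(42, broken_client)

try:
    place_order({"sku": "sku-1"}, broken_client)
    order_failed = False
except ServiceUnavailable:
    order_failed = True
```

A missing preferences service costs some polish in the user experience; a missing checkout service costs the order, which is why the core services carry the stricter SLAs.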
While Gilt built services, it
also introduced dedicated data
stores for each service, providing
the best database for each one's
needs. This new architecture
solved 99% of their scaling problems but left developers with
some of the problems as the
new services were semi-monolithic and lacked clear ownership
of code. Problems with deployments and long integration cycles remained. The main problem, though, was that it wasn't
fun to develop code.
Moving to microservices
To overcome the remaining
problems, Gilt created a lot more
microservices and empowered
teams to take responsibility for
not only developing a service
but also for testing, deploying, and monitoring it. This also
clarified ownership; a team basically became the owner of a
service. According to Goldberg,
the biggest benefit came from
the smaller scope of a microservice, which made it easier to
grasp. It's easy to understand a
service composed of just a few
thousand lines, and to understand another team's microservice when you move there to
contribute as a developer. The
architecture removed the huge
pain point of deployment dependency among teams. Now,
they could move to continuous
Current challenges
Despite Gilt's successful move
from Rails to a microservices
architecture, Goldberg emphasizes that the company still has
some core challenges.
Deployment
From the start, each team
semi-manually deployed services with its own, different
method. A lack of integration
made it hard to execute tests to
make sure a change didn't break
something else. Gilt solved this
by building a tool around sbt
that helped teams to first deploy to an integration test environment and then to release to
production. During the last year,
the company has been working
to bring operations to the teams,
adopting Docker and moving to
the cloud. One downside Goldberg notes is that deployments
now are slower, but he hopes
that they will speed up in the coming years.
APIs
During the last year, Gilt has
been moving away from an RPC
style of communication, instead
building REST APIs. The main advantage Goldberg sees is that a
well-defined API solves a couple
of problems, most importantly discoverability. Because all
APIs are available in one place,
finding what is available can be
done with one search. The API
Takeaway
For Goldberg, the biggest advantage Gilt
has gained from microservices is ownership by team. He believes that when team
members own a service, they tend to treat
it like their baby. Another big promise of
microservices he mentions is the breaking
of complex problems into small ones that
everyone can understand, one service at a
time.
Two challenges that Goldberg thinks
still stand are monitoring, for lack of tooling, and the integration and developer environment.
Goldberg's final advice for starting
with microservices is to begin with a feature that does not yet exist. Build that as
a microservice. He thinks that it is hard to
get acceptance to break down something
that already exists and works. Building
something new will be much easier for
people to accept.