In my introduction to this series, here, I assert that a lot of value is lost due to bad conversations about tech. My aim in penning these articles is to do my little bit toward greater clarity, by getting under the skin of a few commonly used terms that, in my experience, are often not fully understood.

So the theme of today is Scalability.

There was a time when ‘scalable’ frankly didn’t mean a great deal (to me at least). In my 20s, I’d probably have thought it a slightly odd word to describe a rock face or mountain that was inviting me to have a crack at it (& yes, the pic above is me back in those days).

Then, when I was at b-school, ‘scalable’ was essentially a vague-ish notion meaning a business model with minimal marginal costs: one that created a kind of yellow-brick super-highway to fame and fortune, without the onerous capital requirements of ‘old’ business models. Think selling stuff online rather than bricks & mortar, or automating services that were previously delivered face-to-face.


Whilst not stated explicitly, this definition implies a notion of ‘limitless’ bounty; the hungry jaws between revenue and cost grow ever wider, constrained only by market potential.

But can anything be truly limitless? Is it really true that if you shove a business model onto the web (or, these days, the cloud), it can just grow ad infinitum?

The answer, of course, is ‘it depends’. And to get to a better answer, we need to get a better handle on the underlying technology.

From a tech standpoint, there are two distinct types of scalability:

(1) Vertical scalability

This is the ability to make a process go faster or handle more data by upping the size of the machine that’s doing the processing. Thanks to Moore’s law, this served us pretty well for a few decades… but sadly it simply isn’t up to the data explosion of the 21st century. You can’t keep growing a single machine indefinitely; if your business grows beyond a certain point, you hit a wall. And even if you can find a machine powerful enough to handle the data volumes you need, you’ll find that beyond a certain threshold the price per CPU becomes prohibitive. (And remember, whatever the total cost, you’ll need to double up: with this model, you’ll need a replica setup for disaster recovery.)

(2) Horizontal scalability

Instead of building ever-whizzier machines, you distribute the load horizontally across an arbitrary number of machines. This is essentially what fuelled the ‘big data’ revolution: humongous data centres in Arctic locations, the advent of cloud computing, and hey presto, everyone under 30 seems to work for a tech startup.
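To make the contrast concrete, here’s a minimal, single-machine stand-in for the idea behind horizontal scaling: when the work items are independent of each other, capacity grows roughly in line with the number of workers you add. The worker and record counts below are hypothetical, and in a real horizontally scaled system the pool would span many machines rather than one.

```python
# A toy, single-machine stand-in for horizontal scaling: when work items are
# independent, capacity grows roughly in line with the number of workers.
# The worker and record counts below are hypothetical.
from concurrent.futures import ProcessPoolExecutor

def process(record: int) -> int:
    # Stand-in for some CPU-bound work on one independent record.
    return sum(i * i for i in range(record % 1000))

def run(records, worker_count: int):
    # No record depends on any other, so the load can be spread freely;
    # in a real system the 'pool' would span many machines, not one.
    with ProcessPoolExecutor(max_workers=worker_count) as pool:
        return list(pool.map(process, records))

if __name__ == "__main__":
    results = run(range(10_000), worker_count=8)
    print(len(results), "records processed")
```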

So far, so simple?

Well, distributed computing (i.e. horizontal scalability) works extremely well for a whole host of data-intensive applications. Think Google Search, social media, most analytics.

What do these applications have in common?

They essentially derive meaning from data in aggregate; individual pieces of data are non-critical. 

For example, would your world end if:

  • Someone follows you on Twitter, but your follower count doesn’t go up until 30 seconds after you receive the ‘new follower’ notification?
  • Staying with Twitter: your feed shows you the same tweet twice?
  • The new Chinese restaurant that opened down the road yesterday doesn’t appear in your search for ‘takeaway near me’ because the site hasn’t yet been indexed by Google?
  • You receive a notification that someone ‘liked’ your LinkedIn article (now there’s an invitation 😉), but it doesn’t appear in your article analytics for another half an hour?

However, problems start to arise when you want to do sequential data processing (i.e. make changes to data that need to happen in a certain order), and the outputs need to be 100% correct. That’s because you can no longer spread the load randomly across the pool; you need to ensure that the right pieces of work are completed in the right order. And of course, if you have a single ‘controller’ that orchestrates the sequential processing, you are re-introducing a bottleneck and are no longer ‘horizontally scalable’. Thus you basically need some clever means of distributing the controller itself, without putting accuracy at risk.
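One common compromise, shown in the hypothetical sketch below, is to guarantee ordering per key (say, per account) rather than globally, by always routing a given key’s updates to the same worker. It hints at how a ‘controller’ can be spread out, but also at the limits: anything that spans keys, or needs a single global order, still needs something cleverer.

```python
# A minimal, hypothetical sketch: route every update for a given key to the
# same worker, so each key's updates stay in order while different keys are
# processed in parallel. Account names and amounts are invented.
import zlib
from collections import defaultdict

WORKER_COUNT = 4

def route(key: str) -> int:
    # Stable hash: the same key always maps to the same worker.
    return zlib.crc32(key.encode()) % WORKER_COUNT

def partition(updates):
    queues = defaultdict(list)
    for key, amount in updates:
        queues[route(key)].append((key, amount))
    return queues  # each queue preserves arrival order for its keys

updates = [("acct-1", +100), ("acct-2", -40), ("acct-1", -70), ("acct-2", +15)]
for worker, work in sorted(partition(updates).items()):
    print(f"worker {worker}: {work}")
```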

This is where things start to get tricky.

In the old days, this sort of accuracy was delivered via relational databases. These evolved in the 1970s, largely to meet the demands of the Financial Services industry, and offer very robust means of storing, managing and controlling complex, interrelated data, whilst maintaining absolute accuracy.
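As a small illustration of what that accuracy looks like in practice, here’s a sketch using SQLite purely as a stand-in for any relational database (the accounts table and the amounts are invented): the transfer below either happens completely or not at all.

```python
# A small sketch of transactional accuracy, using SQLite purely as a stand-in
# for any relational database. The accounts table and amounts are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()

try:
    with conn:  # one transaction: commits on success, rolls back on any error
        conn.execute("UPDATE accounts SET balance = balance - 30 WHERE id = 'alice'")
        conn.execute("UPDATE accounts SET balance = balance + 30 WHERE id = 'bob'")
except sqlite3.Error:
    pass  # on failure, neither update is applied

print(dict(conn.execute("SELECT id, balance FROM accounts")))
```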

But relational databases owe their accuracy to the fact that they have just one, tightly controlled point of entry and exit. So they can never be horizontally scalable.

So how can we have our cake and eat it?

This is a conundrum that techies the world over have been grappling with for some years.

The sad reality is that there’s no such thing as a free lunch. There’s a theoretical result, established by the computer scientist Eric Brewer, called the CAP theorem. Its practical implication is that a horizontally distributed system cannot guarantee both availability and consistency when parts of the network fail. (For consistency, think accuracy; another post on that topic to come.)
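To make that trade-off a little less abstract, here’s a toy, entirely hypothetical sketch of the decision a single replica faces when it’s cut off from its peers: refuse to answer (favouring consistency) or answer from a local copy that might be stale (favouring availability).

```python
# A toy, in-memory illustration of the CAP trade-off during a network
# partition: a replica cut off from its peers must either refuse the request
# (favouring consistency) or answer from possibly stale local data
# (favouring availability). Everything here is hypothetical.
class Replica:
    def __init__(self, prefer_consistency: bool):
        self.prefer_consistency = prefer_consistency
        self.local_copy = {"balance": 100}   # may be stale during the partition
        self.partitioned = True              # cannot reach the other replicas

    def read(self, key: str):
        if self.partitioned and self.prefer_consistency:
            raise RuntimeError("unavailable: cannot confirm the latest value")
        return self.local_copy[key]          # available, but possibly stale

print(Replica(prefer_consistency=False).read("balance"))  # answers, maybe stale
try:
    Replica(prefer_consistency=True).read("balance")
except RuntimeError as err:
    print(err)                                            # refuses to answer
```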

But it is (theoretically, at least) possible to optimise within the boundaries of the theorem for a given set of circumstances.

Thus Google rightly prioritises availability. As users, we expect Google to work all the time, however many zillions of us might be making demands of it at any one time.

In other scenarios, it might be OK to let availability slip a bit. Maybe it’s OK to have a system ‘locked down’ for a few milliseconds whilst some important sequential updates are performed?

In financial services, for example, with the exception of high-frequency trading, a pause of a few milliseconds would be more than acceptable for most applications.

So, in theory at least, we can have our squidgy chocolate cake and tuck in…

Caveat Emptor: not all chocolate cake is created equal…


Take a look at the website of any software vendor or cloud service provider and you’ll see prolific use of the word ‘scalable’. Indeed, to my amusement, I discovered that ‘highly scalable’ is particularly widespread (which of course actually means ‘non-scalable’, or ‘with capacity constraints that the vendor is hoping you won’t breach’; any genuinely scalable system will theoretically have no limits, so ‘highly’ would be nonsensical).

Thus, I suggest digging a little deeper into the tech pages. Do you see any mention of ‘SQL Server’ or ‘Oracle DB’? Even if you’re non-technical, you may well have a sense that these terms sound rather old-school.

Well, that’s because they are: old-fashioned, inherently non-scalable relational databases that have been around since the 1980s.

These systems have not been optimised within the boundaries of the CAP theorem as described above; instead, they bolt horizontally distributed approaches onto old-fashioned databases in an attempt to combine scalability with accuracy.

Thus, in the vast majority of cases where ‘scalability’ is claimed for a solution that requires accuracy, what is actually on offer is partial scalability. The underlying architecture is in fact a hybrid of the old and the new: scalable in parts of its operation, but with bottlenecks in others.

Does this matter? Do I need universal scalability?

Maybe, maybe not. But it’s certainly wise to ask lots of questions. Where exactly are those bottlenecks? – How might these impact YOUR business? If all seems OK right now, at what kind of data volume/complexity might they become an issue? – Are you confident that your business will remain within these thresholds for the lifetime of the system? (Remember that in years to come you will most likely want to track information you can’t even conceive of today.)

And on a related note: how does the system handle failures? – Does the whole system shut down when one machine fails? What level of manual intervention is required to get it up and running again? How much will that kind of failover cost? – One of the benefits of genuine horizontal scalability is that the ability to recover from faults typically comes ‘out of the box’: the load is simply redistributed automatically to the machines that are still functioning.

The missing ‘C’: Complexity

So why would anyone bolt together the old & new, if it was possible to build a solution that would scale limitlessly to meet any future business need?

Because building an infinitely scalable and accurate system is hard. Really hard. And the resultant architecture is typically (although not necessarily) highly complex.

One much-touted approach to combining agility & scalability is microservices, which can work really well in some applications (I’ll be covering it in a lot more detail in a future article). The issue is that each microservice typically scales independently, with its own database. That’s fantastic, so long as your services don’t need to talk to each other too much. In industries like banking, however, you typically need a lot of interaction between services, which means the databases must be kept in sync, often leading to a nuclear-style mushrooming in complexity.
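A rough, back-of-envelope way to see why the complexity mushrooms: if any service’s data may need reconciling with any other’s, the number of potential data flows to keep in sync grows quadratically with the number of services. The service counts below are purely illustrative.

```python
# A back-of-envelope illustration: if any service's data may need reconciling
# with any other's, the number of potential data flows to keep in sync grows
# quadratically with the number of services. The counts are illustrative.
def pairwise_links(service_count: int) -> int:
    return service_count * (service_count - 1) // 2

for n in (3, 10, 30):
    print(f"{n} services -> up to {pairwise_links(n)} data flows to keep in sync")
```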

Of course, the holy grail of scalability + accuracy + agility is not impossible, as the likes of Spotify, Monzo and Netflix have demonstrated. But to get there, these firms have hired some of the world’s most talented developers and created engineering cultures that live & breathe agile.

The reality is that most organisations can’t attract ‘rock-star’ talent, and neither is their organisational culture compatible with the wholehearted adoption of agile principles that is needed to solve these gnarly technological challenges. (Of course, they may well have hired a ‘Head of Agile Development’, but if you have a ‘hero’ culture in which people climb the tree by taking credit for successes and deflecting blame for failures, then you can never be truly agile.)

The banking industry is a great example. We’ve spoken to several tier 1 banks who have spent hundreds of millions of pounds trying to build core operational systems that combine scalability and accuracy, and have either never got them to work or de-scoped them to a fraction of the original ambition. This stuff is HARD, and if you’re not careful, you will drown in complexity or create something that is unmaintainable.

Thus, the hybrid approach of partial scalability remains the norm for now.

And what about elasticity?

The terms ‘elasticity’ and ‘scalability’ are often used interchangeably, but in fact have different meanings.

In simplistic terms, ‘scalable’ refers to the ability of a system to be sized in line with planned capacity requirements, which usually includes additional headroom to handle peak demand.

But ‘elasticity’ refers to the ability of a system to flexibly scale resources up and down in line with demand. Elasticity confers immense benefits for cloud-based applications with highly variable demand, as resources can be purchased ‘as needed’ rather than permanently paying for enough infrastructure to meet peak requirements. (Think of an online retailer who processes 25% of its annual transactions over the Black Friday weekend.)
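As a sketch of what ‘elastic’ means in practice, here’s a hypothetical scaling rule that sizes a worker pool to follow demand; the per-worker capacity and the min/max bounds are invented purely for illustration.

```python
# A minimal sketch of an elastic scaling rule: the worker pool follows demand
# up and down instead of being fixed at peak. The per-worker capacity and the
# min/max bounds are invented for illustration.
import math

CAPACITY_PER_WORKER = 500          # requests/second one worker can handle (assumed)
MIN_WORKERS, MAX_WORKERS = 2, 100  # floor for resilience, ceiling for cost control

def desired_workers(current_load: float) -> int:
    needed = math.ceil(current_load / CAPACITY_PER_WORKER)
    return max(MIN_WORKERS, min(MAX_WORKERS, needed))

for load in (300, 4_000, 40_000, 800):   # e.g. a quiet day vs a Black Friday spike
    print(f"load {load:>6} req/s -> {desired_workers(load)} workers")
```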

Indeed, it’s a common misconception that ‘cloud’ means elastic. ‘Cloud’ certainly offers the potential to source computing power on a cost-efficient basis, but running a non-elastic system in the cloud won’t magically convert it into an elastic one.

So – what does this mean for my business?

A few key take-aways:

  • Beware of ‘scalability’ claims. Does the architecture include any non-scalable components? How might these impact your business?
  • Do you genuinely need accuracy? In all areas of your business or in parts?
  • How might your data grow in the future? Will your chosen solution remain fit-for-purpose in the long-term?
  • How much complexity will your chosen approach entail? – Will it impact reliability & maintainability? How easy/hard will it be to change it as your business evolves?
  • How stable are your data and processing requirements? – Do you need to run sets of computationally onerous reports on a daily or periodic basis? – Would an elastically scalable system unlock material savings in running costs?

At Cyoda, horizontal scalability is something we’re passionate about. We’ve devoted over 35 man-years to the challenge of creating a technology that makes it easy to build scalable, accurate systems FAST. And we’ve found a way to do it that’s based on a very simple architecture that’s super-reliable and easy to change. Our aspiration is to make building complex, elastically scalable systems a ‘piece of cake’.

If you’d like to talk about the challenges of scalability, and how it can be achieved whilst maintaining accuracy and flexibility, I’d love to chat.

Thank you,

PS: Watch out for the next in the Digital De-mystified series, coming soon to shed light on consistency & transactions.