
Group 1 - Groups of scalability problems

Introduction

This group will look into how we can categorize the different scalability challenges, typically by breaking scale down along dimensions such as:

  • data volume
  • data complexity
  • request volume
  • computation complexity
  • .. and so forth...

Group members...

  • Tobias Torrissen
  • Kjetil Valstadsve
  • Johannes Brodwall
  • Emil Eifrem
  • Finn-Robert Kristensen

The result..

Runtime scalability:

Request load

  • Read
      • Random
      • Sequential
  • Write
      • Append-only
      • Random
      • Process and forward: large datasets that need to be aggregated
  • Prio?

Data load

  • Size
  • Complexity
      • Connectedness
      • Semi-structure
  • Volatility?

Consistency requirements

  • Availability, reliability, freshness
  • Overall: when does an actor need to see the effects of operations (its own and others')?

Development time scalability:

  • "What happens when you get many of---"
  • KLOC size
      • Unintended consequences
  • Distributed services/components size
      • Performance
      • Change cost (formalized)
      • Debugging
  • Technology complexity
      • Open source frameworks + libraries + languages
      • => assumptions inherited
      • => interactions n^2
  • Developers: scaling the number of developers
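
The "interactions n^2" point in the list above can be made concrete with a small sketch. The technology names below are purely illustrative assumptions, not taken from the discussion; the point is only that the number of possible pairwise interactions grows quadratically with the number of technologies in the stack.

```python
from itertools import combinations


def pairwise_interactions(technologies):
    """Return every unordered pair of technologies that could interact."""
    return list(combinations(technologies, 2))


def interaction_count(n):
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Hypothetical stack, for illustration only.
stack = ["language", "web framework", "ORM", "message queue", "cache"]
pairs = pairwise_interactions(stack)

# Doubling the stack size roughly quadruples the number of pairs.
growth = [(n, interaction_count(n)) for n in (5, 10, 20)]
```

With five technologies there are ten pairs to reason about; with twenty there are 190, which is one way to read the "assumptions inherited" concern.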

Operational scalability:

  • Layers
      • Infrastructure
      • OS
      • Platform (app server)
      • App
  • Concerns
      • Monitoring/management
      • Configuration
      • Deployment
      • Security

Causality / dependency suite

  • Problem domain => runtime issues => development issues => operational issues

Case study: Twitter

  • Data is not naturally partitionable
  • High random reads
  • High write, append mostly
  • Milk data: interested in recent data
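
One way to picture the combination of "append mostly" writes and interest in recent data is a bounded, append-only window. This is a minimal sketch under assumptions: the class name `RecentTimeline` and the window size are hypothetical, and nothing here describes how Twitter actually stores data.

```python
from collections import deque


class RecentTimeline:
    """Append-only buffer that keeps only the most recent items."""

    def __init__(self, max_items):
        # deque with maxlen silently drops the oldest item when full.
        self._items = deque(maxlen=max_items)

    def append(self, item):
        self._items.append(item)

    def recent(self, n):
        """Return up to n items, newest first."""
        if n <= 0:
            return []
        return list(self._items)[-n:][::-1]


timeline = RecentTimeline(max_items=3)
for post in [1, 2, 3, 4, 5]:
    timeline.append(post)
```

Older items fall out of the window on their own, so reads of recent data never touch the full history.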

Case study: Instrument data acquisition

  • 10" write clients
  • 10" packets/sec
  • Compress and store
  • Aggregate data

Case study: Green driving:

  • Each 1 km of road has a price of 1-10 cents
  • Norway: 2 billion sensors
  • Peak number of cars: 500"
  • Each car drives at 60 km/h
  • 15 million messages/sec; each message sends carId + sensorId
  • Extremely partitionable
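
Because every message carries a carId, the stream can be split so that all messages for one car land in the same place. The following is a minimal sketch of that idea, assuming hash partitioning on carId; the partition count, the function names, and the sample IDs are hypothetical and only illustrate why the workload is "extremely partitionable".

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 8  # assumption: chosen for illustration only


def partition_for(car_id, num_partitions=NUM_PARTITIONS):
    """Map a car to a partition with a stable hash (crc32 is deterministic)."""
    return zlib.crc32(car_id.encode("utf-8")) % num_partitions


def route(messages, num_partitions=NUM_PARTITIONS):
    """Group (car_id, sensor_id) messages by the partition of their car."""
    partitions = defaultdict(list)
    for car_id, sensor_id in messages:
        partitions[partition_for(car_id, num_partitions)].append((car_id, sensor_id))
    return partitions


# Hypothetical sample messages: (carId, sensorId).
messages = [("car-1", "s-10"), ("car-2", "s-10"), ("car-1", "s-11"), ("car-3", "s-42")]
routed = route(messages)
```

Since no partition needs data from another to total up one car's road usage, partitions can be added without coordination between them.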