from Yandex 360
Lead Software Engineer at Yandex 360
14 years in commercial Java development. I started with full-stack development in Order Capture and Order Management systems in world-class Enterprise solutions.
About the speaker's company
Yandex 360 provides a full-spectrum SaaS stack for individual and corporate use, including Yandex Telemost, Yandex Disk, Yandex Messenger, Yandex Mail, and more.
We will discuss concrete examples from Yandex 360:
* In Yandex Telemost, when we do broadcasts, or manage incoming calls from meeting rooms, we need to allocate heavy VMs as resources. They require warmup, authorization and healthchecks, because at scale any one of these VMs can break or go rogue at any time. And only one of these instances can serve a stream at any given moment.
We need to maintain up to 99.99% availability of these services. We have specific rules for how availability is calculated, and we can formally minimize downtime from planned updates by choosing the update strategy carefully. We have historical data at our disposal to test theories, and we have been using it.
* Sometimes these services are so imbalanced in their CPU/RAM ratio that we need to host multiple user sessions within one container. Otherwise, RAM consumption and overhead on PaaS would be enormous. In this case we need to orchestrate a two-layer service that meets all of the requirements above.
* Yandex Telemost is based on Jitsi. It holds multi-user sessions in memory across several distinct components and registers in several discovery systems to organize the calls. Special care must be taken to prevent a conference from being unintentionally split into several independent rooms, or to prevent a rogue pod from intercepting one of the traffic channels and making it impossible for users to join a particular conference at all.
Based on these examples, we are going to discuss:
* the problem of stateless single-pod services, and our approach to their management and maintenance
* how we can calculate and minimize downtime of these stateful in-memory components and enable more frequent releases
* why there should be only one service discovery
* how split-brain-like situations emerge from scaling a component that provides multi-user sessions when the component is not built for scaling worldwide, and what we did to address the issue
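
To give a sense of scale for the 99.99% availability target mentioned above, here is a minimal sketch of a downtime budget calculation. It assumes a plain time-based definition of availability (uptime divided by total time); the actual rules used at Yandex 360 are not described here and may differ. The function names and the 20-second per-release downtime are hypothetical and chosen only for illustration.

```python
from datetime import timedelta


def downtime_budget(availability: float, period: timedelta) -> timedelta:
    """Maximum total downtime allowed in `period` for an availability target.

    Assumption: availability is a time-based fraction (uptime / total time).
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return period * (1.0 - availability)


def max_releases(availability: float, period: timedelta,
                 downtime_per_release: timedelta) -> int:
    """How many releases fit in the budget if each one costs a fixed downtime.

    Assumption: planned releases are the only source of downtime.
    """
    if downtime_per_release <= timedelta(0):
        raise ValueError("downtime_per_release must be positive")
    budget = downtime_budget(availability, period)
    # timedelta / timedelta yields a float; truncate to whole releases.
    return int(budget / downtime_per_release)


if __name__ == "__main__":
    month = timedelta(days=30)
    budget = downtime_budget(0.9999, month)
    # About 259.2 seconds, i.e. roughly 4.3 minutes per 30 days.
    print(f"Downtime budget: {budget.total_seconds():.1f} s")
    # Hypothetical: each release interrupts service for 20 seconds.
    print("Releases per month:", max_releases(0.9999, month, timedelta(seconds=20)))
```

Under these assumptions, shortening the per-release interruption directly raises the number of releases that fit into the same budget, which is the trade-off behind choosing an update strategy.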
The talk was accepted to the conference program
Neom, ex-ABN AMRO
Andrei Kvapil (kvaps)
The largest professional conference for developers of high-load systems