Ozon Performance Testing Service - HighLoad by Schedule

Ivan Prihodko

from Ozon Tech

About the speaker

12 years in software testing.
ISTQB Certified.
Worked in domains from e-commerce to telecom.
From manual UI testing - to Test Automation.
From Test Automation to BackEnd testing.
And from BackEnd to Performance Testing.

About the speaker's company

Ozon is the leading e-commerce company in Russia. Our IT team consists of 5,000 specialists who make products for millions of people across Russia and abroad. Ozon Tech develops its own solutions, contributes to Open Source, and uses a modern stack, including Go, C#, Kotlin, Swift, TypeScript, Vue.js, Kubernetes, and Kafka. We keep expanding to make our services more available and to stay closer to our users.



A Million RPS on Demand. Highload by Schedule. How the Ozon Performance Testing Platform Works.

About the topic:
Ozon has been growing twofold every year since 2019. The main technologies are Go and C#, gRPC, Kafka, a lot of code generation, S2S routing and security, and so on.
The performance testing platform started as a service that helps regular Ozon engineers start performance tests with a single CLI command or one click in the UI.

There are also three main paradigms in Ozon performance testing:
1) Confidence from performance tests can be achieved only in production.
2) The bandwidth target for the next season is calculated by analytics for each specific service and uploaded to the performance testing platform.
3) Once a week, IT management reviews the consolidated performance reports provided by our performance testing platform and gains confidence in Ozon's readiness for the upcoming season.
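As a rough illustration of paradigms 2 and 3, the sketch below compares per-service targets with the peak RPS achieved in tests and splits services into ready and not ready. The service names, numbers, and the names ServiceResult and readiness_report are hypothetical; this is not the platform's actual data model or API.

```python
from dataclasses import dataclass


@dataclass
class ServiceResult:
    service: str         # hypothetical service name
    target_rps: float    # target for the next season, as supplied by analytics
    achieved_rps: float  # peak RPS sustained in the latest production test


def readiness_report(results):
    """Return (ready, not_ready) lists of service names."""
    ready = [r.service for r in results if r.achieved_rps >= r.target_rps]
    not_ready = [r.service for r in results if r.achieved_rps < r.target_rps]
    return ready, not_ready


# Made-up numbers, for illustration only.
results = [
    ServiceResult("cart", target_rps=20_000, achieved_rps=24_500),
    ServiceResult("search", target_rps=50_000, achieved_rps=41_000),
]
ready, not_ready = readiness_report(results)
print("ready:", ready)
print("not ready:", not_ready)
```

A consolidated weekly report could be built from such a split, with the not-ready list showing where capacity work is still needed before the season.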
The platform grew with Ozon. It consists of several microservices that help us and our users start performance tests more easily.
We standardized and simplified most performance testing activities and provide many integrations with Ozon infrastructure.
We also created CPU-efficient load generators for HTTP, gRPC, and scenario traffic.

All the activities described above were accompanied by problems caused by the high-load nature of our environment, such as:
- Problems with the payload collection system and Kafka.
- Load generators that generate too much load for our performance testing system.
- Statistics that overload our statistics storage.
- Bandwidth limits in our k8s cluster, which were reached one day.
- The CPU limit, which was also reached one day.
In my talk you will learn how we solved all these problems, how we run 27 thousand tests a month, and how we generate more than a million RPS in production every night.

The talk was accepted to the conference program
