Software Engineer – Reliability & Scale

Berlin, Germany · Full-time · Senior

About The Position

Coralogix is a modern, full-stack observability platform transforming how businesses process and understand their data. Our unique architecture powers in-stream analytics without reliance on expensive indexing or hot storage. We specialize in comprehensive monitoring of logs, metrics, traces and security events with features such as APM, RUM, SIEM, Kubernetes monitoring and more, all enhancing operational efficiency and reducing observability spend by up to 70%.


As a Software Engineer at Coralogix, you’ll join the Reliability & Scale team of our Platform Group, which discovers, diagnoses and fixes the hardest cross-service, performance and architectural issues. In this role, you will:

  • Influence platform architecture, establishing design patterns for resiliency, scalability, and maintainability.
  • Help other teams make their services more reliable and efficient, while building up and spreading knowledge and best practices.
  • Build systems, tools and libraries so that the rest of engineering can ship faster and more safely.
  • Find and solve issues in a continuous effort to improve our platform.
  • Be hands-on, combining software development, platform engineering and advanced troubleshooting of complex production issues.
  • Contribute to open-source projects aligned with Coralogix’s infrastructure needs.
  • Work cross-functionally with engineering, product, security, and other teams to ensure reliability, scalability, and performance across the organization.


Requirements


  • This role requires the candidate to be located in Europe due to time zone alignment and regional market focus.
  • Software Engineering Expertise: 5+ years of development experience, preferably in Rust or Scala. Experience with Python is a plus.
  • Troubleshooting & Production Expertise: Proven ability to debug large-scale, high-volume production systems with a strong understanding of distributed systems principles.
  • Strong System Design Skills: Experience designing modular, scalable architectures and well-defined, future-proof APIs.
  • Performance Optimization Experience: From optimizing the architecture of the system as a whole, to picking the most appropriate data structures and algorithms, to profiling and tuning the low-level implementation details.
  • Distributed Systems & Platform Skills: Hands-on experience with Kafka, Kubernetes, and microservices at scale.
  • Reliability & Observability: Proficiency in monitoring, logging, tracing, alerting (Prometheus, Grafana, Coralogix, etc.) and defining SLIs/SLOs.

Preferred Qualifications

  • High-Volume Data Pipeline Experience: Familiarity with optimizing throughput and reliability in event-driven architectures.
  • Infrastructure/DevOps Mastery: Skilled in containerization, CI/CD workflows, and Infrastructure-as-Code (Terraform, Helm, etc.).
  • Cloud Experience: Proficiency in AWS, GCP, or Azure as production environments.
  • Community Leadership: Past involvement in SDE/SRE/DevOps communities, conferences, or meetups; experience sharing and demonstrating best practices.
  • Open-Source Engagement: Evidence of contributions to open-source communities or tools (e.g., patches, PRs, discussions).


Cultural Fit

We’re seeking candidates who are hungry, humble, and smart. Coralogix fosters a culture of innovation and continuous learning, where team members are encouraged to challenge the status quo and contribute to our shared mission. If you thrive in dynamic environments and are eager to shape the future of observability solutions, we’d love to hear from you.

Coralogix is an equal opportunity employer and encourages applicants from all backgrounds to apply.
