Site Reliability Engineer (P673)
About Us:
As a Senior Site Reliability Engineer at Kenility, you'll join a tight-knit family of creative developers, engineers, and designers who strive to develop and deliver the highest-quality products to the market.
Technical Requirements:
- Bachelor's degree in Computer Science or Information Technology, or a comparable qualification.
- Extensive hands-on experience with Linux systems, focusing on optimization, troubleshooting, and performance.
- Strong proficiency in Python for scripting, automation, and system management.
- Deep understanding of Kubernetes (K8s) and related container technologies like Docker or Podman. Experience with Slurm is a plus.
- Proven experience with Helm, Terraform, and Ansible for scalable infrastructure management.
- Proficiency in GitLab CI/CD, GitHub Actions, or similar tools to maintain efficient deployment workflows.
- Hands-on experience with monitoring tools like Prometheus, Grafana, or the ELK stack to oversee system health and performance.
- Understanding of database performance tuning and management in distributed systems.
- Familiarity with Agile development methodologies.
- Experience with cloud platforms like AWS or Google Cloud is highly desirable.
Soft Skills:
- Responsibility
- Proactivity
- Flexibility
- Great communication skills