Site Reliability Engineer (RPE Team)

Lancesoft

Montreal (On-site)

Job details

  • Work location: Montreal (On-site)
  • Job type: Permanent, full-time

Site Reliability Engineer

Location

Montreal, Canada (Day 1 onboarding on-site; in-office presence required 3x/week)

Job description

The Reliability and Production Engineering (RPE) team is seeking talented individuals with a passion for production support and real-time problem solving. This role is part of our growing Site Reliability Engineering (SRE) capabilities within the RPE organization, supporting the Technology transformation. Successful candidates will thrive in a dynamic, fast-paced environment that values collaboration, ingenuity, and adaptability.

As a Site Reliability Engineer, you will focus on improving the availability, observability, scalability, performance, and resilience of systems and services by applying sound software engineering principles and leveraging modern tooling.

Key responsibilities

  • Troubleshoot issues across the entire technology stack: hardware, software, applications, and networks.
  • Collaborate with engineering and development teams to design, build, and maintain reliable systems.
  • Identify and implement automation opportunities for deployment, management, and visibility of services.
  • Proactively assess and mitigate system reliability risks.
  • Participate in global and regional support coverage, including occasional weekend on-call rotations.
  • Represent the RPE organization in design reviews and operational readiness exercises.

Qualifications & Skills

Required

  • Strong troubleshooting and debugging skills, with the ability to identify root causes.
  • Excellent communication and interpersonal skills; ability to present technical problems to non-technical audiences.
  • Solid Linux system administration experience.
  • Basic scripting skills (Python, Bash, Perl, Ruby).
  • Hands-on experience with enterprise monitoring tools (AppDynamics, Grafana, Splunk, Dynatrace).
  • Familiarity with automation/configuration/release management tools (e.g., Ansible, GitHub).
  • Awareness of modern software and systems architectures (cloud, microservices, load balancing, databases, caching, distributed systems).

Preferred

  • Practical experience supporting large-scale systems.
  • Strong analytical and problem-solving skills with a sense of ownership and accountability.
  • Ability to work effectively in a team-oriented environment.

EEO Employer: Minorities / Females / Disabled / Veterans / Gender Identity / Sexual Orientation