Senior Platform/SRE Engineer

Andela

Located in LATAM; main job time zone: Los Angeles, USA (UTC-8); time zone overlap requirement: 3 hours minimum; remote only

Full time

DevOps Engineer

May 2

ROLE

Senior Site Reliability Engineer/DevOps

 

ABOUT THE ROLE

Develop infrastructure and tooling solutions for complex product engineering projects towards a business goal; ship software that matters to our customers and to our company

Scale application infrastructure to handle over 1M requests per minute using pragmatic & distributed architectures

Develop a fully automated observability stack based on the existing SaaS system, and extend it to predict capacity needs from usage patterns

Improve engineering velocity by implementing best practices and frameworks; improve coding efficiency and quality

Be a steward of quality, scalability, security, and performance. You'll work with other engineers to ensure that we have a solid foundation that serves our customers, and enables the team to continue building a great product

Drive sound, data-driven decision-making; analyze data insights to uncover opportunities to improve architecture for a great customer experience

Implement excellence in engineering process & culture across teams

Collaborate with product and engineering teams effectively and with empathy; promote technical learning across teams

Mentor and advocate for junior engineers; push the boundary of their comfort zone. Build trust and respect in the team

Interview and evaluate engineering candidate technical capabilities to help grow our engineering team

 

MUST HAVES

You’re experienced with scaling and shipping complex, distributed services in AWS or equivalent using infrastructure as code

You’re an expert at building infrastructure and tools that make it simple to develop and run code; your stakeholders love to use the things you build

You have a broad understanding of system and application architectures and a strong observability background

You have strong programming fundamentals, ideally in a variety of languages like Ruby, Python, or Node

You’re an expert at measuring query latencies and at resource allocation and management

You have experience deploying to and orchestrating containers (Docker, Kubernetes, etc.)

You prioritize automation and continuous testing in order to optimize speed while simultaneously enhancing quality and security

You have an understanding of operational toil, observability, performance, and scalability

You are familiar with incident response and management tools like PagerDuty

OVERLAP

6 hours overlap with MST (GMT-6)

HARDWARE/VPN REQUIREMENTS

None

OFFER

Full-time, long-term


Each applicant is required to complete a skills assessment and an English assessment to be considered for this role. Applicants are required to have a minimum of 3 years of work experience.
