Ant-Tech

Site Reliability Engineer (SRE), Terminal Infrastructure

Employee
System and Network Administration

Location

  • New York City, USA (On-site)
  • Exceptional candidates based in London, UK may also be considered

💰 Salary

Competitive + Equity/Token Package

About the Role

We are seeking an experienced Site Reliability Engineer (SRE) to build, operate, and scale highly available, mission-critical infrastructure supporting a fast-growing global technology platform.

This is a hands-on role for an infrastructure expert who thrives in high-scale, high-availability environments and enjoys solving complex operational challenges end-to-end.

Tasks

Key Responsibilities

  • Design, build, and maintain highly available infrastructure and platform services.
  • Implement and manage Infrastructure-as-Code (IaC) solutions.
  • Develop automation tooling and operational systems using Python, Go, or similar languages.
  • Operate multi-region, high-availability environments with strong reliability standards.
  • Lead incident response, on-call rotations, root cause analysis, and postmortems.
  • Manage cloud-native infrastructure across AWS and/or GCP.
  • Build and optimize Kubernetes-based platforms and containerized workloads.
  • Collaborate with engineering teams to improve system reliability, scalability, and operational excellence.
  • Evaluate infrastructure risks and make sound technical trade-offs under pressure.

Requirements

Must-Have Skills & Experience

  • Extensive experience as a Site Reliability Engineer, Infrastructure Engineer, or Platform Engineer.
  • Deep expertise in Infrastructure-as-Code (Terraform, OpenTofu, or equivalent).
  • Strong understanding of networking, distributed systems, and cloud infrastructure.
  • Experience managing production systems in highly available, multi-region environments.
  • Hands-on expertise with:
      • AWS and/or GCP
      • Kubernetes
      • Infrastructure automation
      • CI/CD pipelines
  • Proficiency in Python, Go, or similar languages for operational tooling and automation.
  • Experience participating in on-call rotations and incident management processes.
  • Strong problem-solving skills with the ability to independently drive solutions from concept to production.

Nice-to-Have

  • Infrastructure security and compliance experience (SOC2, ISO, IAM architecture).
  • Experience with high-throughput data and streaming platforms such as Kafka, Redpanda, or PostgreSQL.
  • Familiarity with Web3, crypto, or digital asset infrastructure environments.
  • Track record of mentoring engineers and improving operational standards across teams.

What We're Looking For

  • Infrastructure-first mindset.
  • Strong ownership and accountability.
  • Ability to navigate ambiguity and make pragmatic decisions.
  • Experience building foundational systems and scaling platforms.
  • Excellent communication and collaboration skills.
  • Passion for reliability, automation, and operational excellence.

Benefits

  • Competitive compensation package
  • Equity and/or token participation
  • Opportunity to work on large-scale, globally distributed systems
  • High-impact engineering environment
  • Work alongside experienced infrastructure and platform engineers

Visa Sponsorship

Visa support may be available for exceptional candidates, including H-1B transfers and potentially TN visas for qualified Canadian citizens.

Updated: 12 hours ago
Job ID: 16279033

Ant-Tech

11-50 employees
IT Services and IT Consulting

Ant-Tech is a headhunting agency in France that provides recruitment services for companies across various industries. With a team of experience…
