Senior Staff Engineer - Public Cloud Operations (AWS/Alibaba)

OKX

Operations
Singapore
Posted on Apr 30, 2025
OKX will be prioritising applicants who have a current right to work in Singapore, and do not require OKX's sponsorship of a visa.

Who We Are

At OKX, we believe that the future will be reshaped by crypto, which will ultimately contribute to every individual's freedom. OKX is a leading crypto exchange and the developer of OKX Wallet, giving millions access to crypto trading and decentralized crypto applications (dApps). OKX is also a brand trusted by hundreds of large institutions seeking access to crypto markets. We are safe and reliable, backed by our Proof of Reserves. Across our multiple offices globally, we are united by our core principles: We Before Me, Do the Right Thing, and Get Things Done. These shared values drive our culture, shape our processes, and foster a friendly, rewarding, and diverse environment for every OK-er. OKX is part of OKG, a group that brings the value of Blockchain to users around the world through our leading products OKX, OKX Wallet, OKLink and more.

About The Opportunity

We are currently seeking a Senior Staff Engineer to join our Singapore team. You will be responsible for the full lifecycle management of enterprise-level hybrid cloud infrastructure, leading the unified orchestration, operations, and cost optimisation of AWS and Alibaba Cloud resources while ensuring high availability, high performance, and compliance.

What You’ll Be Doing

  1. Cloud Platform Architecture & Operations
    1. Plan, deploy, monitor, and maintain AWS services (EC2, S3, VPC, Lambda, EKS, etc.) and Alibaba Cloud services (ECS, OSS, VPC, Function Compute, ACK, etc.).
    2. Design highly available, auto-scaling cloud architectures, optimising network (e.g., Alibaba Cloud CEN, AWS Direct Connect), storage, and compute resource configurations.
  2. Monitoring & Incident Management
    1. Implement full-stack monitoring and alerting using cloud-native tools (AWS CloudWatch, Alibaba Cloud CloudMonitor) and open-source solutions (Prometheus+Grafana, ELK).
    2. Lead critical incident response, perform root cause analysis (e.g., of resource contention, misconfigurations, network latency), and implement preventive measures.
  3. Cost Optimisation & Resource Management
    1. Analyse cloud resource usage and reduce costs via reserved instances, auto-scaling, and storage lifecycle policies (e.g., AWS S3 Intelligent-Tiering, Alibaba Cloud OSS Archive).
    2. Establish resource quota management strategies to prevent waste and overspending.
  4. Security & Compliance
    1. Implement cloud security baselines (security groups, IAM policies, Alibaba Cloud RAM permissions, AWS Security Hub), conduct regular security audits, and remediate vulnerabilities.
    2. Design granular access controls using AWS IAM and Alibaba Cloud RAM, and enforce database auditing (e.g., AWS CloudTrail + Alibaba Cloud DAS).
  5. Cross-Team Collaboration & Knowledge Sharing
    1. Collaborate with development teams to optimise application architectures and provide cloud-native solutions (serverless, microservices).
    2. Document operational procedures (SOP manuals) and lead internal technical training sessions.

What We Look For In You

  1. Technical Skills
    1. Mastery of core services (compute/storage/network/security) on AWS or Alibaba Cloud, and familiarity with the other platform.
    2. Proficiency in Linux/Windows system operations and automation tools (Shell/Python/Ansible).
    3. Hands-on experience with containerized operations (Kubernetes, ECS/EKS, ACK) and cloud-native technologies (e.g., Service Mesh).
  2. Experience Requirements
    1. 5+ years of operations experience, with at least 3 years focused on public cloud (AWS/Alibaba Cloud) environments managing 100+ instances.
    2. Experience in building cloud platforms from scratch, hybrid cloud architecture design, or large-scale migration projects (e.g., IDC-to-cloud) is preferred.
  3. Soft Skills
    1. Strong problem-solving skills with the ability to handle high-pressure operational challenges.
    2. Excellent communication skills to collaborate with development, testing, and security teams.
  4. Certifications & Education
    1. AWS Certified SysOps Administrator or Alibaba Cloud ACP/ACE certifications are preferred.
    2. Bachelor’s degree or higher in Computer Science, Network Engineering, or related fields.

Preferred Skills

  • Familiarity with multi-cloud management across platforms (AWS, Alibaba Cloud, Azure) or with FinOps cost optimisation methodologies.
  • Experience in cloud security practices, including Web Application Firewall (WAF) and DDoS protection (Alibaba Cloud Anti-DDoS Premium, AWS Shield).
  • Exposure to big data/AI operations (e.g., Alibaba Cloud MaxCompute, AWS EMR).
  • Team leadership experience is preferred.

Perks & Benefits

  • Competitive total compensation package
  • Learning and development (L&D) programs and education subsidies for employees' growth and development
  • Various team building programs and company events
  • Wellness and meal allowances
  • Comprehensive healthcare schemes for employees and dependants
  • More that we would love to tell you about along the way!