Infrastructure Platform Support Engineer

Job Category: Technology and IT
Job Type: Contract
Job Location: USA
Salary: $50.00/hr
Company Name: eSOL (Japan)

Position Overview

We are seeking an experienced and highly motivated Infrastructure Platform Support Engineer with a strong background in cloud infrastructure, automation, and platform support, particularly in the context of Generative AI technologies. In this role, you will serve as a critical member of our platform operations team, focusing on the stability, performance, and continuous improvement of our AI/ML infrastructure. The ideal candidate will bring a combination of infrastructure support experience, scripting and automation skills, and hands-on exposure to modern cloud platforms and tools.

You will be responsible for ensuring the smooth operation of our Generative AI platform by identifying and resolving issues, supporting deployment activities, and enhancing the system’s fault tolerance and scalability. This position requires close collaboration with data science, DevOps, and engineering teams, as well as a proactive approach to performance optimization and knowledge sharing.


Key Responsibilities

  • Platform Resilience & Optimization

    • Evaluate and enhance the resilience and reliability of the AI platform’s data pipelines.

    • Ensure AI/ML model training and inference processes are fault-tolerant and capable of scaling efficiently.

    • Identify and eliminate performance bottlenecks in AI model execution and data flow.

  • Operational Support & Troubleshooting

    • Provide end-to-end technical support for users of the Generative AI platform, resolving issues and responding to inquiries.

    • Monitor platform logs and system health metrics to proactively identify and address operational concerns.

    • Collaborate with DevOps engineers to implement deployment best practices and streamline release workflows.

  • Automation & Deployment

    • Develop and maintain Terraform configurations for provisioning and managing cloud infrastructure services on AWS and Azure.

    • Automate testing procedures to ensure system stability during fault scenarios and version upgrades.

    • Support Kubernetes-based deployments, including services hosted on Amazon EKS (Elastic Kubernetes Service).

  • Documentation & Continuous Improvement

    • Maintain thorough documentation of platform support processes, common issues, resolutions, and configuration best practices.

    • Stay informed on advancements in Generative AI, cloud services, and infrastructure automation tools.

    • Contribute to the continuous improvement of support tools, processes, and operational runbooks.


Required Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or a related technical discipline.

  • Minimum 5 years of experience in infrastructure platform support, with hands-on involvement in troubleshooting and maintaining production environments.

  • At least 3 years of experience using Terraform for infrastructure provisioning and automation.

  • Minimum 5 years of hands-on Python development and scripting experience.

  • Strong background in shell scripting and automation of system administration tasks.

  • Proven experience in deploying, supporting, and troubleshooting cloud-native applications on AWS, Azure, and OpenShift platforms.

  • Direct experience with Kubernetes, including monitoring and troubleshooting services deployed on Amazon EKS.

  • Familiarity with SQL and relational databases such as PostgreSQL.

  • In-depth understanding of DevOps practices, CI/CD pipelines (especially with Jenkins), and infrastructure requirements.

  • Working knowledge of OpenAI tools and concepts related to Generative AI platforms.

  • Ability to provision and configure Generative AI-related cloud services in AWS and Azure environments (preferred).


Professional Skills

  • Excellent analytical and problem-solving abilities with attention to detail.

  • Strong verbal and written communication skills.

  • Ability to collaborate across teams in a fast-paced, agile environment.

  • Self-starter with a proactive mindset and eagerness to stay ahead of emerging technologies.


Benefits

  • Comprehensive health insurance coverage

  • Structured 8-hour workday schedule

  • Opportunity to work with cutting-edge technologies in the rapidly growing field of Generative AI
