micro1.

Systems & Infrastructure Specialist

Contract, Remote, Mid-level

$40 – $70/hr

Job Description

Job Title: Systems & Infrastructure Specialist


Job Type: Contractor


Location: Remote


Job Summary:

Join our customer's team as a Systems & Infrastructure Specialist for a high-intensity, expert-level project focused on training and optimizing AI models within intricate, containerized environments. In this terminal-intensive role, you'll apply a systems-first mindset to solve complex infrastructure challenges in real time. This one-time project offers significant opportunities for extension or transition into future phases for those who demonstrate elite technical execution.


Key Responsibilities:

• Navigate, troubleshoot, and recover dynamic infrastructure and long-running processes in real time using command-line tools.

• Master and manage highly containerized environments, including orchestrating Dockerized sandboxes and CI/CD workflows.

• Build, maintain, and optimize systems for AI model training and high-throughput compute environments.

• Respond swiftly to system errors, executing dynamic mid-operation replanning and recovery.

• Collaborate with engineering and AI teams to ensure seamless integration, reliability, and performance.

• Document system architectures, incident responses, and recovery protocols with meticulous clarity.

• Contribute expertise to evolving project needs, adapting to new technologies and scaling strategies as required.


Required Skills and Qualifications:

• Demonstrated expert proficiency in terminal environments for system builds, server administration, and infrastructure management.

• Advanced problem-solving skills for multi-step troubleshooting, filesystem navigation, and process management within containerized settings.

• Hands-on experience with Python, Bash, JavaScript/TypeScript, Go, Rust, and/or C/C++.

• Deep familiarity with build systems, package managers, databases, web servers, ML frameworks, version control, and cryptography tools.

• Proven ability to execute dynamic infrastructure recovery and optimize long-running processes under pressure.

• Strong written and verbal communication skills, with a passion for precise technical documentation.

• Systems multilingualism: versatility across operating systems, languages, and emerging DevOps tools.


Preferred Qualifications:

• Prior experience in high-compute environments for AI/ML workloads.

• Background in Site Reliability Engineering or DevOps roles focused on mission-critical infrastructure.

• Familiarity with advanced container orchestration and distributed system design.