AMAX is an IT infrastructure and engineering company that designs, customizes, manufactures, and deploys advanced computing solutions for AI, HPC, and enterprise use cases. Across servers, workstations, rack systems, and OEM infrastructure, the company helps customers build computing environments that are scalable, high-performing, and reliable.
As a Hardware Solutions Engineer, you will help bridge customer requirements with engineering execution, shaping customized server and infrastructure solutions that support demanding AI, HPC, and enterprise workloads. This role sits at the intersection of solution design, technical validation, customer engagement, and cross-functional delivery, with a strong focus on turning complex requirements into production-ready hardware platforms.
Responsibilities
-
Design, architect, and implement high-performance computing (HPC) and AI/ML infrastructure solutions, including GPU-accelerated clusters, high-speed storage, and low-latency networking.
-
Lead the end-to-end solution lifecycle from pre-sales architecture through system integration, validation, and deployment in customer environments.
-
Collaborate with sales and customers to translate requirements into scalable AI/HPC system designs, including compute, GPU topology, interconnect (InfiniBand/Ethernet), and storage architectures.
-
Drive OEM server and appliance configuration, including BOM definition, firmware/BIOS tuning, thermal/power considerations, and manufacturability alignment.
-
Work closely with manufacturing and integration teams to ensure system-level validation, rack integration, burn-in testing, and production readiness for complex server appliances.
-
Deliver technical presentations, demos, and proof-of-concepts (POCs) focused on AI workloads (training/inference), HPC simulations, and data-intensive applications.
-
Support cluster bring-up and optimization, including OS provisioning, workload managers (e.g., Slurm), container platforms, and GPU software stacks (CUDA, drivers, AI frameworks).
-
Provide advanced troubleshooting and performance tuning across compute, GPU, storage, and networking subsystems.
-
Perform on-site or remote deployment support, including rack-level integration, cluster commissioning, and acceptance testing.
-
Develop and maintain technical documentation, including solution architectures, integration guides, and manufacturing/test procedures.
-
Interface with cross-functional teams (engineering, manufacturing, support) to resolve complex system-level issues and improve product quality.
-
Stay current with emerging technologies in AI infrastructure, HPC architectures, GPU platforms, and data center design.
Requirements
-
Bachelor’s degree in Computer Science, Electrical Engineering, or related field (or equivalent experience).
-
Hands-on experience with HPC clusters, AI/ML infrastructure, or GPU-accelerated systems.
-
Strong knowledge of server architecture (x86/ARM), CPU/GPU platforms (NVIDIA/AMD), memory, storage, and networking.
-
Experience with high-speed interconnects (InfiniBand, RDMA, NVLink, high-performance Ethernet).
-
Familiarity with OEM server platforms and system integration, including BIOS/BMC, firmware, and hardware validation.
-
Understanding of manufacturing and system integration processes (rack integration, burn-in, QA, and deployment workflows).
-
Experience with Linux environments, scripting (Bash/Python), and system automation tools.
-
Strong analytical, troubleshooting, and performance optimization skills across the full hardware and system stack.
-
Experience working across customers, sales, and engineering/manufacturing teams.
Preferred Skills
-
Knowledge of AI/HPC software stacks such as CUDA, Kubernetes, Slurm, Docker, and AI frameworks like PyTorch/TensorFlow.
-
Experience designing or deploying AI training clusters or large-scale HPC environments.
-
Exposure to liquid cooling, high-density rack design, or power/thermal optimization.
-
Familiarity with OEM/ODM workflows and contract manufacturing environments.
-
Experience with benchmarking and performance tuning (e.g., MLPerf, IO benchmarks, MPI workloads).
Benefits