About OLIX
AI is growing faster than any technology in history, and the explosion in demand has created a massive infrastructure gap: we can no longer build chips or power stations fast enough to keep up. The industry is still leaning on a ten-year-old hardware blueprint that has reached its limit. A new paradigm that is faster and more efficient will be the biggest economic opportunity of the next century and create the most important company of the next decade. The OLIX Decode Accelerator 1 (DX-1) is the first accelerator architected specifically for decode. Rack-scale co-design of logic, data movement, packaging, optics, and interconnect enables a step change in system-level performance.
The Role
As an Architect, Staff, or Senior Software Platform Integration Engineer, you will be the technical authority on how OLIX serves large models as hyperscale AI infrastructure - spanning distributed inference engines, serving-runtime integration, KV cache and memory hierarchy, and the orchestration and networking layers that make serving real.
We are looking for experienced Architect-, Staff-, Principal- and Senior-level engineers who have shipped distributed inference at scale and have strong opinions about how modern serving stacks - vLLM, SGLang, NVIDIA Dynamo - should be extended onto novel accelerators. You will partner closely with leadership and cross-functional engineering teams to set the technical direction for distributed inference on DX-1, define the architectural contracts the rest of the platform builds against, and make the hard technical calls across the serving stack. You bring rare depth across the full stack, the judgment to know what matters and why, and the influence to drive alignment across engineering without relying on authority.
Responsibilities
Shape the technical vision. Partner with leadership to set long-term technical direction across serving-engine integration (vLLM, SGLang, NVIDIA Dynamo), disaggregated prefill/decode, KV cache management (NIXL / Mooncake TE), cluster orchestration, fleet management, networking, and deployment - and own the architectural integrity of that vision across the full platform lifecycle.
Translate strategy into architecture. Work with cross-functional partners to turn long-term business direction into concrete architectural priorities, and identify where technical investments will have the highest leverage.
Set the architectural bar. Define the principles, interface contracts, and standards the organisation builds to - across scheduling, fleet operations, ingress/egress, and platform management - and ensure they hold across teams.
Make the hard calls. Own the technical decision-making across the platform stack: orchestration and scheduling architecture, fleet management systems, networking design, and deployment strategy.
Lead through influence. Drive alignment across teams without direct authority - through rigour, clarity, and the quality of your technical thinking.
Raise the technical ceiling. Mentor and stretch engineers across the organisation - not as a manager, but as a technical leader who holds the bar high and helps others reach it.
Skills & Experience
Deep expertise in distributed inference infrastructure (vLLM, SGLang, NVIDIA Dynamo), the associated networking (NCCL, RoCE, InfiniBand) and KV cache management (NIXL, Mooncake TE) technologies, and rail optimisation for linking accelerator clusters.
Deep expertise in cluster management at hyperscale on bare-metal, custom-accelerator fleets - provisioning, scheduling, and lifecycle ownership across thousands of nodes, including safe firmware update orchestration rolled out at fleet scale without compromising production SLOs.
Track record driving technical outcomes in high-reliability production inference environments: latency and throughput SLOs, capacity and cost modelling, observability, incident management, and security at scale across fleets of accelerators.
Full lifecycle experience from early architecture through to production operations and long-tail reliability.
Outstanding technical communicator. You articulate architectural decisions clearly to engineers, managers, and senior leadership alike, and write design documents that become the organisational reference point.
Compensation & Equity
Competitive Salary: Commensurate with your experience, skills, and location.
Equity & Ownership: Meaningful stock options. You’re not just joining the mission; you’re owning a piece of it.
Proximity Bonus: We value your time. To minimise your commute and maximise your life, we offer an annual Living-Local Bonus if your residence is within 20 minutes of the office.
Retirement Benefits: Employer-contributed retirement plans to help you build long-term financial security.
Due to U.S. export control regulations, candidates’ eligibility to work at OLIX depends on their most recent citizenship or permanent residency status. We are generally unable to consider applicants whose most recent citizenship or permanent residence is in certain restricted countries (currently including Iran, North Korea, Syria, Cuba, Russia, Belarus, China, Hong Kong, Macau, and Venezuela). Applicants who have subsequently obtained citizenship or permanent residency in another country not subject to these restrictions may still be eligible.
