AI is not a side project for us — it's the core of what we're building toward. We design and deploy agentic AI systems, LLM-powered workflows, and machine learning infrastructure for enterprise clients across multiple industries. This role exists because we're at the stage where client demand for this kind of work has outpaced what the current team can deliver well. We need someone who has actually shipped AI systems into production — not just notebooks, not just demos — and who understands what it takes to make those systems reliable, maintainable, and genuinely useful for the people depending on them. The work is technically challenging and the landscape is moving fast. You'll need to keep up with it and bring that knowledge back into how we build.
What you'll do
- Design and deploy LLM-powered systems and autonomous AI agents that go into production environments, not proofs of concept that never ship
- Build and maintain retrieval-augmented generation (RAG) pipelines, including document processing, vector database management, embedding strategies, and retrieval optimisation
- Work with clients and internal teams to define the right technical approach for AI-augmented workflows — which means being honest when LLMs are the wrong tool and proposing alternatives
- Fine-tune and evaluate language models for domain-specific applications, including building evaluation frameworks that measure what actually matters rather than just benchmark scores
- Build the infrastructure that keeps AI systems observable in production — logging, monitoring, cost tracking, and failure detection for pipelines that are often probabilistic by nature
- Contribute to internal frameworks and tooling that make it faster and more reliable to spin up AI systems for new clients without starting from scratch each time
- Stay genuinely current on the state of the field — new model releases, emerging frameworks, research that matters — and translate that into decisions about what we adopt and when
What we're looking for
- Three or more years of applied machine learning or AI engineering experience, with clear examples of systems that shipped to real users in production
- Hands-on experience building with LLM APIs (OpenAI, Anthropic, or similar) and agent frameworks like LangChain, LlamaIndex, CrewAI, or equivalents
- Strong Python skills — not just writing scripts, but structuring maintainable ML codebases, managing dependencies, and thinking about performance
- Real experience with vector databases (Pinecone, Weaviate, Chroma, or similar) and a solid understanding of embedding models and semantic search
- Understanding of how to make AI systems production-ready — latency management, cost control, fallback handling, and monitoring that actually works
- Familiarity with fine-tuning workflows and the trade-offs between fine-tuning, prompt engineering, and RAG for different use cases
- Experience working with enterprise clients or in a consulting context is a significant advantage, particularly if it involved navigating data privacy requirements