Enterprise GenAI platform architecture

Layered platform. The top tier shows two use case classes: agentic use cases (which flow through the Agent Service Layer) and direct GenAI use cases (which call the GenAI Service Layer directly). The Agent Service Layer contains agent runtime, agent memory (session, short-term, long-term), agent auth for attended and unattended flows, tools and APIs and MCPs registry, multi-agent orchestration, planner and reasoning, human-in-the-loop, and trace and replay and eval. The GenAI Service Layer foundation contains smart routing, multi-cloud abstraction, caching, streaming and structured output, RAG and vector databases, and model registry. Cross-cutting platform capabilities wrap everything: observability, cost governance, access control, policy and compliance, infra management, audit, safety, secrets management, data residency, SLA and reliability, and disaster recovery.

Agent Service Layer (agentic systems framework)

  • Agent Runtime: execution · lifecycle · sandbox
  • Agent Memory: session · short-term · long-term
  • Agent Auth: attended + unattended flows
  • Tools · APIs · MCPs: registry · invocation · parallel
  • Orchestration: multi-agent · handoffs
  • Planner & reasoning: ReAct · reflection · plans
  • Human-in-the-loop: approvals · checkpoints
  • Trace · replay · eval: agent-level test harness

GenAI Service Layer (model-agnostic foundation · multi-cloud · multi-vendor)

  • Smart routing: provider · model · tier
  • Multi-cloud · multi-vendor: OpenAI · Anthropic · Google · AWS
  • Caching & token mgmt: prompt · response · cost optimisation
  • Streaming & schema: tokens · structured output
  • RAG & VectorDB: embeddings · retrieval · hybrid
  • Model registry: versioning · catalog · canary

Cross-cutting platform capabilities

  • Observability & tracing · Cost governance · Access control & IAM · Policy & compliance · Infra & capacity mgmt · Rate limiting & quotas · Audit & lineage · Safety & guardrails · Secrets & data privacy · Data residency · SLA & reliability · Disaster recovery

Use cases: Agentic + direct GenAI · 100+ in production
Agent Service Layer: Runtime · Memory (session, short, long) · Auth (attended + unattended) · Tools, APIs, MCPs · Orchestration · Planner · HITL · Trace & eval
GenAI Service Layer: Smart routing · Multi-cloud · Caching · Streaming & schema · RAG & VectorDB · Model registry
Cross-cutting capabilities: Observability · Cost · Access · Policy · Infra · Audit · Safety · Secrets · Residency · SLA · DR · Rate-limiting

Agentic flows go through the Agent Service Layer. Direct GenAI flows call the foundation directly. Cross-cutting capabilities apply to both.

Framework: 100% of GenAI traffic · 100+ use cases

GenAI Service Layer

When every team needs LLMs, you need infrastructure that turns chaos into reliability. A single managed surface for 100+ production use cases, multi-cloud, multi-vendor, model-agnostic.

  • Smart routing across providers and model tiers
  • Prompt + response caching for latency and cost
  • Observability, audit trails, and policy enforcement
  • Cost governance, quota management, and chargeback
  • Drop-in client SDKs for any internal team
  • Model-agnostic abstraction over OpenAI, Anthropic, Google, AWS, and more
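
To make the first two bullets concrete, here is a minimal sketch of tier-based routing with provider fallback and a prompt/response cache. It is an illustration under stated assumptions, not the platform's actual code: the `Gateway` and `Route` names, the `vendor_a`/`vendor_b` stubs, and the model names are all hypothetical, and a real deployment would call vendor SDKs instead of local functions.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical provider callable: takes (model, prompt) and returns text.
ProviderFn = Callable[[str, str], str]


@dataclass
class Route:
    provider: str
    model: str


@dataclass
class Gateway:
    providers: Dict[str, ProviderFn]
    # Ordered candidates per tier; earlier entries are preferred.
    tiers: Dict[str, List[Route]]
    cache: Dict[str, str] = field(default_factory=dict)

    def _key(self, tier: str, prompt: str) -> str:
        # Cache key covers the tier and the exact prompt text.
        return hashlib.sha256(f"{tier}\x00{prompt}".encode()).hexdigest()

    def complete(self, tier: str, prompt: str) -> Tuple[str, str]:
        """Return (text, source), where source is 'cache' or 'provider/model'."""
        key = self._key(tier, prompt)
        if key in self.cache:
            return self.cache[key], "cache"
        errors = []
        for route in self.tiers[tier]:
            try:
                text = self.providers[route.provider](route.model, prompt)
            except Exception as exc:  # fall through to the next candidate
                errors.append(f"{route.provider}/{route.model}: {exc}")
                continue
            self.cache[key] = text
            return text, f"{route.provider}/{route.model}"
        raise RuntimeError("all routes failed: " + "; ".join(errors))


# Stub providers standing in for real vendor clients (assumptions).
def flaky(model: str, prompt: str) -> str:
    raise TimeoutError("upstream timeout")


def stable(model: str, prompt: str) -> str:
    return f"[{model}] {prompt.upper()}"


gateway = Gateway(
    providers={"vendor_a": flaky, "vendor_b": stable},
    tiers={"fast": [Route("vendor_a", "small-a"), Route("vendor_b", "small-b")]},
)
first = gateway.complete("fast", "hello")   # vendor_a fails, vendor_b answers
second = gateway.complete("fast", "hello")  # served from the cache
```

The first call falls through the failing provider to the second candidate; the repeat call never leaves the cache, which is where the latency and cost savings come from.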

Framework: 16w → 2w dev time

Agent Service Layer

Agents aren't a feature you ship once. They're a class of system that has to be built, evaluated, validated, and operated. Built atop the GenAI Service Layer, this framework standardises every step of that lifecycle.

  • Standardised agent harness, lifecycle, and orchestration
  • Reusable tool library with parallel/agentic call support
  • Built-in memory, retrieval, and state management
  • Human-in-the-loop checkpoints and approval flows
  • Evaluation harness with regression test suites
  • Full idea-to-prod pipeline (build, eval, validation, load test, deploy) in 2 weeks or less, down from 16 weeks
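
As a rough illustration of the evaluation-harness bullet, the sketch below runs a small regression suite against an agent and blocks promotion if the pass rate drops below a baseline. Everything here is assumed for the example: `EvalCase`, `run_suite`, `gate`, the stub agent, and the two test cases are hypothetical names and data, not the framework's real interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalCase:
    name: str
    prompt: str
    check: Callable[[str], bool]  # returns True when the output is acceptable


def run_suite(agent: Callable[[str], str], cases: List[EvalCase]) -> Dict:
    """Run every case once and report per-case results plus the pass rate."""
    results = {c.name: bool(c.check(agent(c.prompt))) for c in cases}
    return {"results": results, "pass_rate": sum(results.values()) / len(cases)}


def gate(report: Dict, baseline_pass_rate: float, tolerance: float = 0.0) -> bool:
    """Allow promotion only if the pass rate has not regressed past the baseline."""
    return report["pass_rate"] + tolerance >= baseline_pass_rate


# Stub agent and invented cases, for illustration only.
def stub_agent(prompt: str) -> str:
    return "refund approved" if "refund" in prompt else "unknown"


cases = [
    EvalCase("refund", "process refund", lambda out: "approved" in out),
    EvalCase("greeting", "say hello", lambda out: "hello" in out),
]
report = run_suite(stub_agent, cases)
can_ship = gate(report, baseline_pass_rate=0.75)
```

Here the stub agent passes one of two cases, so a 0.75 baseline blocks the release while a 0.5 baseline would let it through.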

Product: 85K queries/wk · 80%+ CSAT

Multi-agent customer chatbot

The first proof that the Agent Service Layer scales: a customer-facing chatbot handling 85,000+ queries per week, with CSAT lifted from ~50% to 80%+ over the program lifecycle.

  • Multi-agent orchestration with specialised sub-agents
  • Parallel tool calls and agentic delegation
  • Human-in-the-loop escalation on high-stakes flows
  • Persistent interaction + knowledge memory across turns
  • Multi-step transactional workflows (booking, modification, support)
  • Real-time knowledge retrieval from internal sources

Partnerships: OpenAI · Google · Anthropic · AWS · Salesforce

Industry partnerships

The major labs ship faster than anyone can keep up with. Active engagement and pre-release evaluations with industry-leading AI organizations turn that velocity into enterprise advantage: co-shaping roadmaps, evaluating frontier capabilities pre-release, and translating research into production. Headlined by an industry-first major-airline × OpenAI collaboration covering enterprise chatbot enhancement, multimodal customer servicing, and staff AI assistance, all built on the GenAI + Agent Service Layers.

  • OpenAI: Industry-first major-airline collaboration covering enterprise chatbot, multimodal customer servicing, and staff AI assistance.
  • Anthropic: Enterprise readiness collaboration around Claude in production, agentic systems, and safety-first deployment patterns.
  • Google: Frontier model access, multimodal capabilities, and platform-level integration via Vertex AI.
  • AWS: Bedrock-based multi-vendor model access, enterprise compliance, and infrastructure scaling.
  • Salesforce: Agentforce, Einstein, and customer-engagement-layer GenAI integration.

Engagement: Cross-divisional · C-suite reach

Internal engagements

The hardest part of enterprise AI is not the model; it's the people. I personally design and lead programs that move the organization from "GenAI curious" to "GenAI capable", spanning individual contributors to the C-suite.

  • Hands-on training and workshops for engineering teams
  • Upskilling programs for technical and product teams adopting GenAI
  • 1:1 coaching for senior engineers and architects on agentic design
  • Mentorship for early-career engineers and technical leads
  • Executive briefings and advisory for senior leadership and the C-suite
  • Train-the-trainer scaling across business divisions

Research repos · how to read this row: Horizontal primitives, not point solutions

Three building blocks for enterprise knowledge work.

The three repos below are not tied to a single business unit or use case. They are cross-domain primitives, each attacking a different failure mode in how organizations turn information into decisions:

  • DocIQ · structured content already on disk. Useful anywhere there are long, layout-bound documents: legal contracts, compliance and regulatory filings, engineering specs, audit responses, supplier and procurement records, technical manuals, policy and HR handbooks, incident and safety reports.
  • Tacit · expertise that never made it onto disk. Useful anywhere the most valuable knowledge lives in senior practitioners' heads: dispatcher and operations heuristics, frontline service nuances, engineer-led troubleshooting playbooks, regulatory interpretation, supplier-management know-how, retiring-expert succession capture.
  • SmartResearch · open questions across both. Useful anywhere a defensible, cited briefing is needed on something nobody has answered yet: market and competitive intelligence, regulatory landscape scans, tech and vendor scouting, policy research, executive memos, post-incident root-cause briefs.

Each is designed to be embedded across functions: commercial, operations, engineering, legal, compliance, HR, finance, safety. The frame holds whether the user is a frontline analyst, a domain expert, or an executive.

Research repo: Private build · Multi-modal · Agentic

DocIQ

What's novel: high-accuracy enterprise document Q&A with no vector database and no embeddings. Built to handle the documents that ordinary RAG stacks fail on: long contracts, regulatory filings, technical specs with tables and diagrams.

  • VectorDB-less retrieval. BM25 plus keyword and regex search, with on-demand neighbour expansion. Cheaper, faster, debuggable, and reproducible. No embedding drift.
  • Page-level multi-modality. Each page is scored for visual complexity; complex pages (tables, charts, scans) are automatically screenshotted and read back to the model, so layout-bound content does not get flattened.
  • Agentic planner-executor-verifier loop. The model can re-plan, expand search, or read full pages when initial retrieval is insufficient, rather than failing on a thin top-k window.
  • Configurable provenance, citation, and confidence reporting on every answer.
  • Inference-mode-driven budgets (Fast, Balanced, Thorough) tune iterations, top-k, screenshot caps, and final-answer length, end-to-end.
  • Where it lives: any function with long, layout-bound documents. Same engine serves legal contract review, compliance Q&A, engineering spec lookups, audit responses, procurement record search, and policy/HR navigation.
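
The VectorDB-less retrieval bullet can be sketched in a few dozen lines of standard-library Python. This is a minimal illustration, not DocIQ's implementation: the `BM25Index`, `regex_search`, and `expand_neighbours` names, the `k1`/`b` defaults, the one-chunk window, and the sample contract text are all assumptions. The IDF uses a common non-negative variant of the Okapi BM25 formula.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


class BM25Index:
    def __init__(self, chunks: List[str], k1: float = 1.5, b: float = 0.75):
        self.chunks, self.k1, self.b = chunks, k1, b
        self.tfs = [Counter(tokenize(c)) for c in chunks]
        self.lengths = [sum(tf.values()) for tf in self.tfs]
        self.avg_len = sum(self.lengths) / len(chunks)
        n = len(chunks)
        df = Counter(term for tf in self.tfs for term in tf)
        # Non-negative IDF variant: ln(1 + (N - df + 0.5) / (df + 0.5))
        self.idf = {t: math.log(1 + (n - d + 0.5) / (d + 0.5)) for t, d in df.items()}

    def score(self, query: str, i: int) -> float:
        tf = self.tfs[i]
        norm = 1 - self.b + self.b * self.lengths[i] / self.avg_len
        total = 0.0
        for term in tokenize(query):
            f = tf.get(term, 0)
            if f:
                total += self.idf[term] * f * (self.k1 + 1) / (f + self.k1 * norm)
        return total

    def search(self, query: str, top_k: int = 3) -> List[int]:
        """Return indices of the best-scoring chunks, dropping zero-score chunks."""
        scored = [(self.score(query, i), i) for i in range(len(self.chunks))]
        hits = sorted(((s, i) for s, i in scored if s > 0), key=lambda p: (-p[0], p[1]))
        return [i for _, i in hits[:top_k]]


def regex_search(chunks: List[str], pattern: str) -> List[int]:
    """Exact pattern lookup, useful for clause numbers, amounts, and identifiers."""
    return [i for i, c in enumerate(chunks) if re.search(pattern, c)]


def expand_neighbours(hits: List[int], n_chunks: int, window: int = 1) -> List[int]:
    """Add the chunks just before and after each hit, returned in document order."""
    keep = set()
    for i in hits:
        keep.update(range(max(0, i - window), min(n_chunks, i + window + 1)))
    return sorted(keep)


# Invented sample document, split into chunks.
chunks = [
    "Section 1. Definitions and interpretation.",
    "Section 2. The supplier shall deliver goods within 30 days.",
    "Section 3. Late delivery incurs a penalty of 2% per week.",
    "Section 4. Governing law and jurisdiction.",
    "Section 5. Notices must be sent in writing.",
]
index = BM25Index(chunks)
hits = index.search("late delivery penalty", top_k=1)
context = expand_neighbours(hits, len(chunks))
percent_clauses = regex_search(chunks, r"\d+%")
```

Because scoring is a pure function of term counts, the same query over the same chunks always returns the same ranking, which is what makes this style of retrieval easy to debug and reproduce.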

Research repo: Private build · Local-first · Privacy-first

Tacit

What's novel: instead of indexing documents that already exist, Tacit elicits the expertise that nobody wrote down. Conversational interviews surface tacit knowledge from senior staff and convert it into structured, queryable knowledge graphs, fully on-device.

  • Elicitation, not ingestion. An interviewer agent runs structured conversations to pull out heuristics, edge cases, and "how I actually decide" knowledge that lives in people's heads, not their files.
  • Structured KG output, not just chunks. Extracted into typed entities, relations, and decision rules, queryable as a graph rather than as a vector blob.
  • Fully on-device LLM execution. No data leaves the machine, by construction. Privacy is an architectural property, not a policy.
  • Designed for regulated industries and sensitive expert domains where the most valuable knowledge is also the most unsharable.
  • Successor to the earlier LLMwiki experiment, this time built for retention rather than for retrieval.
  • Where it lives: any function whose best knowledge is unwritten. Retiring-engineer succession capture, dispatcher and operations heuristics, customer-service edge-case know-how, regulatory interpretation, supplier and partner-relationship intelligence.
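
One way to picture the "typed entities, relations, and decision rules" output is the sketch below. It is a hypothetical schema written with standard-library dataclasses, not Tacit's actual data model: the class names, the `mitigated_by` predicate, and the supplier example are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Entity:
    id: str
    type: str    # e.g. "Condition" or "Action"
    label: str


@dataclass(frozen=True)
class Relation:
    source: str      # entity id
    predicate: str   # e.g. "mitigated_by"
    target: str      # entity id


@dataclass(frozen=True)
class DecisionRule:
    id: str
    when: Tuple[str, ...]   # condition entity ids that must all hold
    then: str               # action entity id
    rationale: str          # the expert's own explanation


@dataclass
class KnowledgeGraph:
    entities: Dict[str, Entity] = field(default_factory=dict)
    relations: List[Relation] = field(default_factory=list)
    rules: List[DecisionRule] = field(default_factory=list)

    def add_entity(self, entity: Entity) -> None:
        self.entities[entity.id] = entity

    def add_relation(self, relation: Relation) -> None:
        # Reject edges that point at entities the graph does not know about.
        for end in (relation.source, relation.target):
            if end not in self.entities:
                raise KeyError(f"unknown entity: {end}")
        self.relations.append(relation)

    def neighbours(self, entity_id: str, predicate: Optional[str] = None) -> List[Entity]:
        return [self.entities[r.target] for r in self.relations
                if r.source == entity_id and predicate in (None, r.predicate)]

    def applicable_rules(self, facts: Set[str]) -> List[DecisionRule]:
        """Rules whose conditions are all present in the observed facts."""
        return [rule for rule in self.rules if set(rule.when) <= facts]


# Invented example content, as if extracted from an interview.
kg = KnowledgeGraph()
kg.add_entity(Entity("supplier_late_twice", "Condition", "Supplier missed two deliveries"))
kg.add_entity(Entity("peak_season", "Condition", "Order falls in peak season"))
kg.add_entity(Entity("dual_source", "Action", "Split the order across two suppliers"))
kg.add_relation(Relation("supplier_late_twice", "mitigated_by", "dual_source"))
kg.rules.append(DecisionRule(
    "r1", ("supplier_late_twice", "peak_season"), "dual_source",
    "Two misses before peak season usually means a third is coming.",
))

matched = kg.applicable_rules({"supplier_late_twice", "peak_season"})
mitigations = kg.neighbours("supplier_late_twice", "mitigated_by")
```

Queries here are exact graph lookups rather than similarity searches, so a rule either applies to the stated facts or it does not.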

Research repo: Private build · Steerable · HITL at every stage

SmartResearch

What's novel: the inverse of Google's Deep Research. Where the incumbents run autonomously and hand you a finished report, SmartResearch is a fully steerable, human-in-the-loop pipeline where you approve, edit, regenerate, or jump back at every one of eight stages. The agent does not "decide for you" at any point.

  • Eight explicit checkpoints. Query intake, plan, sub-questions, source strategy, execution, findings, outline, report. Each one ends with the user's verdict, not the agent's.
  • Versioned editable timeline. Every artifact is Pydantic-typed and version-tracked. Free backward navigation with stale-flagging, cascade prompts, and per-stage restore. Think Git for a research process.
  • Hybrid web plus internal-document research. Per sub-question, the system can route to native web search (with a user-controlled trusted-domain allowlist or open web) and to private documents via DocIQ in two modes: retrieval-only or full generated-answer.
  • Parallel execution with live progress. Sub-questions fan out, 3 to 5 in parallel, with auto-retry, real-time per-step progress, and an in-pane streaming final report.
  • Single model, transparent cost. All reasoning and synthesis route through one Sonnet instance via the GenAI Service Layer. The token and dollar split is surfaced per stage.
  • Where it lives: any function that produces defensible briefings on open questions. Market and competitive intelligence, regulatory landscape scans, tech and vendor scouting, policy research, executive memos, post-incident root-cause briefs.
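
The versioned timeline with stale-flagging can be sketched as follows. This is an assumption-laden illustration rather than SmartResearch's code: the source describes Pydantic-typed artifacts, while this sketch swaps in standard-library dataclasses to stay dependency-free, and the `Timeline`, `Version`, and stage identifier names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

# The eight checkpoints named above, in pipeline order.
STAGES = ["query_intake", "plan", "sub_questions", "source_strategy",
          "execution", "findings", "outline", "report"]


@dataclass
class Version:
    number: int
    content: str
    approved: bool = False


@dataclass
class Timeline:
    history: Dict[str, List[Version]] = field(
        default_factory=lambda: {s: [] for s in STAGES})
    stale: Set[str] = field(default_factory=set)

    def save(self, stage: str, content: str) -> Version:
        """Record a new version and flag every later stage with content as stale."""
        versions = self.history[stage]
        version = Version(number=len(versions) + 1, content=content)
        versions.append(version)
        self.stale.discard(stage)
        for later in STAGES[STAGES.index(stage) + 1:]:
            if self.history[later]:
                self.stale.add(later)
        return version

    def approve(self, stage: str) -> None:
        """The user's verdict; a stale stage must be regenerated before approval."""
        if stage in self.stale:
            raise ValueError(f"{stage} is stale; regenerate it first")
        self.history[stage][-1].approved = True

    def restore(self, stage: str, number: int) -> Version:
        """Per-stage restore: re-save an old version as the newest one."""
        return self.save(stage, self.history[stage][number - 1].content)

    def next_stage(self) -> Optional[str]:
        """First stage that is missing, unapproved, or stale."""
        for stage in STAGES:
            versions = self.history[stage]
            if not versions or not versions[-1].approved or stage in self.stale:
                return stage
        return None


timeline = Timeline()
timeline.save("query_intake", "Q v1")
timeline.approve("query_intake")
timeline.save("plan", "Plan v1")
timeline.approve("plan")
timeline.save("sub_questions", "SQ v1")
timeline.save("plan", "Plan v2")   # the user jumps back and edits the plan
stale_after_edit = set(timeline.stale)
resume_at = timeline.next_stage()
```

Editing the plan after sub-questions exist marks the sub-questions stale and sends the user back to re-approve the plan, while the earlier plan version stays in history for `restore`.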

Voice / Multimodal: Beyond STT-text-TTS

Voice AI for internal staff operations

Legacy speech-to-text-to-speech pipelines compound errors and add latency at every stage. Native end-to-end streaming voice replaces them with real-time, low-latency, prosody-aware conversation for internal staff in high-stakes operational workflows.

Award: Gold · Asian Design Awards 2024

AI-powered travel-planning experience

GenAI search + smart flight recommendation, built atop the GenAI Service Layer. Doubled customer satisfaction on AI search; the recommendation surface averages 2,000 queries/day with click-through up from 23.8% to 34.4% in three months.
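
For readers comparing the two click-through figures, the arithmetic works out as below, assuming both percentages are measured on the same basis (the source does not say how they were measured):

```latex
\[
\Delta_{\text{abs}} = 34.4\% - 23.8\% = 10.6 \text{ percentage points},
\qquad
\Delta_{\text{rel}} = \frac{34.4 - 23.8}{23.8} \approx 0.445 \approx 44.5\%
\]
```

That is a gain of about 10.6 points in absolute terms, or roughly a 45% relative lift over the starting rate.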

Not exhaustive: Other active threads

More on the workbench.

The above is a snapshot of the most visible work, not the full surface. Other ongoing threads and topics include:

  • GenAI content authenticity: Provenance and synthetic-media detection via C2PA signing and SynthID-style watermarking, applied across internal documents, customer communications, and outbound assets.
  • Enterprise process redesign: Rewiring core operational processes end-to-end around data-driven decisions and AI-enabled pipelines, not bolting AI on top of legacy workflows.
  • Adversarial robustness and guardrails: Defence-in-depth against prompt injection, jailbreaks, data exfiltration, and model-targeted attacks. Continuous red-teaming, not one-off audits.
  • AI governance and policy operationalisation: Translating regulatory frameworks and internal policy (EU AI Act, NIST AI RMF, model-card and data-lineage requirements) into enforceable runtime controls at the platform layer.
  • Evals at enterprise scale: Reusable evaluation harnesses, regression suites, and trust telemetry that hold up across 100+ production use cases.
  • Cost and capacity economics: Multi-vendor capacity planning, tier-based model routing, and unit-economics modelling for GenAI workloads at production scale.
  • Knowledge graphs + LLM hybrid systems: Structured retrieval and reasoning over enterprise knowledge, complementing or replacing pure vector RAG where precision matters.
  • On-device and edge inference: Local-first GenAI for sensitive, regulated, or low-connectivity workflows where data cannot leave the device.

For a specific topic or collaboration not listed here, the fastest way to start a conversation is LinkedIn.