Careers at VizopsAI

The Secure Runtime for Custom Enterprise Software

About Us

VizopsAI is the secure runtime for custom enterprise software. We provide the production layer that turns AI-generated internal tools into compliant, hardened applications, wrapping raw AI code in enterprise-grade identity, security, and infrastructure best practices. We're a lean, fast-moving, early-stage, venture-backed AI-native startup based in the SF Bay Area, building the industrialization layer for the AI app revolution. The founding team combines deep AI/ML research leadership from Google DeepMind, Amazon Alexa, and Oracle Cloud with AI product leadership from Verkada, AWS, and Sony; technical leadership includes PhDs from Johns Hopkins specializing in deep learning and optimization. We already have multiple customers signed and are hiring exceptional engineers to build the infrastructure that makes enterprise AI adoption safe and scalable.

Why Join VizopsAI

Benefits

Our Ethos

Open Positions (6)

Founding AI Infrastructure Engineer

Bay Area, CA · Full-time · $150k-$250k + equity
Distributed Systems · Kubernetes · ML Systems · Python · Performance

You'll own the engineering foundations for taking AI-powered workflows (including agentic systems) from prototype to reliable production. This is a high-ownership, hands-on role spanning distributed systems, ML systems, Kubernetes/cloud infrastructure, performance engineering, and enterprise-ready platform design. You'll build core platform capabilities (execution/runtime, data/model pipelines, APIs, observability, scaling) and ensure the system is fast, resilient, and easy to integrate into a customer's existing stack.

What You'll Do

  • Design and build production services for AI/ML workloads with clear SLOs (latency, throughput, availability), including synchronous request paths and safe fallbacks
  • Build a sandboxed execution runtime for running customer-provided or semi-trusted logic safely at scale (isolation boundaries, cold-start mitigation, warm pools, resource governance)
  • Build and operate large-scale data + embedding/model pipelines (batch processing, feature generation, training data preparation, serving-friendly formats)
  • Architect event-driven systems using Kafka-style streaming for ingestion, replayability, and decoupling offline pipelines from latency-sensitive online services
  • Own Kubernetes deployments end-to-end: rollouts (blue/green, rolling), autoscaling, networking (ingress, gateways, service-to-service), resource tuning, and on-call grade debugging (OOM, crash loops, throttling)
  • Build platform-grade observability: metrics, logs, traces, dashboards, alerting, and incident runbooks; instrument application-level profiling for memory/GC and performance regressions
  • Implement multi-tenant API controls: API key management, quotas, rate limiting (token/leaky bucket), request scheduling/fairness, backpressure, and retry strategies with jitter
  • Drive performance optimizations across the stack (parallelization, serialization/I/O bottlenecks, caching/batching), and rewrite hotspots when needed
  • Build enterprise-ready, integrations-first product surfaces that fit into Datadog/ServiceNow/identity/logging workflows rather than adding "one more dashboard"
  • Partner with product and customers to translate ambiguous requirements into robust, developer-friendly platform primitives
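To give a concrete flavor of the rate-limiting work described above, here is a minimal token-bucket sketch. All names and numbers are illustrative, not VizopsAI's actual implementation:

```python
import time


class TokenBucket:
    """Per-tenant token bucket: refills `rate` tokens/sec, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so tenants can burst immediately
        self.clock = clock
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Deterministic demo with a fake clock: a burst of 10 is admitted, the 11th
# request is rejected, then 1 second of refill admits 5 more requests.
t = [0.0]
bucket = TokenBucket(rate=5.0, capacity=10.0, clock=lambda: t[0])
burst = [bucket.allow() for _ in range(11)]
t[0] += 1.0
after = [bucket.allow() for _ in range(6)]
print(burst.count(True), after.count(True))  # 10 5
```

Injecting the clock keeps the limiter testable; in production the same shape extends naturally to per-tenant buckets, fairness scheduling, and backpressure signals.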

What We're Looking For

  • Strong experience building production distributed systems (service design, reliability patterns, scaling strategies, failure modes)
  • Proven track record in performance engineering (profiling, concurrency/parallelism, bottleneck elimination, cost/perf tradeoffs)
  • Deep hands-on experience with Kubernetes in production (deployments, networking, autoscaling, debugging, observability)
  • Solid cloud fundamentals (AWS/GCP/Azure): compute lifecycle automation, networking, IAM basics, cost controls, and operational tooling
  • Experience designing low-latency APIs and making pragmatic tradeoffs between latency, availability, and consistency
  • Familiarity with streaming/event systems (Kafka/event sourcing) and building pipelines that support replay and auditability
  • Strong programming skills in Python; at least one systems language (Go/Rust/C++) is a plus
  • Comfortable operating in a fast-moving startup environment: clear communication, high ownership, and good engineering judgment
  • 4+ years experience in backend/platform/infra roles (or equivalent depth)

Nice to Have

  • Experience building sandboxed runtimes or isolation layers (microVMs, gVisor, containers, secure execution boundaries)
  • Built large-scale embedding/recommendation or retrieval pipelines and served them in production
  • Experience with multi-tenant platform concerns: noisy neighbor mitigation, quota enforcement, fairness scheduling, per-tenant observability
  • Strong opinions on enterprise integrations and "platform adoption" mechanics (connectors-first, workflow-native design)
  • Experience implementing safe progressive delivery for ML-backed systems (shadowing, canarying, rollback automation, regression gating)
  • Experience with infrastructure-as-code (Terraform/CDK), CI/CD, and standing up production-readiness practices from scratch

Founding AI Engineer

Bay Area, CA · Full-time · $120k-$250k + equity
Python · RL · LLMs · Production ML

You'll work across the stack to design, train, and deploy RL-driven optimization loops for AI agents. You need to be deeply technical, hands-on, and comfortable moving between research, ML engineering, and production (APIs, infra, SLAs).

What You'll Do

  • Design RL loops for multi-step, tool-using agents (planning, retrieval, coordination)
  • Build backend services for training, evals, and online policy updates
  • Train reward models from traces/preferences; run A/Bs & interleavings safely
  • Scale distributed training/serving
  • Turn papers → running code → measurable uplift
  • Partner with product & customers: translate messy, real-world objectives into measurable rewards and robust policies

What We're Looking For

  • Strong programming in Python; bonus for TypeScript/Go/Rust for systems and APIs
  • LLM systems intuition including tool-use, planning, retrieval, structured outputs, and how evals/telemetry become learning signals
  • You've shipped production systems and can debug/profile at speed
  • Experimentation mindset: design clean evals, run ablations, read papers, and turn them into maintainable code
  • Data & infra fluency: event/trace pipelines, schema design, reproducibility, and versioning for datasets, policies, and rewards
  • Clear communication in a fast, ambiguous environment
  • 2+ years experience building ML systems

Nice to Have

  • Proficiency in RL, with some real-world RL applications
  • Experience with agent stacks: orchestration graphs, tool routers, retrievers, evaluation frameworks, and observability of traces
  • LLM post-training experience: reward-modeling, preference data collection, safety/guardrail integration, structured evals

Founding Member of Technical Staff (MTS)

Bay Area, CA · Full-time · $150k-$220k + equity
Python · Backend · ML Systems · Production

As a Member of Technical Staff (MTS), you'll build production-grade systems that power continuous optimization loops for AI agents: evaluation pipelines, data/trace infrastructure, and the APIs that deploy improved policies. This role blends MLE and backend engineering with a strong customer-empathy component. You'll partner closely with customers and product to translate real-world objectives (accuracy, latency, cost, safety) into measurable signals and reliable services. Note: unlike our Founding AI Engineer role, there's no expectation to read or implement research papers from scratch, but the bar for engineering rigor, ownership, and ambiguity-handling remains high. You'll need to demonstrate clear communication and high ownership in a fast, evolving environment.

What You'll Do

  • Build backend services for training, evals, telemetry, and online policy updates
  • Instrument observability to make optimization loops inspectable and reliable
  • Translate customer KPIs into reward signals, guardrails, and success metrics
  • Collaborate with product & customers to reduce time-to-uplift and land measurable improvements in production
  • Scale distributed workloads for training/serving; improve reliability, cost, and latency over time

What We're Looking For

  • Strong programming in Python; comfortable with backend systems
  • You've shipped production systems and can debug other people's code
  • Fluency with data & infra: containerization (Docker/K8s) and cloud (GCP/AWS)
  • 2+ years experience building ML or backend systems

Nice to Have

  • Experience with RL, reward modeling, LLM evals, or agent stacks (retrievers, tool routers, orchestration)
  • Familiarity with vLLM, LangSmith/Langfuse, SkyRL, Verl, Llama Factory, Agent Lightning
  • LLM post-training exposure (preference data collection, safety/guardrails, structured evals)

AI Research Intern

Bay Area, CA · Internship · Competitive internship compensation
Python · RL · LLMs · Research

Join our team to work on cutting-edge RL research for AI agents. You'll contribute to designing and testing RL-driven optimization loops while learning from experienced researchers and engineers. This is a hands-on role where you'll ship real code and see direct impact.

What You'll Do

  • Assist in designing RL loops for multi-step, tool-using agents
  • Help build and improve training pipelines and evaluation frameworks
  • Run experiments with reward models and policy training
  • Contribute to turning research papers into working prototypes
  • Collaborate with the team on improving agent performance metrics
  • Document experiments and share learnings with the team

What We're Looking For

  • Proficiency in Python for ML/RL work
  • Familiarity with LLMs and how they're used in agent systems
  • Understanding of ML fundamentals and basic RL concepts
  • Ability to read and implement ideas from research papers
  • Strong coding skills and attention to detail
  • Enthusiasm for learning and working in a fast-paced environment
  • Currently pursuing or recently completed a degree in CS, ML, or a related field

Nice to Have

  • Prior internship or project work with RL
  • Experience with LLM APIs or agent frameworks
  • Contributions to open source ML projects
  • Familiarity with evaluation frameworks and observability tools

Chief of Staff

Remote or Hybrid · Full-time · Competitive salary + equity
Strategy · Operations · GTM · Leadership

Serve as a force multiplier to the CEO at the intersection of strategy, operations, and execution. This is a strategic leadership role (not administrative) where you will own initiatives end-to-end and help scale the company. You will work directly with founders who have led teams at Google DeepMind, Amazon Alexa, and Verkada.

What You'll Do

  • Drive strategic priorities and lead cross-functional special projects critical for growth
  • Build and refine operating rhythms (OKRs, board, all-hands) to ensure alignment and speed
  • Support fundraising, investor relations, and key partnerships to accelerate GTM
  • Remove bottlenecks across hiring, finance, and product operations so teams can ship
  • Bridge research, engineering, and business to prevent silos and keep execution tight

What We're Looking For

  • 4+ years in management consulting, investment banking, VC, or high-growth startup operations
  • High agency and bias to action; proactive problem-solver who moves without waiting for permission
  • Structured thinker who can turn ambiguity into actionable plans
  • Clear written and verbal communication; comfortable context switching across domains
  • Versatile operator who can engage on RL/AI topics and financial/operational models

Nice to Have

  • Prior chief of staff / founder associate experience
  • Exposure to AI/ML or technical teams

Founder Associate, Office of the CEO

Remote or Hybrid · Full-time · Competitive salary + equity
Generalist · Research · GTM · Operations

A “learn by doing” role embedded in the Office of the CEO. You’ll wear many hats—market research, sales outreach, product testing, and ops—to help build a category-defining company from the inside out.

What You'll Do

  • Execute critical operational tasks that keep founders focused on product and engineering
  • Drive market intelligence on the AI agent landscape, customers, and competitors
  • Support onboarding of early design partners and gather feedback into the product loop
  • Draft memos, decks, and technical content to articulate the vision
  • Jump into whatever is needed to unblock growth (“no task too small”)

What We're Looking For

  • 0–2 years experience; recent grads or early-career candidates with a track record of excellence
  • Hungry, ambitious, and comfortable with high intensity and ambiguity
  • Fast learner and generalist; able to pick up new tools and concepts quickly
  • Detail-oriented with strong ownership; nothing falls through the cracks
  • Genuine passion for AI and excitement about agent optimization

Nice to Have

  • Prior startup or founder associate experience
  • Exposure to CRM, sales ops, or basic data analysis