Spanner Omni: Google’s Distributed SQL Leaves the Cloud — What It Means for Enterprise Architecture and AI Workloads
For over a decade, Spanner was the database Google would not let you run. It sat inside Google’s private infrastructure, bound to custom hardware, atomic clocks, and a globally distributed filesystem called Colossus. Now Google has released Spanner Omni — a downloadable, self-managed version designed to run on VMs, Kubernetes, on-premises data centers, and even your laptop. This is not a managed cloud service re-packaged. It is an attempt to export one of the most sophisticated distributed databases ever built outside Google’s walls, without the infrastructure that originally made it possible.
TL;DR
- Omni is a downloadable Spanner engine that runs on-prem, multi-cloud, Kubernetes, and local machines.
- It replaces hardware-backed TrueTime and Colossus with software-based equivalents.
- Current public release is a 90-day developer preview, explicitly non-production, missing TLS, at-rest encryption, multi-role auth, backup/restore, and audit logging.
- Supports relational, graph, vector, key-value, full-text, and analytical processing in one engine.
- Direct competitors: CockroachDB self-hosted and YugabyteDB Anywhere for production self-managed distributed SQL; AlloyDB Omni for PostgreSQL-anywhere; Aurora DSQL for managed distributed SQL only.
- Commercial pricing does not exist publicly — contact Google for production licensing.
1. What Spanner Omni Is
Omni is Google’s effort to make the core Spanner engine runnable anywhere. The managed version of Spanner — launched publicly in 2017 — is a globally distributed, strongly consistent SQL database that uses Paxos replication, automatic sharding, and external consistency to serve transactional workloads across regions with single-digit millisecond latency. It is the system behind Google Ads, Google Play, Gmail metadata, and Google Photos. It was built around two pieces of infrastructure Google does not sell: TrueTime, a time-synchronization service backed by atomic clocks and GPS, and Colossus, a distributed filesystem that handles storage under the hood.
Spanner Omni removes both dependencies. Google replaced hardware-backed TrueTime with a software-based equivalent and re-implemented the storage layer so it no longer requires Colossus. The result is a containerized, binary, or Helm-chart distribution that runs on your infrastructure.
What makes this different from “just another PostgreSQL-compatible distributed database” is that Spanner Omni retains the original architectural model:
- Paxos-based synchronous replication across nodes for durability and consensus
- External consistency — the strictest transactional guarantee in production databases, stronger than serializable isolation alone
- Automatic resharding based on load and data size without manual partitioning
- Multi-model support — relational tables, graph structures, vector embeddings, key-value pairs, full-text search, and analytical queries in one engine
- GoogleSQL and PostgreSQL dialects, plus the Spanner Graph Language
The product page confirms Omni runs in single-server, single-zone, multi-zone, and multi-cluster topologies. You can deploy it on virtual machines, container platforms, Kubernetes clusters, on-premises servers, across multiple cloud providers, or locally for development. Google distributes it as container images, Helm charts, and standalone binaries.
Figure 1: Deployment topology options for Spanner Omni
+-----------------------------------------------+
|              Spanner Omni Engine              |
| +-----------+ +-----------+ +---------------+ |
| | GoogleSQL | |PostgreSQL | | Spanner Graph | |
| |  Dialect  | |  Dialect  | |   Language    | |
| +-----+-----+ +-----+-----+ +-------+-------+ |
|       +-------------+---------------+         |
|                     |                         |
|          +----------v----------+              |
|          |  Paxos Replication  |              |
|          |   + Auto-Sharding   |              |
|          +----------+----------+              |
|                     |                         |
|      +--------------v--------------+          |
|      |  Software-based TrueTime +  |          |
|      | Local Storage (No Colossus) |          |
|      +--------------+--------------+          |
+-----------------------------------------------+
     |              |               |
+----v----+   +-----v-----+  +------v-------+
|  Single |   |  Multi-   |  | Kubernetes   |
|  Server |   |  Cluster  |  | / On-Prem    |
|  (Dev)  |   |  (Prod)   |  | / Multi-Cloud|
+---------+   +-----------+  +--------------+
2. Why Google Launched It Now
The timing is strategic, not arbitrary. Three market pressures converged:
Hybrid and multi-cloud resilience: Enterprises running critical systems no longer want single-cloud dependency. Regulatory requirements, cost optimization, and disaster-recovery planning are pushing workloads across providers and into private data centers. A database that only runs in Google Cloud is a strategic liability in procurement conversations.
Application portability: Organizations building long-lived systems want to avoid cloud lock-in at the data layer. If an application is architected around Spanner’s consistency model, moving it to AWS or Azure previously meant rewriting against a different database. Spanner Omni removes that friction.
On-premises modernization: Large enterprises with existing data center investments — financial services, telecom, government — need modern distributed databases without abandoning physical infrastructure they still pay for. Spanner Omni lets Google participate in those environments.
Air-gapped deployments: Defense, intelligence, and critical infrastructure operators run disconnected networks. Managed cloud services are impossible in these environments. A downloadable Spanner is the only way Google can serve this market.
Google’s launch post frames Spanner Omni as internally battle-tested at “millions of queries per second across petabytes of data.” The important caveat: this claim is based on Google’s own internal benchmark tests, not independently validated third-party workloads. It should be treated as indicative of architectural potential, not a guaranteed performance contract for your deployment.
3. Why This Is Technically Hard: TrueTime, Colossus, and the Consistency Problem
The deepest engineering story in Omni is not marketing. It is how Google attempted to replicate Spanner’s external consistency without the two proprietary systems that originally made it work.
External Consistency and Strict Serializability
Spanner’s signature guarantee is external consistency, which is stronger than standard serializable isolation. In a serializable database, transactions appear to execute in some sequential order. In an externally consistent database, that order also matches real time. If transaction A commits before transaction B starts, then every observer will see A’s effects before B’s. There are no anomalies, no causal inversions, no clock-skew windows where stale reads are possible.
This is not a marketing distinction. In financial trading, supply-chain provenance, inventory reservation, and multi-party contract systems, real-time causal ordering prevents categories of bugs that serializable isolation alone cannot.
Classic Spanner achieved external consistency through TrueTime, a time-synchronization service exposed as an API. TrueTime returns time intervals, not single timestamps. It guarantees that the actual current time lies within the returned interval. Spanner uses these intervals to assign commit timestamps and enforce that if one transaction finishes before another starts, the first receives a strictly lower timestamp. This eliminates the need for locking or validation during commit.
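The commit-wait mechanics can be sketched in a few lines. This is a toy illustration, not Google's implementation: `tt_now` simulates TrueTime with a fixed, assumed uncertainty bound, and the commit path waits out that uncertainty so the assigned timestamp is guaranteed to be in the past before the commit is acknowledged.

```python
import time

# Assumed uncertainty bound (7 ms). Classic TrueTime kept this small with
# atomic clocks and GPS; a software equivalent must derive it from clock-sync
# health instead.
EPSILON = 0.007

def tt_now():
    """Simulated TrueTime: returns an interval [earliest, latest]
    guaranteed to contain the actual current time."""
    t = time.time()
    return (t - EPSILON, t + EPSILON)

def commit(txn):
    """Assign a commit timestamp, then 'commit-wait': do not acknowledge
    the commit until the timestamp is guaranteed to be in the past."""
    commit_ts = tt_now()[1]         # pick a timestamp >= the true time
    while tt_now()[0] < commit_ts:  # wait out the uncertainty window
        time.sleep(0.001)
    return commit_ts

# If T2 starts after T1's commit is acknowledged, T2's timestamp is
# strictly larger -- the external-consistency ordering described above.
ts1 = commit("T1")
ts2 = commit("T2")
assert ts2 > ts1
```

The cost of the guarantee is visible here: every commit pays roughly twice the uncertainty bound in latency, which is why the size of that bound matters so much.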
The Original Dependency Stack
Classic Spanner sat on top of two systems Google does not license:
- TrueTime — atomic clocks, GPS receivers, and a synchronization protocol between Google data centers providing bounded uncertainty intervals
- Colossus — a distributed filesystem handling replication, storage layout, and recovery under the database layer
Neither can be packaged into a downloadable product. So Spanner Omni required two replacements:
Software-based TrueTime equivalent: Google does not disclose the full algorithm, but the launch post explicitly states they replaced hardware-backed TrueTime with a software-based equivalent. In distributed systems literature, this typically means combining synchronized clock protocols (like HLC — Hybrid Logical Clocks, or NTP with tight bounds and failure detection) with careful timestamp assignment and commit-wait logic. The key challenge is maintaining bounded clock uncertainty without atomic clocks. If the uncertainty interval grows too large, commit latency increases. If it is underestimated, consistency violations become possible. Google’s claim is that their software equivalent maintains the same guarantees within acceptable latency bounds.
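One building block the literature suggests (and which other distributed SQL systems use) is the Hybrid Logical Clock. The sketch below is a minimal HLC, not Google's undisclosed algorithm: each timestamp pairs a wall-clock reading with a logical counter, so causally related events across nodes stay strictly ordered even when physical clocks disagree.

```python
import time
from dataclasses import dataclass

@dataclass
class HLC:
    """Hybrid Logical Clock: physical time plus a logical counter.
    Timestamps never go backwards, and a message is always ordered
    after its send event regardless of receiver clock skew."""
    wall: float = 0.0
    logical: int = 0

    def now(self):
        """Timestamp for a local event."""
        pt = time.time()
        if pt > self.wall:
            self.wall, self.logical = pt, 0
        else:
            self.logical += 1
        return (self.wall, self.logical)

    def update(self, remote):
        """Merge a timestamp received from another node."""
        pt = time.time()
        m = max(pt, self.wall, remote[0])
        if m == self.wall == remote[0]:
            self.logical = max(self.logical, remote[1]) + 1
        elif m == self.wall:
            self.logical += 1
        elif m == remote[0]:
            self.logical = remote[1] + 1
        else:
            self.logical = 0
        self.wall = m
        return (self.wall, self.logical)

# A receive event is always ordered after the corresponding send:
a, b = HLC(), HLC()
sent = a.now()
recv = b.update(sent)
assert recv > sent  # tuple comparison: causality preserved
```

Note what an HLC alone does not give you: bounded real-time uncertainty. That is the harder part of a TrueTime replacement, and the part Google has not documented.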
Local storage layer: Colossus handled cross-node storage replication and recovery. Spanner Omni replaces this with a storage engine that runs on local disks, network-attached storage, or cloud block volumes. The replication responsibility moves entirely into the Spanner Paxos layer rather than being split between the filesystem and the database.
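With replication consolidated into the Paxos layer, availability follows the usual majority-quorum arithmetic. A minimal illustration of the sizing consequence:

```python
def quorum(replicas: int) -> int:
    """Minimum votes needed for a Paxos write to commit."""
    return replicas // 2 + 1

def tolerated_failures(replicas: int) -> int:
    """Replicas (or zones) that can be lost while writes still commit."""
    return replicas - quorum(replicas)

# A 3-zone deployment tolerates 1 zone loss; 5 zones tolerate 2.
assert (quorum(3), tolerated_failures(3)) == (2, 1)
assert (quorum(5), tolerated_failures(5)) == (3, 2)
```

This is why the multi-zone topologies later in this article use three voters: it is the smallest configuration that survives a zone failure.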
Figure 2: Serializable vs Externally Consistent
Serializable but NOT Externally Consistent:
--------------------------------------------
Time -->
T1 commits (withdraw $100)
T2 starts (read balance)
T2 reads $200 (STALE!)
Bug: T2 started AFTER T1 committed, but saw old value.
Serializable allows this if "logical order" differs.
Externally Consistent (Spanner guarantee):
--------------------------------------------
Time -->
T1 commits (withdraw $100)
T2 starts (read balance)
T2 reads $100 (CORRECT)
Guarantee: If T2 starts after T1 commits, T2 sees T1's result.
No clock skew window. No causal inversion.
4. What Is Actually Available Today in Preview
This is where the product page and download docs are more honest than the launch post. The current public release is explicitly a developer preview with hard restrictions:
- 90-day limitation per deployment
- Non-commercial use — production workloads are not permitted
- No TLS encryption — all connections are unencrypted in the preview build
- No at-rest encryption
- No multi-role authentication — simplified access control only
- No backup/restore
- No audit logging
Google states that commercial production use requires contacting them directly. There is no public pricing. There is no self-service upgrade path from preview to production. If you are evaluating Omni for a real project, these limitations are not footnotes. They are architectural blockers for most regulated or revenue-generating deployments.
For monitoring, Spanner Omni supports logs, traces, statistics tables, Prometheus alerts, and Grafana dashboards. End-to-end tracing and client-side metrics/tracing, however, are not supported. This is narrower than managed Spanner's observability surface.
What works today
- Full GoogleSQL and PostgreSQL dialects
- Relational, graph, vector, key-value, and full-text models
- Single-server to multi-cluster topologies
- Container images, Helm charts, and standalone binaries
- Prometheus and Grafana monitoring
- Built-in vector search (exact KNN and ANN)
Preview limitations
- No TLS (connections are plaintext)
- No at-rest encryption
- No backup/restore
- No audit logging
- No multi-role access control
- 90-day expiration
- Non-commercial use only
- No end-to-end distributed tracing
5. The AI Angle: Multi-Model, Vector, Full-Text, Graph, and Analytics
The launch post and product page both emphasize AI-enabled applications. The specific capabilities, documented in the vector search and multi-model overview pages, are concrete enough to evaluate without resorting to generic “AI-ready” claims.
Built-in Vector Search
Spanner Omni’s vector search is native to the engine, not bolted on via an extension. It supports:
- Exact KNN (K-Nearest Neighbors) for small, high-accuracy retrieval
- ANN (Approximate Nearest Neighbors) for large-scale semantic search
- Documented ANN support up to 1 million vectors at 128 dimensions, with capacity decreasing as dimensions increase
This is a meaningful constraint. If your embedding model outputs 768 or 1,536 dimensions — common for modern text embeddings — the supported vector count drops proportionally. The documentation is explicit about this tradeoff, which is more useful than vague “scalable vector search” language.
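For capacity planning, a rough rule of thumb can be derived from the documented limit. The scaling function below assumes the tradeoff is inversely proportional to dimension count; that is our assumption, since the docs state only that capacity decreases as dimensions increase.

```python
DOCUMENTED_LIMIT = 1_000_000  # vectors at 128 dimensions (from the docs)
BASE_DIMS = 128

def approx_capacity(dims: int) -> int:
    """Planning estimate ONLY: assumes capacity scales inversely with
    dimension count. Validate against your own deployment before
    committing to a consolidated vector store."""
    return DOCUMENTED_LIMIT * BASE_DIMS // dims

assert approx_capacity(128) == 1_000_000
# 768-dim text embeddings  -> roughly 166k vectors under this assumption
# 1536-dim text embeddings -> roughly 83k vectors under this assumption
```

If those estimated ceilings are below your corpus size, a dedicated vector database remains the safer choice until Google publishes firmer limits.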
Hybrid Querying
Omni's multi-model architecture allows queries that combine vector similarity, full-text relevance, graph traversal, and relational filtering in a single query plan. A concrete pattern: find documents semantically similar to a query vector, ranked by text relevance, filtered by a graph relationship (e.g., “authored by someone in my organization”), and constrained by a relational predicate (e.g., “published after 2024”). In a traditional stack, this requires orchestrating Qdrant, Elasticsearch, Neo4j, and PostgreSQL. Spanner Omni claims to execute it in one system with one query.
The multi-model post gives a consolidation example: a customer replaced MongoDB, Neo4j, Elasticsearch, and Qdrant with this engine for an AI workflow. Whether this pattern generalizes depends on workload size, query complexity, and whether the operational simplicity outweighs potential performance specialization losses from dedicated engines.
Analytical Processing
Omni supports operational analytics via columnar capabilities. This is not a full OLAP replacement for BigQuery or Snowflake, but it allows transactional systems to run analytical queries without ETL to a separate warehouse. For AI pipelines, this means training-feature extraction, aggregate monitoring, and embedding-generation batch jobs can run against the same data store that serves real-time inference.
Figure 3: Multi-model query pattern
Single Query in Spanner Omni:
-----------------------------
SELECT doc.id, doc.title, v.score
FROM documents AS doc
JOIN VECTOR_SEARCH(documents.embedding, @query_vec, top_k=10) AS v
  ON doc.id = v.id
WHERE doc.published > '2024-01-01'
  AND FULLTEXT_MATCH(doc.body, @search_terms)
ORDER BY v.score DESC;

Traditional Stack (4 systems):
------------------------------
Qdrant        -> vector similarity
Elasticsearch -> full-text ranking
Neo4j         -> graph traversal
PostgreSQL    -> relational filtering
Application   -> manual join/merge in code
6. Where It Fits: Managed Spanner, AlloyDB Omni, CockroachDB, YugabyteDB, and Aurora DSQL
The best framing for Omni is not “same product everywhere,” but “same architectural family with different operational models.” Google says Omni brings the same core Spanner capabilities and design ideas — Paxos, sharding, strong external consistency, multi-model support — to self-managed environments. But today’s public developer edition is still preview-only, non-production, and missing key security and ops features, while Google recommends certified environments for production workloads.
Omni vs Managed Spanner
Managed Spanner is Google’s managed service with TLS, at-rest encryption, multi-role IAM, backup/restore, audit logging, and global load balancing. It runs only in Google Cloud. Omni is the engine without the operational wrapper. The consistency model, query languages, and data structures are the same. The difference is who operates it, where it runs, and what operational guarantees are included. If you need self-managed or multi-cloud deployment, the downloadable version is your only option. If you need production guarantees today, stick with the managed service.
Omni vs AlloyDB Omni
These are not the same category. AlloyDB Omni is a downloadable PostgreSQL-compatible engine with standard PostgreSQL tools, extensions, backups, vector capabilities, and a columnar engine. It is designed to run anywhere and targets “Postgres anywhere” use cases. Omni targets “globally consistent distributed SQL anywhere.” AlloyDB Omni is the better comparison for PostgreSQL migration and standard relational workloads. Omni is the better comparison for distributed transactional workloads that require external consistency across regions.
Omni vs CockroachDB Self-Hosted
CockroachDB is the cleanest live comparator. Its docs position self-hosted CockroachDB as a full-featured, self-managed deployment for private data centers, hybrid cloud, and multi-cloud use cases, with enterprise features enabled. Its AI docs emphasize native vector support, strong consistency, serializable transactions, and multi-region deployments. CockroachDB has a public production posture today, with backup/restore, TLS, encryption, role-based access, and observable multi-region behavior. If you need a self-managed distributed SQL database in production now, CockroachDB is a mature alternative. Omni has the stronger Spanner lineage but, for now, the more constrained public preview.
Omni vs YugabyteDB Anywhere
YugabyteDB Anywhere is another direct comparator because it is explicitly a self-managed DBaaS for on-prem, public cloud, Kubernetes, and multi-region/multi-cloud deployments. Its current docs show pgvector support with indexing and HNSW in the 2025.2 LTS line, plus xCluster replication for disaster-recovery patterns. YugabyteDB uses Raft (similar to Paxos) for consensus, supports strong consistency, and offers a PostgreSQL-compatible query layer. It is production-ready today with full operational tooling. For the “self-managed distributed SQL plus AI/vector plus hybrid/multicloud” story, YugabyteDB Anywhere is the most mature direct competitor.
Omni vs Aurora DSQL
AWS positions Aurora DSQL as a serverless, PostgreSQL-compatible, strongly consistent distributed SQL service with active-active multi-region behavior and high availability inside AWS. It is a comparison for the distributed SQL market, but not for the specific “downloadable, self-managed, run it in your own data center” angle that makes Spanner Omni notable. Aurora DSQL is managed-only and AWS-only. Omni is self-managed and infrastructure-agnostic. They solve different operational constraints.
Omni vs Azure Cosmos DB for PostgreSQL
Microsoft’s own docs say Azure Cosmos DB for PostgreSQL is on a retirement path and is no longer recommended for new projects. It can be mentioned as historical market context, but it should not be treated as a forward-looking benchmark.
| Product | Deployment | Consistency | Production Ready | Vector / AI | Best For |
|---|---|---|---|---|---|
| Managed Spanner | Google Cloud only | External consistency | Yes | Built-in | Global GCP workloads |
| Spanner Omni | Any infrastructure | External consistency | Preview only | Built-in | Future hybrid/multicloud |
| AlloyDB Omni | Any infrastructure | Serializable | Yes | pgvector | PostgreSQL migration |
| CockroachDB | Self-hosted/cloud | Serializable | Yes | Native vector | Production distributed SQL |
| YugabyteDB | Self-hosted/K8s | Strong (Raft) | Yes | pgvector + HNSW | Multi-cloud Kubernetes |
| Aurora DSQL | AWS only | Strong | Yes | pgvector | Serverless AWS SQL |
7. Production Caveats and Unanswered Questions
The public preview should not be mistaken for production-ready software; the docs directly contradict that reading. Nor should the headline performance claim be treated as an objective benchmark comparison, because the public material states the number but not the benchmark methodology, workload mix, or any third-party validation.
Key unanswered questions that any serious evaluation should investigate:
- Clock synchronization accuracy: How does the software-based TrueTime equivalent perform under NTP drift, VM clock jitter, and cross-region latency? Google has not published clock-uncertainty bounds for the Omni implementation.
- Failover behavior: Managed Spanner handles regional failover automatically. In a self-managed multi-cluster deployment, failover is operator responsibility. The docs do not yet detail recommended runbooks.
- Storage performance: Colossus provided erasure-coded, globally replicated storage. Local disks or cloud block volumes have different failure modes. How does Spanner Omni handle disk failure, network partition, and split-brain scenarios without Colossus?
- Upgrade path: Will preview deployments migrate to production licensing, or require rebuild? There is no documented upgrade procedure.
- Licensing model: No public pricing exists. Enterprise database procurement requires predictable cost structures. The lack of public pricing is a practical barrier to adoption planning.
8. Who Should Care Right Now
Omni is relevant to three audiences today, each with different actionability:
Enterprise architects planning 2027–2028 infrastructure: If your organization is evaluating multi-cloud or hybrid strategies and needs a consistent transactional layer across environments, Omni is worth tracking. The preview allows you to test data models, query patterns, and deployment topologies before a production release. The 90-day limitation means you should plan iterative evaluation cycles rather than a single long-term proof of concept.
AI engineering teams building retrieval-augmented generation (RAG) systems: The built-in vector search, hybrid querying, and unified transactional + retrieval architecture could reduce system complexity. But the 1 million vector limit at 128 dimensions and lack of TLS mean it is suitable for prototyping, not for customer-facing inference pipelines. Use it to validate whether the multi-model query performance meets your latency requirements before committing to a consolidated stack.
Google Cloud customers with Spanner-dependent applications: If you are already running managed Spanner and need to extend to on-prem or edge environments, the preview is the first chance to test portability. The PostgreSQL dialect support may simplify migration paths from existing Postgres workloads. But the missing backup/restore and encryption features mean you should not treat this as a production extension yet.
“Omni is not a product you deploy today. It is a signal about where Google is taking distributed SQL — and a credible preview that justifies early evaluation, not early adoption.”
9. Getting Started: Spanner Omni Implementation Guide
This section walks through the practical steps of evaluating Spanner Omni in your environment, from download to a working multi-model query. Every step is based on the current public preview documentation and reflects its limitations honestly.
Figure 4: Spanner Omni evaluation journey — from download to production readiness assessment
+------------------+ +------------------+ +------------------+
| PHASE 1 | | PHASE 2 | | PHASE 3 |
| Download & | | Data Model | | Production |
| Deploy | | & Query Test | | Readiness |
+--------+---------+ +--------+---------+ +--------+---------+
| | |
v v v
 1. Get preview access    4. Create schema         9. Security audit
 2. Choose deployment     5. Load vector data     10. Backup/restore
    topology              6. Run hybrid queries       gap analysis
 3. Configure & start     7. Benchmark latency    11. Licensing inquiry
    Spanner Omni          8. Test failover        12. Go / No-Go decision
Time: Day 1-3 Time: Day 4-14 Time: Day 15-30
Status: Anyone can do Status: Engineers only Status: Architects + Procurement
Phase 1: Download and Deploy
Omni is distributed through Google’s download portal. You need a Google Cloud account to access the preview. The developer edition is free but expires after 90 days and is non-commercial.
Step 1: Choose Your Deployment Format
Figure 5: Deployment format decision tree
What infrastructure do you have for deployment?
|
+---------------+---------------+
| | |
Kubernetes? Docker only? Bare metal /
| | VM only?
v v |
Helm Chart Container Standalone
deployment image binary
| | |
v v v
helm install docker run ./spanner-omni
-f values.yaml -p 9010:9010 --port 9010
spanner-omni spanner-omni --data-dir /data
./chart/ :latest
Common to all:
- Minimum 4 CPU, 16 GB RAM per node
- SSD storage recommended
- NTP must be configured (TrueTime depends on it)
- Port 9010 (gRPC), Port 9011 (HTTP/REST)
Step 2: Single-Server vs Multi-Node Topology
Figure 6: Topology selection guide
EVALUATION USE (single-server):
================================
+------------------+
| Developer |
| Laptop / VM |
| +-------------+ |
| | Spanner Omni| |
| | (all-in-one)| |
| | | |
| | - Leader | |
| | - Replica | |
| | - Storage | |
| +-------------+ |
+------------------+
Good for: Schema design, query testing,
vector search prototyping, learning SQL dialects
PRODUCTION-LIKE (multi-zone):
================================
Zone A Zone B Zone C
+---------------+ +---------------+ +---------------+
| +-----------+ | | +-----------+ | | +-----------+ |
| | Leader |<------>| Replica |<------>| Replica | |
| | (voter) | | | | (voter) | | | | (voter) | |
| +-----------+ | | +-----------+ | | +-----------+ |
| | Storage | | | | Storage | | | | Storage | |
| +-----------+ | | +-----------+ | | +-----------+ |
+---------------+ +---------------+ +---------------+
Good for: Consistency testing, failover simulation,
latency benchmarking across zones
MULTI-CLUSTER (hybrid/multicloud):
================================
+-- Cluster 1 (GCP) --+ +-- Cluster 2 (On-Prem) --+
| Zone A Zone B | | Zone C Zone D |
| [Leader] [Replica] | | [Replica] [Replica] |
+----------------------+ +-------------------------+
| |
+---- Paxos across WAN ----+
Good for: Cross-cloud consistency validation,
air-gapped scenario testing (disconnected mode)
Step 3: Critical Configuration Checklist
Must configure before starting
- NTP synchronization — the software TrueTime equivalent requires accurate clocks. Configure NTP with ntpd or chronyd pointing to multiple upstream servers. Clock drift must stay under 10ms.
- Storage path — point --data-dir to an SSD-backed volume. HDD storage will cause commit latency spikes.
- Port firewall — open 9010 (gRPC) and 9011 (HTTP) between all nodes. Multi-zone requires inter-zone connectivity.
- Resource limits — set container memory limits to at least 16 GB. Spanner Omni’s Paxos layer is memory-intensive.
Preview gotchas to watch for
- No TLS — All traffic is plaintext. Do not expose ports to the internet. Use a VPN or private network for multi-node setups.
- 90-day expiry — Set a calendar reminder. The instance will stop accepting writes after 90 days.
- No backup — there is no built-in backup/restore. Use filesystem snapshots or pg_dump via the PostgreSQL dialect as a workaround.
- Single admin role — no role-based access. Anyone with the admin key can do anything. Guard credentials carefully.
Phase 2: Data Modeling and Query Testing
Step 4: Create a Multi-Model Schema
Omni's multi-model support means you define relational tables, vector columns, graph edges, and full-text indexes in one schema. Here is a concrete example for an AI document retrieval system:
Figure 7: Multi-model schema for AI document retrieval
RELATIONAL TABLE:
=================
CREATE TABLE documents (
  id INT64 NOT NULL,
  title STRING(500),
  body STRING(MAX),
  author_id INT64,
  published TIMESTAMP,
  category STRING(100)
) PRIMARY KEY (id);

VECTOR COLUMN (added to same table):
====================================
ALTER TABLE documents
  ADD COLUMN embedding ARRAY<FLOAT32>;

FULL-TEXT INDEX:
================
CREATE SEARCH INDEX docs_body_index
  ON documents(body)
  OPTIONS (update_deadline_seconds = 300);

GRAPH SCHEMA (knowledge graph):
===============================
CREATE TABLE authors (
  id INT64 NOT NULL,
  name STRING(200),
  org_id INT64
) PRIMARY KEY (id);

CREATE TABLE citations (
  from_doc INT64 NOT NULL,
  to_doc INT64 NOT NULL,
  type STRING(50)  -- 'references', 'contradicts', 'extends'
) PRIMARY KEY (from_doc, to_doc),
  INTERLEAVE IN PARENT documents ON DELETE CASCADE;

PROPERTY GRAPH:
===============
CREATE PROPERTY GRAPH doc_graph
  NODE TABLES (documents, authors)
  EDGE TABLES (
    citations
      SOURCE KEY (from_doc) REFERENCES documents(id)
      DESTINATION KEY (to_doc) REFERENCES documents(id)
  );
Step 5: Load Vector Data and Run Hybrid Queries
Figure 8: Query execution flow — how Spanner Omni processes a hybrid vector + text + graph query
Example query: "Find recent AI papers about distributed consistency
that cite the Spanner 2012 paper"
+---------------------------------------------------------------+
| SPANNER OMNI QUERY ENGINE |
| |
| 1. VECTOR SEARCH |
| +------------------+ |
| | Embed query | ANN index scan |
| | text -> vector | Top 100 candidates |
| +--------+--------+ |
| | |
| 2. FULL-TEXT FILTER |
| +--------v--------+ |
| | FULLTEXT_MATCH | Re-rank by text relevance |
| | on body column | Narrow to 50 results |
| +--------+--------+ |
| | |
| 3. GRAPH TRAVERSAL |
| +--------v--------+ |
| | Graph query on | Follow citation edges |
| | doc_graph | Find papers citing Spanner 2012 |
| +--------+--------+ Narrow to 12 results |
| | |
| 4. RELATIONAL FILTER |
| +--------v--------+ |
| | WHERE published | Filter by date |
| | > '2024-01-01' | Final: 8 results |
| +--------+--------+ |
| | |
| v |
| +------------------+ |
| | Result set: | |
| | 8 documents with | |
| | score + path + | |
| | metadata | |
| +------------------+ |
+---------------------------------------------------------------+
In a traditional stack, steps 1-4 run on 4 different systems
and the application must merge results in code.
The engine executes this as a single query plan.
Step 6: Benchmark What Matters
Do not benchmark generic CRUD. Test the patterns that differentiate it:
Figure 9: Benchmark matrix for Spanner Omni evaluation
+-------------------------+----------------+----------------+----------------+
| Benchmark               | What to measure| Target         | How to test    |
+-------------------------+----------------+----------------+----------------+
| Write latency           | p50, p99       | < 10ms p99     | INSERT loop    |
| (single-zone)           | commit time    |                | 10K writes     |
+-------------------------+----------------+----------------+----------------+
| Cross-zone write        | p99 with       | < 30ms p99     | Write from     |
| latency                 | Paxos round    |                | Zone A, read   |
|                         | trip           |                | from Zone B    |
+-------------------------+----------------+----------------+----------------+
| Vector ANN search       | Recall@10      | > 95%          | Known dataset  |
| (1M vectors, 128 dim)   | + latency      | < 50ms         | with ground    |
|                         |                |                | truth labels   |
+-------------------------+----------------+----------------+----------------+
| Hybrid query            | End-to-end     | < 100ms        | Vector + text  |
| (vector + text + graph) | latency        |                | + graph filter |
+-------------------------+----------------+----------------+----------------+
| Failover recovery       | Time to new    | < 30 seconds   | Kill leader    |
|                         | leader elected |                | node, measure  |
+-------------------------+----------------+----------------+----------------+
| Auto-resharding         | Time to        | < 5 minutes    | Bulk load data |
|                         | rebalance      |                | until split    |
+-------------------------+----------------+----------------+----------------+

IMPORTANT: Preview builds may not meet these targets. Document actual numbers
for your hardware and compare against CockroachDB / YugabyteDB on the same
infrastructure.
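A minimal harness for the latency rows in this matrix might look like the sketch below. The workload lambda is a stand-in you would replace with a real INSERT or query through your client driver; the percentile function is a simple nearest-rank implementation in pure stdlib Python.

```python
import time

def percentile(samples, p):
    """Nearest-rank percentile over a list of latency samples."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

def benchmark(op, n=10_000):
    """Run op() n times and report p50/p99 latency in milliseconds."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        op()  # replace with a real database call for actual measurements
        samples.append((time.perf_counter() - start) * 1000)
    return {"p50": percentile(samples, 50), "p99": percentile(samples, 99)}

# Stand-in workload; swap in an INSERT via your client driver.
result = benchmark(lambda: sum(range(100)), n=1000)
assert result["p99"] >= result["p50"] >= 0
```

Run the same harness against CockroachDB or YugabyteDB on identical hardware so the comparison isolates the database rather than the infrastructure.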
Phase 3: Production Readiness Assessment
Step 7: Security Gap Analysis
Figure 10: Security feature comparison — preview vs production requirements
Figure 11: Decision tree — should your organization adopt Spanner Omni now?
Need distributed SQL in production NOW?
                |
    +-----------+-----------+
    |                       |
   YES                      NO
    |                       |
    v                       v
Use CockroachDB or      Need self-managed Spanner
YugabyteDB Anywhere.    specifically?
They have TLS, backup,          |
encryption, RBAC today.   +-----+-----+
                          |           |
                         YES          NO
                          |           |
                          v           v
               Can you wait for    Use managed Spanner
               production release? on GCP or AlloyDB
                          |        Omni for Postgres.
                    +-----+-----+
                    |           |
                   YES          NO
                    |           |
                    v           v
         Evaluate preview    Use CockroachDB /
         for architecture    YugabyteDB now,
         planning. Track     plan Spanner Omni
         Google's release    migration later.
         roadmap.

RECOMMENDATION FOR MOST ORGANIZATIONS:
--------------------------------------
1. NOW:      CockroachDB or YugabyteDB for production
2. PARALLEL: Evaluate Spanner Omni preview for 2027 planning
3. LATER:    Migrate when Google ships production-certified build