Spanner Omni: Google’s Distributed SQL Leaves the Cloud — What It Means for Enterprise Architecture and AI Workloads
For over a decade, Spanner was the database Google would not let you run. It sat inside Google’s private infrastructure, bound to custom hardware, atomic clocks, and a globally distributed filesystem called Colossus. Now Google has released Spanner Omni — a downloadable, self-managed version designed to run on VMs, Kubernetes, on-premises data centers, and even your laptop. This is not a managed cloud service re-packaged. It is an attempt to export one of the most sophisticated distributed databases ever built outside Google’s walls, without the infrastructure that originally made it possible.
TL;DR
- Omni is a downloadable Spanner engine that runs on-prem, multi-cloud, Kubernetes, and local machines.
- It replaces hardware-backed TrueTime and Colossus with software-based equivalents.
- Current public release is a 90-day developer preview, explicitly non-production, missing TLS, at-rest encryption, multi-role auth, backup/restore, and audit logging.
- Supports relational, graph, vector, key-value, full-text, and analytical processing in one engine.
- Direct competitors: CockroachDB self-hosted and YugabyteDB Anywhere for production self-managed distributed SQL; AlloyDB Omni for PostgreSQL-anywhere; Aurora DSQL for managed distributed SQL only.
- Commercial pricing does not exist publicly — contact Google for production licensing.
1. What Spanner Omni Is
Omni is Google’s effort to make the core Spanner engine runnable anywhere. The managed version of Spanner — launched publicly in 2017 — is a globally distributed, strongly consistent SQL database that uses Paxos replication, automatic sharding, and external consistency to serve transactional workloads across regions with single-digit millisecond latency. It is the system behind Google Ads, Google Play, Gmail metadata, and Google Photos. It was built around two pieces of infrastructure Google does not sell: TrueTime, a time-synchronization service backed by atomic clocks and GPS, and Colossus, a distributed filesystem that handles storage under the hood.
Spanner Omni removes both dependencies. Google replaced hardware-backed TrueTime with a software-based equivalent and re-implemented the storage layer so it no longer requires Colossus. The result is a containerized, binary, or Helm-chart distribution that runs on your infrastructure.
What makes this different from “just another PostgreSQL-compatible distributed database” is that Spanner Omni retains the original architectural model:
- Paxos-based synchronous replication across nodes for durability and consensus
- External consistency — the strictest transactional guarantee in production databases, stronger than serializable isolation alone
- Automatic resharding based on load and data size without manual partitioning
- Multi-model support — relational tables, graph structures, vector embeddings, key-value pairs, full-text search, and analytical queries in one engine
- GoogleSQL and PostgreSQL dialects, plus the Spanner Graph Language
The product page confirms Omni runs in single-server, single-zone, multi-zone, and multi-cluster topologies. You can deploy it on virtual machines, container platforms, Kubernetes clusters, on-premises servers, across multiple cloud providers, or locally for development. Google distributes it as container images, Helm charts, and standalone binaries.
Figure 1: Deployment topology options for Spanner Omni
+-----------------------------------------------+
|              Spanner Omni Engine              |
| +-----------+ +-----------+ +---------------+ |
| | GoogleSQL | |PostgreSQL | | Spanner Graph | |
| |  Dialect  | |  Dialect  | |   Language    | |
| +-----+-----+ +-----+-----+ +-------+-------+ |
|       +-------------+---------------+         |
|                     |                         |
|          +----------v----------+              |
|          |  Paxos Replication  |              |
|          |   + Auto-Sharding   |              |
|          +----------+----------+              |
|                     |                         |
|      +--------------v--------------+          |
|      |  Software-based TrueTime +  |          |
|      | Local Storage (No Colossus) |          |
|      +--------------+--------------+          |
+-----------------------------------------------+
     |              |               |
+----v----+   +-----v-----+  +------v-------+
|  Single |   |  Multi-   |  | Kubernetes   |
|  Server |   |  Cluster  |  | / On-Prem    |
|  (Dev)  |   |  (Prod)   |  | / Multi-Cloud|
+---------+   +-----------+  +--------------+
2. Why Google Launched It Now
The timing is strategic, not arbitrary. Three market pressures converged:
Hybrid and multi-cloud resilience: Enterprises running critical systems no longer want single-cloud dependency. Regulatory requirements, cost optimization, and disaster-recovery planning are pushing workloads across providers and into private data centers. A database that only runs in Google Cloud is a strategic liability in procurement conversations.
Application portability: Organizations building long-lived systems want to avoid cloud lock-in at the data layer. If an application is architected around Spanner’s consistency model, moving it to AWS or Azure previously meant rewriting against a different database. Spanner Omni removes that friction.
On-premises modernization: Large enterprises with existing data center investments — financial services, telecom, government — need modern distributed databases without abandoning physical infrastructure they still pay for. Spanner Omni lets Google participate in those environments.
Air-gapped deployments: Defense, intelligence, and critical infrastructure operators run disconnected networks. Managed cloud services are impossible in these environments. A downloadable Spanner is the only way Google can serve this market.
Google’s launch post frames Spanner Omni as internally battle-tested at “millions of queries per second across petabytes of data.” The important caveat: this claim is based on Google’s own internal benchmark tests, not independently validated third-party workloads. It should be treated as indicative of architectural potential, not a guaranteed performance contract for your deployment.
3. Why This Is Technically Hard: TrueTime, Colossus, and the Consistency Problem
The deepest engineering story in Omni is not marketing. It is how Google attempted to replicate Spanner’s external consistency without the two proprietary systems that originally made it work.
External Consistency and Strict Serializability
Spanner’s signature guarantee is external consistency, which is stronger than standard serializable isolation. In a serializable database, transactions appear to execute in some sequential order. In an externally consistent database, that order also matches real time. If transaction A commits before transaction B starts, then every observer will see A’s effects before B’s. There are no anomalies, no causal inversions, no clock-skew windows where stale reads are possible.
This is not a marketing distinction. In financial trading, supply-chain provenance, inventory reservation, and multi-party contract systems, real-time causal ordering prevents categories of bugs that serializable isolation alone cannot.
Classic Spanner achieved external consistency through TrueTime, a time-synchronization service exposed as an API. TrueTime returns time intervals, not single timestamps. It guarantees that the actual current time lies within the returned interval. Spanner uses these intervals to assign commit timestamps and enforce that if one transaction finishes before another starts, the first receives a strictly lower timestamp. This eliminates the need for locking or validation during commit.
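The commit-wait mechanics can be sketched in a few lines. This is a toy illustration, not Google's implementation: `tt_now` simulates TrueTime with a fixed, assumed uncertainty bound, and the commit path waits out that uncertainty so the assigned timestamp is guaranteed to be in the past before the commit is acknowledged.

```python
import time

# Assumed uncertainty bound (7 ms). Classic TrueTime kept this small with
# atomic clocks and GPS; a software equivalent must derive it from clock-sync
# health instead.
EPSILON = 0.007

def tt_now():
    """Simulated TrueTime: returns an interval [earliest, latest]
    guaranteed to contain the actual current time."""
    t = time.time()
    return (t - EPSILON, t + EPSILON)

def commit(txn):
    """Assign a commit timestamp, then 'commit-wait': do not acknowledge
    the commit until the timestamp is guaranteed to be in the past."""
    commit_ts = tt_now()[1]         # pick a timestamp >= the true time
    while tt_now()[0] < commit_ts:  # wait out the uncertainty window
        time.sleep(0.001)
    return commit_ts

# If T2 starts after T1's commit is acknowledged, T2's timestamp is
# strictly larger -- the external-consistency ordering described above.
ts1 = commit("T1")
ts2 = commit("T2")
assert ts2 > ts1
```

The cost of the guarantee is visible here: every commit pays roughly twice the uncertainty bound in latency, which is why the size of that bound matters so much.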
The Original Dependency Stack
Classic Spanner sat on top of two systems Google does not license:
- TrueTime — atomic clocks, GPS receivers, and a synchronization protocol between Google data centers providing bounded uncertainty intervals
- Colossus — a distributed filesystem handling replication, storage layout, and recovery under the database layer
Neither can be packaged into a downloadable product. So Spanner Omni required two replacements:
Software-based TrueTime equivalent: Google does not disclose the full algorithm, but the launch post explicitly states they replaced hardware-backed TrueTime with a software-based equivalent. In distributed systems literature, this typically means combining synchronized clock protocols (like HLC — Hybrid Logical Clocks, or NTP with tight bounds and failure detection) with careful timestamp assignment and commit-wait logic. The key challenge is maintaining bounded clock uncertainty without atomic clocks. If the uncertainty interval grows too large, commit latency increases. If it is underestimated, consistency violations become possible. Google’s claim is that their software equivalent maintains the same guarantees within acceptable latency bounds.
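One building block the literature suggests (and which other distributed SQL systems use) is the Hybrid Logical Clock. The sketch below is a minimal HLC, not Google's undisclosed algorithm: each timestamp pairs a wall-clock reading with a logical counter, so causally related events across nodes stay strictly ordered even when physical clocks disagree.

```python
import time
from dataclasses import dataclass

@dataclass
class HLC:
    """Hybrid Logical Clock: physical time plus a logical counter.
    Timestamps never go backwards, and a message is always ordered
    after its send event regardless of receiver clock skew."""
    wall: float = 0.0
    logical: int = 0

    def now(self):
        """Timestamp for a local event."""
        pt = time.time()
        if pt > self.wall:
            self.wall, self.logical = pt, 0
        else:
            self.logical += 1
        return (self.wall, self.logical)

    def update(self, remote):
        """Merge a timestamp received from another node."""
        pt = time.time()
        m = max(pt, self.wall, remote[0])
        if m == self.wall == remote[0]:
            self.logical = max(self.logical, remote[1]) + 1
        elif m == self.wall:
            self.logical += 1
        elif m == remote[0]:
            self.logical = remote[1] + 1
        else:
            self.logical = 0
        self.wall = m
        return (self.wall, self.logical)

# A receive event is always ordered after the corresponding send:
a, b = HLC(), HLC()
sent = a.now()
recv = b.update(sent)
assert recv > sent  # tuple comparison: causality preserved
```

Note what an HLC alone does not give you: bounded real-time uncertainty. That is the harder part of a TrueTime replacement, and the part Google has not documented.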
Local storage layer: Colossus handled cross-node storage replication and recovery. Spanner Omni replaces this with a storage engine that runs on local disks, network-attached storage, or cloud block volumes. The replication responsibility moves entirely into the Spanner Paxos layer rather than being split between the filesystem and the database.
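With replication consolidated into the Paxos layer, availability follows the usual majority-quorum arithmetic. A minimal illustration of the sizing consequence:

```python
def quorum(replicas: int) -> int:
    """Minimum votes needed for a Paxos write to commit."""
    return replicas // 2 + 1

def tolerated_failures(replicas: int) -> int:
    """Replicas (or zones) that can be lost while writes still commit."""
    return replicas - quorum(replicas)

# A 3-zone deployment tolerates 1 zone loss; 5 zones tolerate 2.
assert (quorum(3), tolerated_failures(3)) == (2, 1)
assert (quorum(5), tolerated_failures(5)) == (3, 2)
```

This is why the multi-zone topologies later in this article use three voters: it is the smallest configuration that survives a zone failure.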
Figure 2: Serializable vs Externally Consistent
Serializable but NOT Externally Consistent:
--------------------------------------------
Time -->
T1 commits (withdraw $100)
T2 starts (read balance)
T2 reads $200 (STALE!)
Bug: T2 started AFTER T1 committed, but saw old value.
Serializable allows this if "logical order" differs.
Externally Consistent (Spanner guarantee):
--------------------------------------------
Time -->
T1 commits (withdraw $100)
T2 starts (read balance)
T2 reads $100 (CORRECT)
Guarantee: If T2 starts after T1 commits, T2 sees T1's result.
No clock skew window. No causal inversion.
4. What Is Actually Available Today in Preview
This is where the product page and download docs are more honest than the launch post. The current public release is explicitly a developer preview with hard restrictions:
- 90-day limitation per deployment
- Non-commercial use — production workloads are not permitted
- No TLS encryption — all connections are unencrypted in the preview build
- No at-rest encryption
- No multi-role authentication — simplified access control only
- No backup/restore
- No audit logging
Google states that commercial production use requires contacting them directly. There is no public pricing. There is no self-service upgrade path from preview to production. If you are evaluating Omni for a real project, these limitations are not footnotes. They are architectural blockers for most regulated or revenue-generating deployments.
For monitoring, Spanner Omni supports logs, traces, statistics tables, Prometheus alerts, and Grafana dashboards. End-to-end tracing and client-side metrics/tracing, however, are not supported. This is narrower than managed Spanner's observability surface.
What works today
- Full GoogleSQL and PostgreSQL dialects
- Relational, graph, vector, key-value, and full-text models
- Single-server to multi-cluster topologies
- Container images, Helm charts, and standalone binaries
- Prometheus and Grafana monitoring
- Built-in vector search (exact KNN and ANN)
Preview limitations
- No TLS (connections are plaintext)
- No at-rest encryption
- No backup/restore
- No audit logging
- No multi-role access control
- 90-day expiration
- Non-commercial use only
- No end-to-end distributed tracing
5. The AI Angle: Multi-Model, Vector, Full-Text, Graph, and Analytics
The launch post and product page both emphasize AI-enabled applications. The specific capabilities, documented in the vector search and multi-model overview pages, are concrete enough to evaluate without resorting to generic “AI-ready” claims.
Built-in Vector Search
Spanner Omni’s vector search is native to the engine, not bolted on via an extension. It supports:
- Exact KNN (K-Nearest Neighbors) for small, high-accuracy retrieval
- ANN (Approximate Nearest Neighbors) for large-scale semantic search
- Documented ANN support up to 1 million vectors at 128 dimensions, with capacity decreasing as dimensions increase
This is a meaningful constraint. If your embedding model outputs 768 or 1,536 dimensions — common for modern text embeddings — the supported vector count drops proportionally. The documentation is explicit about this tradeoff, which is more useful than vague “scalable vector search” language.
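For capacity planning, a rough rule of thumb can be derived from the documented limit. The scaling function below assumes the tradeoff is inversely proportional to dimension count; that is our assumption, since the docs state only that capacity decreases as dimensions increase.

```python
DOCUMENTED_LIMIT = 1_000_000  # vectors at 128 dimensions (from the docs)
BASE_DIMS = 128

def approx_capacity(dims: int) -> int:
    """Planning estimate ONLY: assumes capacity scales inversely with
    dimension count. Validate against your own deployment before
    committing to a consolidated vector store."""
    return DOCUMENTED_LIMIT * BASE_DIMS // dims

assert approx_capacity(128) == 1_000_000
# 768-dim text embeddings  -> roughly 166k vectors under this assumption
# 1536-dim text embeddings -> roughly 83k vectors under this assumption
```

If those estimated ceilings are below your corpus size, a dedicated vector database remains the safer choice until Google publishes firmer limits.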
Hybrid Querying
Omni's multi-model architecture allows queries that combine vector similarity, full-text relevance, graph traversal, and relational filtering in a single query plan. A concrete pattern: find documents semantically similar to a query vector, ranked by text relevance, filtered by a graph relationship (e.g., “authored by someone in my organization”), and constrained by a relational predicate (e.g., “published after 2024”). In a traditional stack, this requires orchestrating Qdrant, Elasticsearch, Neo4j, and PostgreSQL. Spanner Omni claims to execute it in one system with one query.
The multi-model post gives a consolidation example: a customer replaced MongoDB, Neo4j, Elasticsearch, and Qdrant with this engine for an AI workflow. Whether this pattern generalizes depends on workload size, query complexity, and whether the operational simplicity outweighs potential performance specialization losses from dedicated engines.
Analytical Processing
Omni supports operational analytics via columnar capabilities. This is not a full OLAP replacement for BigQuery or Snowflake, but it allows transactional systems to run analytical queries without ETL to a separate warehouse. For AI pipelines, this means training-feature extraction, aggregate monitoring, and embedding-generation batch jobs can run against the same data store that serves real-time inference.
Figure 3: Multi-model query pattern
Single Query in Spanner Omni:
-----------------------------
SELECT doc.id, doc.title, v.score
FROM documents AS doc
JOIN VECTOR_SEARCH(documents.embedding, @query_vec, top_k=10) AS v
  ON doc.id = v.id
WHERE doc.published > '2024-01-01'
  AND FULLTEXT_MATCH(doc.body, @search_terms)
ORDER BY v.score DESC;

Traditional Stack (4 systems):
------------------------------
Qdrant        -> vector similarity
Elasticsearch -> full-text ranking
Neo4j         -> graph traversal
PostgreSQL    -> relational filtering
Application   -> manual join/merge in code
6. Where It Fits: Managed Spanner, AlloyDB Omni, CockroachDB, YugabyteDB, and Aurora DSQL
The best framing for Omni is not “same product everywhere,” but “same architectural family with different operational models.” Google says Omni brings the same core Spanner capabilities and design ideas — Paxos, sharding, strong external consistency, multi-model support — to self-managed environments. But today’s public developer edition is still preview-only, non-production, and missing key security and ops features, while Google recommends certified environments for production workloads.
Omni vs Managed Spanner
Managed Spanner is Google’s managed service with TLS, at-rest encryption, multi-role IAM, backup/restore, audit logging, and global load balancing. It runs only in Google Cloud. Omni is the engine without the operational wrapper. The consistency model, query languages, and data structures are the same. The difference is who operates it, where it runs, and what operational guarantees are included. If you need self-managed or multi-cloud deployment, the downloadable version is your only option. If you need production guarantees today, stick with the managed service.
Omni vs AlloyDB Omni
These are not the same category. AlloyDB Omni is a downloadable PostgreSQL-compatible engine with standard PostgreSQL tools, extensions, backups, vector capabilities, and a columnar engine. It is designed to run anywhere and targets “Postgres anywhere” use cases. Omni targets “globally consistent distributed SQL anywhere.” AlloyDB Omni is the better comparison for PostgreSQL migration and standard relational workloads. Omni is the better comparison for distributed transactional workloads that require external consistency across regions.
Omni vs CockroachDB Self-Hosted
CockroachDB is the cleanest live comparator. Its docs position self-hosted CockroachDB as a full-featured, self-managed deployment for private data centers, hybrid cloud, and multi-cloud use cases, with enterprise features enabled. Its AI docs emphasize native vector support, strong consistency, serializable transactions, and multi-region deployments. CockroachDB has a public production posture today, with backup/restore, TLS, encryption, role-based access, and observable multi-region behavior. If you need a self-managed distributed SQL database in production now, CockroachDB is a mature alternative. Omni has the stronger Spanner lineage but, for now, the more constrained public preview.
Omni vs YugabyteDB Anywhere
YugabyteDB Anywhere is another direct comparator because it is explicitly a self-managed DBaaS for on-prem, public cloud, Kubernetes, and multi-region/multi-cloud deployments. Its current docs show pgvector support with indexing and HNSW in the 2025.2 LTS line, plus xCluster replication for disaster-recovery patterns. YugabyteDB uses Raft (similar to Paxos) for consensus, supports strong consistency, and offers a PostgreSQL-compatible query layer. It is production-ready today with full operational tooling. For the “self-managed distributed SQL plus AI/vector plus hybrid/multicloud” story, YugabyteDB Anywhere is the most mature direct competitor.
Omni vs Aurora DSQL
AWS positions Aurora DSQL as a serverless, PostgreSQL-compatible, strongly consistent distributed SQL service with active-active multi-region behavior and high availability inside AWS. It is a comparison for the distributed SQL market, but not for the specific “downloadable, self-managed, run it in your own data center” angle that makes Spanner Omni notable. Aurora DSQL is managed-only and AWS-only. Omni is self-managed and infrastructure-agnostic. They solve different operational constraints.
Omni vs Azure Cosmos DB for PostgreSQL
Microsoft’s own docs say Azure Cosmos DB for PostgreSQL is on a retirement path and is no longer recommended for new projects. It can be mentioned as historical market context, but it should not be treated as a forward-looking benchmark.
| Product | Deployment | Consistency | Production Ready | Vector / AI | Best For |
|---|---|---|---|---|---|
| Managed Spanner | Google Cloud only | External consistency | Yes | Built-in | Global GCP workloads |
| Spanner Omni | Any infrastructure | External consistency | Preview only | Built-in | Future hybrid/multicloud |
| AlloyDB Omni | Any infrastructure | Serializable | Yes | pgvector | PostgreSQL migration |
| CockroachDB | Self-hosted/cloud | Serializable | Yes | Native vector | Production distributed SQL |
| YugabyteDB | Self-hosted/K8s | Strong (Raft) | Yes | pgvector + HNSW | Multi-cloud Kubernetes |
| Aurora DSQL | AWS only | Strong | Yes | pgvector | Serverless AWS SQL |
7. Production Caveats and Unanswered Questions
The public preview should not be mistaken for production-ready software; the docs directly contradict that reading. Nor should the headline performance claim be treated as an objective benchmark comparison, because the public material states the number but not the benchmark methodology, workload mix, or any third-party validation.
Key unanswered questions that any serious evaluation should investigate:
- Clock synchronization accuracy: How does the software-based TrueTime equivalent perform under NTP drift, VM clock jitter, and cross-region latency? Google has not published clock-uncertainty bounds for the Omni implementation.
- Failover behavior: Managed Spanner handles regional failover automatically. In a self-managed multi-cluster deployment, failover is operator responsibility. The docs do not yet detail recommended runbooks.
- Storage performance: Colossus provided erasure-coded, globally replicated storage. Local disks or cloud block volumes have different failure modes. How does Spanner Omni handle disk failure, network partition, and split-brain scenarios without Colossus?
- Upgrade path: Will preview deployments migrate to production licensing, or require rebuild? There is no documented upgrade procedure.
- Licensing model: No public pricing exists. Enterprise database procurement requires predictable cost structures. The lack of public pricing is a practical barrier to adoption planning.
8. Who Should Care Right Now
Omni is relevant to three audiences today, each with different actionability:
Enterprise architects planning 2027–2028 infrastructure: If your organization is evaluating multi-cloud or hybrid strategies and needs a consistent transactional layer across environments, Omni is worth tracking. The preview allows you to test data models, query patterns, and deployment topologies before a production release. The 90-day limitation means you should plan iterative evaluation cycles rather than a single long-term proof of concept.
AI engineering teams building retrieval-augmented generation (RAG) systems: The built-in vector search, hybrid querying, and unified transactional + retrieval architecture could reduce system complexity. But the 1 million vector limit at 128 dimensions and lack of TLS mean it is suitable for prototyping, not for customer-facing inference pipelines. Use it to validate whether the multi-model query performance meets your latency requirements before committing to a consolidated stack.
Google Cloud customers with Spanner-dependent applications: If you are already running managed Spanner and need to extend to on-prem or edge environments, the preview is the first chance to test portability. The PostgreSQL dialect support may simplify migration paths from existing Postgres workloads. But the missing backup/restore and encryption features mean you should not treat this as a production extension yet.
“Omni is not a product you deploy today. It is a signal about where Google is taking distributed SQL — and a credible preview that justifies early evaluation, not early adoption.”
9. Getting Started: Spanner Omni Implementation Guide
This section walks through the practical steps of evaluating Spanner Omni in your environment, from download to a working multi-model query. Every step is based on the current public preview documentation and reflects its limitations honestly.
Figure 4: Spanner Omni evaluation journey — from download to production readiness assessment
+------------------+ +------------------+ +------------------+
| PHASE 1 | | PHASE 2 | | PHASE 3 |
| Download & | | Data Model | | Production |
| Deploy | | & Query Test | | Readiness |
+--------+---------+ +--------+---------+ +--------+---------+
| | |
v v v
 1. Get preview access    4. Create schema         9. Security audit
 2. Choose deployment     5. Load vector data     10. Backup/restore
    topology              6. Run hybrid queries       gap analysis
 3. Configure & start     7. Benchmark latency    11. Licensing inquiry
    Spanner Omni          8. Test failover        12. Go / No-Go decision
Time: Day 1-3 Time: Day 4-14 Time: Day 15-30
Status: Anyone can do Status: Engineers only Status: Architects + Procurement
Phase 1: Download and Deploy
Omni is distributed through Google’s download portal. You need a Google Cloud account to access the preview. The developer edition is free but expires after 90 days and is non-commercial.
Step 1: Choose Your Deployment Format
Figure 5: Deployment format decision tree
What infrastructure do you have for deployment?
|
+---------------+---------------+
| | |
Kubernetes? Docker only? Bare metal /
| | VM only?
v v |
Helm Chart Container Standalone
deployment image binary
| | |
v v v
helm install docker run ./spanner-omni
-f values.yaml -p 9010:9010 --port 9010
spanner-omni spanner-omni --data-dir /data
./chart/ :latest
Common to all:
- Minimum 4 CPU, 16 GB RAM per node
- SSD storage recommended
- NTP must be configured (TrueTime depends on it)
- Port 9010 (gRPC), Port 9011 (HTTP/REST)
Step 2: Single-Server vs Multi-Node Topology
Figure 6: Topology selection guide
EVALUATION USE (single-server):
================================
+------------------+
| Developer |
| Laptop / VM |
| +-------------+ |
| | Spanner Omni| |
| | (all-in-one)| |
| | | |
| | - Leader | |
| | - Replica | |
| | - Storage | |
| +-------------+ |
+------------------+
Good for: Schema design, query testing,
vector search prototyping, learning SQL dialects
PRODUCTION-LIKE (multi-zone):
================================
Zone A Zone B Zone C
+---------------+ +---------------+ +---------------+
| +-----------+ | | +-----------+ | | +-----------+ |
| | Leader |<------>| Replica |<------>| Replica | |
| | (voter) | | | | (voter) | | | | (voter) | |
| +-----------+ | | +-----------+ | | +-----------+ |
| | Storage | | | | Storage | | | | Storage | |
| +-----------+ | | +-----------+ | | +-----------+ |
+---------------+ +---------------+ +---------------+
Good for: Consistency testing, failover simulation,
latency benchmarking across zones
MULTI-CLUSTER (hybrid/multicloud):
================================
+-- Cluster 1 (GCP) --+ +-- Cluster 2 (On-Prem) --+
| Zone A Zone B | | Zone C Zone D |
| [Leader] [Replica] | | [Replica] [Replica] |
+----------------------+ +-------------------------+
| |
+---- Paxos across WAN ----+
Good for: Cross-cloud consistency validation,
air-gapped scenario testing (disconnected mode)
Step 3: Critical Configuration Checklist
Must configure before starting
- NTP synchronization — the software TrueTime equivalent requires accurate clocks. Configure NTP with ntpd or chronyd pointing to multiple upstream servers. Clock drift must stay under 10ms.
- Storage path — point --data-dir to an SSD-backed volume. HDD storage will cause commit latency spikes.
- Port firewall — open 9010 (gRPC) and 9011 (HTTP) between all nodes. Multi-zone requires inter-zone connectivity.
- Resource limits — set container memory limits to at least 16 GB. Spanner Omni’s Paxos layer is memory-intensive.
Preview gotchas to watch for
- No TLS — All traffic is plaintext. Do not expose ports to the internet. Use a VPN or private network for multi-node setups.
- 90-day expiry — Set a calendar reminder. The instance will stop accepting writes after 90 days.
- No backup — there is no built-in backup/restore. Use filesystem snapshots or pg_dump via the PostgreSQL dialect as a workaround.
- Single admin role — no role-based access. Anyone with the admin key can do anything. Guard credentials carefully.
Phase 2: Data Modeling and Query Testing
Step 4: Create a Multi-Model Schema
Omni's multi-model support means you define relational tables, vector columns, graph edges, and full-text indexes in one schema. Here is a concrete example for an AI document retrieval system:
Figure 7: Multi-model schema for AI document retrieval
RELATIONAL TABLE:
=================
CREATE TABLE documents (
  id INT64 NOT NULL,
  title STRING(500),
  body STRING(MAX),
  author_id INT64,
  published TIMESTAMP,
  category STRING(100)
) PRIMARY KEY (id);

VECTOR COLUMN (added to same table):
====================================
ALTER TABLE documents
  ADD COLUMN embedding ARRAY<FLOAT32>;

FULL-TEXT INDEX:
================
CREATE SEARCH INDEX docs_body_index
  ON documents(body)
  OPTIONS (update_deadline_seconds = 300);

GRAPH SCHEMA (knowledge graph):
===============================
CREATE TABLE authors (
  id INT64 NOT NULL,
  name STRING(200),
  org_id INT64
) PRIMARY KEY (id);

CREATE TABLE citations (
  from_doc INT64 NOT NULL,
  to_doc INT64 NOT NULL,
  type STRING(50)  -- 'references', 'contradicts', 'extends'
) PRIMARY KEY (from_doc, to_doc),
  INTERLEAVE IN PARENT documents ON DELETE CASCADE;

PROPERTY GRAPH:
===============
CREATE PROPERTY GRAPH doc_graph
  NODE TABLES (documents, authors)
  EDGE TABLES (
    citations
      SOURCE KEY (from_doc) REFERENCES documents(id)
      DESTINATION KEY (to_doc) REFERENCES documents(id)
  );
Step 5: Load Vector Data and Run Hybrid Queries
Figure 8: Query execution flow — how Spanner Omni processes a hybrid vector + text + graph query
Example query: "Find recent AI papers about distributed consistency
that cite the Spanner 2012 paper"
+---------------------------------------------------------------+
| SPANNER OMNI QUERY ENGINE |
| |
| 1. VECTOR SEARCH |
| +------------------+ |
| | Embed query | ANN index scan |
| | text -> vector | Top 100 candidates |
| +--------+--------+ |
| | |
| 2. FULL-TEXT FILTER |
| +--------v--------+ |
| | FULLTEXT_MATCH | Re-rank by text relevance |
| | on body column | Narrow to 50 results |
| +--------+--------+ |
| | |
| 3. GRAPH TRAVERSAL |
| +--------v--------+ |
| | Graph query on | Follow citation edges |
| | doc_graph | Find papers citing Spanner 2012 |
| +--------+--------+ Narrow to 12 results |
| | |
| 4. RELATIONAL FILTER |
| +--------v--------+ |
| | WHERE published | Filter by date |
| | > '2024-01-01' | Final: 8 results |
| +--------+--------+ |
| | |
| v |
| +------------------+ |
| | Result set: | |
| | 8 documents with | |
| | score + path + | |
| | metadata | |
| +------------------+ |
+---------------------------------------------------------------+
In a traditional stack, steps 1-4 run on 4 different systems
and the application must merge results in code.
The engine executes this as a single query plan.
Step 6: Benchmark What Matters
Do not benchmark generic CRUD. Test the patterns that differentiate it:
Figure 9: Benchmark matrix for Spanner Omni evaluation
+-------------------------+----------------+----------------+----------------+
| Benchmark               | What to measure| Target         | How to test    |
+-------------------------+----------------+----------------+----------------+
| Write latency           | p50, p99       | < 10ms p99     | INSERT loop    |
| (single-zone)           | commit time    |                | 10K writes     |
+-------------------------+----------------+----------------+----------------+
| Cross-zone write        | p99 with       | < 30ms p99     | Write from     |
| latency                 | Paxos round    |                | Zone A, read   |
|                         | trip           |                | from Zone B    |
+-------------------------+----------------+----------------+----------------+
| Vector ANN search       | Recall@10      | > 95%          | Known dataset  |
| (1M vectors, 128 dim)   | + latency      | < 50ms         | with ground    |
|                         |                |                | truth labels   |
+-------------------------+----------------+----------------+----------------+
| Hybrid query            | End-to-end     | < 100ms        | Vector + text  |
| (vector + text + graph) | latency        |                | + graph filter |
+-------------------------+----------------+----------------+----------------+
| Failover recovery       | Time to new    | < 30 seconds   | Kill leader    |
|                         | leader elected |                | node, measure  |
+-------------------------+----------------+----------------+----------------+
| Auto-resharding         | Time to        | < 5 minutes    | Bulk load data |
|                         | rebalance      |                | until split    |
+-------------------------+----------------+----------------+----------------+

IMPORTANT: Preview builds may not meet these targets. Document actual numbers
for your hardware and compare against CockroachDB / YugabyteDB on the same
infrastructure.
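A minimal harness for the latency rows in this matrix might look like the sketch below. The workload lambda is a stand-in you would replace with a real INSERT or query through your client driver; the percentile function is a simple nearest-rank implementation in pure stdlib Python.

```python
import time

def percentile(samples, p):
    """Nearest-rank percentile over a list of latency samples."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

def benchmark(op, n=10_000):
    """Run op() n times and report p50/p99 latency in milliseconds."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        op()  # replace with a real database call for actual measurements
        samples.append((time.perf_counter() - start) * 1000)
    return {"p50": percentile(samples, 50), "p99": percentile(samples, 99)}

# Stand-in workload; swap in an INSERT via your client driver.
result = benchmark(lambda: sum(range(100)), n=1000)
assert result["p99"] >= result["p50"] >= 0
```

Run the same harness against CockroachDB or YugabyteDB on identical hardware so the comparison isolates the database rather than the infrastructure.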
Phase 3: Production Readiness Assessment
Step 7: Security Gap Analysis
Figure 10: Security feature comparison — preview vs production requirements
Figure 11: Decision tree — should your organization adopt Spanner Omni now?
Need distributed SQL in production NOW?
                |
    +-----------+-----------+
    |                       |
   YES                      NO
    |                       |
    v                       v
Use CockroachDB or      Need self-managed Spanner
YugabyteDB Anywhere.    specifically?
They have TLS, backup,          |
encryption, RBAC today.   +-----+-----+
                          |           |
                         YES          NO
                          |           |
                          v           v
               Can you wait for    Use managed Spanner
               production release? on GCP or AlloyDB
                          |        Omni for Postgres.
                    +-----+-----+
                    |           |
                   YES          NO
                    |           |
                    v           v
         Evaluate preview    Use CockroachDB /
         for architecture    YugabyteDB now,
         planning. Track     plan Spanner Omni
         Google's release    migration later.
         roadmap.

RECOMMENDATION FOR MOST ORGANIZATIONS:
--------------------------------------
1. NOW:      CockroachDB or YugabyteDB for production
2. PARALLEL: Evaluate Spanner Omni preview for 2027 planning
3. LATER:    Migrate when Google ships production-certified build