Choose infrastructure for your generative AI application

Learn which products, frameworks, and tools are the best match for building your generative AI application. Common components in a Cloud-hosted generative AI application include:

Application hosting: Compute to host your application. Your application can use Google Cloud's client libraries and SDKs to talk to different Cloud products.
Model hosting: Scalable and secure hosting for a generative model.
Model: A generative model for text, chat, images, code, embeddings, or multimodal tasks.
Grounding solution: Anchor model output to verifiable, up-to-date sources of information.
Database: Store your application's data. You might reuse your existing database as your grounding solution, either by augmenting prompts with the results of a SQL query, or by storing your data as vector embeddings using an extension like pgvector.
Storage: Store files such as images, videos, or static web frontends. You might also use Storage for the raw grounding data (for example, PDFs) that you later convert into embeddings and store in a vector database.
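
Before raw grounding data can be converted into embeddings, it is usually split into smaller pieces. The following is a minimal sketch of that step, assuming the text has already been extracted from the source files. The function name `chunk_text` and its default sizes are illustrative assumptions, not part of any Google Cloud API.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows.

    Overlap keeps some shared context between neighboring chunks, so a
    sentence cut at a boundary still appears whole in one of them.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Hypothetical usage: each chunk would then be sent to an embedding model.
example_chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
```

Character windows are the simplest choice; splitting on sentences or paragraphs is a common alternative when chunk boundaries matter.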

Diagram showing a high-level overview of a gen AI application hosting infrastructure, including a model and its model hosting infrastructure, grounding solution, database, storage, and application hosting.

The following sections walk through each of those components, helping you choose which Google Cloud products to try.

Application hosting infrastructure

Choose a product to host and serve your application workload, which makes calls out to the generative model.
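
As a rough illustration of an application workload that calls out to a model, here is a minimal sketch using only the Python standard library. The `call_model` function is a hypothetical stub standing in for a real model call, and reading the port from a `PORT` environment variable is an assumption about the hosting platform.

```python
import json
import os
from wsgiref.simple_server import make_server


def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a request to a hosted generative model."""
    return f"(stub response to: {prompt})"


def app(environ, start_response):
    """Minimal WSGI app: accepts a JSON body {"prompt": "..."}."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0
    raw = environ["wsgi.input"].read(length) if length else b""

    try:
        prompt = json.loads(raw or b"{}").get("prompt", "")
    except (ValueError, AttributeError):
        # Invalid JSON, or a JSON value that is not an object.
        prompt = ""

    if not prompt:
        status, body = "400 Bad Request", {"error": "missing 'prompt'"}
    else:
        status, body = "200 OK", {"response": call_model(prompt)}

    payload = json.dumps(body).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]


def main() -> None:
    """Serve the app; the port comes from PORT (assumed), default 8080."""
    port = int(os.environ.get("PORT", "8080"))
    with make_server("", port, app) as server:
        server.serve_forever()
```

Calling `main()` starts the server. In a real application, `call_model` would be replaced by a call through the client library for whichever model hosting option you choose.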

Decision tree guiding users through the selection of an appropriate service for application hosting.

Get started with:

Model hosting infrastructure

Google Cloud provides multiple ways to host a generative model, from the flagship Vertex AI platform to customizable and portable hosting on Google Kubernetes Engine (GKE).

Decision tree guiding users to choose the right model hosting cloud service based on their priorities and requirements.

Get started with:

Model

Google Cloud provides a set of state-of-the-art foundation models through Vertex AI, including Gemini. You can also deploy a third-party model through Vertex AI Model Garden, or self-host one on GKE, Cloud Run, or Compute Engine.

Decision tree guiding users to choose a Vertex AI service, to generate text or code, with options for using text embeddings, images, or video.

Get started with:

Gemini
Codey
Imagen
text-embedding
Vertex AI Model Garden (open source models)
Hugging Face Model Hub (open source models)

Grounding

To help the model produce informed and accurate responses, you may want to ground your generative AI application with real-time data. This approach is called retrieval-augmented generation (RAG).

You can implement grounding with your own data in a vector database, which stores data in a format well suited to operations like similarity search. Google Cloud offers multiple vector database solutions for different use cases.
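
To make similarity search concrete, here is a minimal brute-force sketch in plain Python. The three-dimensional vectors and document IDs are made-up toy data. A real vector database would typically hold model-generated embeddings and use an approximate nearest-neighbor index rather than scanning every vector.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as similar to nothing.
    return dot / (norm_a * norm_b)


def top_k(query: list[float], index: dict[str, list[float]], k: int = 2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings" (hypothetical values, for illustration only).
toy_index = {
    "refunds": [1.0, 0.0, 0.0],
    "shipping": [0.0, 1.0, 0.0],
    "returns": [0.9, 0.1, 0.0],
}
matches = top_k([1.0, 0.0, 0.0], toy_index, k=2)
```

The documents returned by a search like this are what you would add to the model prompt as grounding context.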

Note: You can also ground with traditional (non-vector) databases by querying an existing database like Cloud SQL or Firestore and using the result in your model prompt.
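
The pattern in the note above might look like the following sketch. It uses an in-memory SQLite database as a stand-in for an existing relational database. The `orders` table, its columns, and the prompt wording are all hypothetical.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create a small in-memory database with hypothetical order data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, item, status) VALUES (?, ?, ?)",
        [
            (1, "keyboard", "shipped"),
            (1, "monitor", "processing"),
            (2, "mouse", "delivered"),
        ],
    )
    return conn


def build_grounded_prompt(conn: sqlite3.Connection, customer_id: int, question: str) -> str:
    """Query the database and place the rows in the prompt as context."""
    rows = conn.execute(
        "SELECT item, status FROM orders WHERE customer_id = ? ORDER BY id",
        (customer_id,),  # Parameterized to avoid SQL injection.
    ).fetchall()
    if rows:
        context = "\n".join(f"- {item}: {status}" for item, status in rows)
    else:
        context = "(no matching records)"
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


prompt = build_grounded_prompt(make_demo_db(), 1, "Where is my monitor?")
```

The resulting `prompt` string is what you would send to the model, so its answer can draw on the current database rows.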

Decision tree guiding the user through choosing the right vector database solution for their needs.

Get started with:

Vertex AI Agent Builder (formerly Enterprise Search, Gen AI App Builder, Discovery Engine)
Vector Search (formerly Matching Engine)
AlloyDB for PostgreSQL
Cloud SQL
BigQuery

Grounding with APIs

Instead of (or in addition to) using your own data for grounding, many online services offer APIs that you can use to retrieve grounding data to augment your model prompt.
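
A minimal sketch of this idea follows, assuming a hypothetical weather service. The `fetch_weather_stub` function returns fixed data in place of a real HTTP request, and the field names are invented for illustration.

```python
import json
from typing import Callable


def fetch_weather_stub(city: str) -> dict:
    """Hypothetical stand-in for an HTTP request to an online service."""
    return {"city": city, "temperature_c": 18, "conditions": "cloudy"}


def ground_with_api(
    question: str,
    city: str,
    fetch: Callable[[str], dict] = fetch_weather_stub,
) -> str:
    """Retrieve data through `fetch` and add it to the model prompt."""
    data = fetch(city)
    return (
        "Use the following data to answer the question.\n"
        f"Data: {json.dumps(data, sort_keys=True)}\n"
        f"Question: {question}"
    )


api_prompt = ground_with_api("Do I need an umbrella today?", "Zurich")
```

Passing the fetch function as a parameter keeps the prompt-building logic separate from the network call, so a real API client can be swapped in later.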

Start building

Set up your development environment for Google Cloud

Set up LangChain

LangChain is an open-source framework for generative AI apps that lets you build context into your prompts and take action based on the model's response.

View code samples and deploy sample applications

View code samples for popular use cases and deploy examples of generative AI applications that are secure, efficient, resilient, high-performing, and cost-effective.
