Database Locking Techniques with Ent

Locks are one of the fundamental building blocks of any concurrent computer program. When many things are happening simultaneously, programmers reach out to locks to guarantee the mutual exclusion of concurrent access to a resource. Locks (and other mutual exclusion primitives) exist in many different layers of the stack from low-level CPU instructions to application-level APIs (such as sync.Mutex in Go).

When working with relational databases, one of the common needs of application developers is the ability to acquire a lock on records. Imagine an inventory table, listing items available for sale on an e-commerce website. This table might have a column named state that could either be set to available or purchased. To avoid the scenario where two users think they have successfully purchased the same inventory item, the application must prevent two operations from mutating the item from an available to a purchased state.

How can the application guarantee this? Having the server check if the desired item is available before setting it to purchased would not be good enough. Imagine a scenario where two users simultaneously try to purchase the same item. Two requests would travel from their browsers to the application server and arrive roughly at the same time. Both would query the database for the item's state and see that the item is available. Seeing this, both request handlers would issue an UPDATE query setting the state to purchased and the buyer_id to the id of the requesting user. Both queries will succeed, but whichever user issued the last UPDATE would end up recorded as the buyer of the item.
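To make the race concrete, here is a minimal sketch of the naive check-then-set flow described above. The Item client and its fields are hypothetical, for illustration only; they are not part of the schemas used later in this post:

func naivePurchase(ctx context.Context, client *ent.Client, itemID, buyerID int) error {
	item, err := client.Item.Get(ctx, itemID)
	if err != nil {
		return err
	}
	// Both concurrent requests can pass this check before either UPDATE runs.
	if item.State != "available" {
		return fmt.Errorf("item %d already purchased", itemID)
	}
	// By the time this executes, another request may have already
	// purchased the item; the last writer silently wins.
	return client.Item.UpdateOne(item).
		SetState("purchased").
		SetBuyerID(buyerID).
		Exec(ctx)
}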

Over the years, different techniques have evolved to allow developers to write applications that provide these guarantees to users. Some of them involve explicit locking mechanisms provided by databases, while others rely on more general ACID properties of databases to achieve mutual exclusion. In this post we will explore the implementation of two of these techniques using Ent.

Optimistic Locking#

Optimistic locking (sometimes also called Optimistic Concurrency Control) is a technique that can be used to achieve locking behavior without explicitly acquiring a lock on any record.

On a high-level, this is how optimistic locking works:

  • Each record is assigned a numeric version number. This value must be monotonically increasing. Often Unix timestamps of the latest row update are used.
  • A transaction reads a record, noting its version number from the database.
  • An UPDATE statement is issued to modify the record:
    • The statement must include a predicate requiring that the version number has not changed from its previous value. For example: WHERE id=<id> AND version=<previous version>.
    • The statement must increase the version. Some applications will increase the current value by 1, and some will set it to the current timestamp.
  • The database returns the number of rows modified by the UPDATE statement. If that number is 0, it means someone else modified the record between the time we read it and the time we tried to update it. The transaction is considered failed; it is rolled back and can be retried.

Optimistic locking is commonly used in "low contention" environments (situations where the likelihood of two transactions interfering with one another is relatively low) and where the locking logic can be trusted to happen in the application layer. If there are writers to the database that we cannot ensure to obey the required logic, this technique is rendered useless.

Let's see how this technique can be employed using Ent.

We start by defining our ent.Schema for a User. The user has an online boolean field to specify whether they are currently online and an int64 field for the current version number.

// User holds the schema definition for the User entity.
type User struct {
	ent.Schema
}

// Fields of the User.
func (User) Fields() []ent.Field {
	return []ent.Field{
		field.Bool("online"),
		field.Int64("version").
			DefaultFunc(func() int64 {
				return time.Now().UnixNano()
			}).
			Comment("Unix time of when the latest update occurred"),
	}
}

Next, let's implement a simple optimistically locked update to our online field:

func optimisticUpdate(tx *ent.Tx, prev *ent.User, online bool) error {
	// The next version number for the record must monotonically increase;
	// using the current timestamp is a common technique to achieve this.
	nextVer := time.Now().UnixNano()
	// We begin the update operation:
	n := tx.User.Update().
		// We limit our update to only work on the correct record and version:
		Where(user.ID(prev.ID), user.Version(prev.Version)).
		// We set the next version:
		SetVersion(nextVer).
		// We set the value we were passed by the user:
		SetOnline(online).
		SaveX(context.Background())
	// SaveX returns the number of affected records. If this value is
	// different from 1 the record must have been changed by another
	// process.
	if n != 1 {
		return fmt.Errorf("update failed: user id=%d updated by another process", prev.ID)
	}
	return nil
}
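Because a failed optimistic update is an expected, recoverable event, callers typically re-read the record and retry. A minimal retry loop around optimisticUpdate might look like this (the attempt limit and re-read logic are illustrative additions, not part of the original example):

func setOnlineWithRetry(ctx context.Context, client *ent.Client, id int, online bool) error {
	const maxAttempts = 3
	for i := 0; i < maxAttempts; i++ {
		// Re-read the record to observe its latest version number.
		u, err := client.User.Get(ctx, id)
		if err != nil {
			return err
		}
		tx, err := client.Tx(ctx)
		if err != nil {
			return err
		}
		if err := optimisticUpdate(tx, u, online); err != nil {
			tx.Rollback()
			continue // another process won the race; try again
		}
		return tx.Commit()
	}
	return fmt.Errorf("user id=%d: giving up after %d attempts", id, maxAttempts)
}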

Next, let's write a test to verify that if two processes try to edit the same record, only one will succeed:

func TestOCC(t *testing.T) {
	client := enttest.Open(t, "sqlite3", "file:ent?mode=memory&cache=shared&_fk=1")
	ctx := context.Background()
	// Create the user for the first time.
	orig := client.User.Create().SetOnline(true).SaveX(ctx)
	// Read another copy of the same user.
	userCopy := client.User.GetX(ctx, orig.ID)
	// Open a new transaction:
	tx, err := client.Tx(ctx)
	if err != nil {
		log.Fatalf("failed creating transaction: %v", err)
	}
	// Try to update the record once. This should succeed.
	if err := optimisticUpdate(tx, userCopy, false); err != nil {
		tx.Rollback()
		log.Fatal("unexpected failure:", err)
	}
	// Try to update the record a second time. This should fail.
	err = optimisticUpdate(tx, orig, false)
	if err == nil {
		log.Fatal("expected second update to fail")
	}
	fmt.Println(err)
}

Running our test:

=== RUN TestOCC
update failed: user id=1 updated by another process
--- PASS: TestOCC (0.00s)

Great! Using optimistic locking we can prevent two processes from stepping on each other's toes!

Pessimistic Locking#

As we've mentioned above, optimistic locking isn't always appropriate. For use cases where we prefer to delegate the responsibility for maintaining the integrity of the lock to the database, some database engines (such as MySQL, Postgres, and MariaDB, but not SQLite) offer pessimistic locking capabilities. These databases support a modifier to SELECT statements called SELECT ... FOR UPDATE. The MySQL documentation explains:

A SELECT ... FOR UPDATE reads the latest available data, setting exclusive locks on each row it reads. Thus, it sets the same locks a searched SQL UPDATE would set on the rows.

Alternatively, users can use SELECT ... FOR SHARE statements. As the docs explain, SELECT ... FOR SHARE:

Sets a shared mode lock on any rows that are read. Other sessions can read the rows, but cannot modify them until your transaction commits. If any of these rows were changed by another transaction that has not yet committed, your query waits until that transaction ends and then uses the latest values.

Ent has recently added support for FOR SHARE / FOR UPDATE statements via a feature flag called sql/lock. To use it, modify your generate.go file to include --feature sql/lock:

//go:generate go run -mod=mod entgo.io/ent/cmd/ent generate --feature sql/lock ./schema

Next, let's implement a function that will use pessimistic locking to make sure only a single process can update our User object's online field:

func pessimisticUpdate(tx *ent.Tx, id int, online bool) (*ent.User, error) {
	ctx := context.Background()
	// On our active transaction, we begin a query against the user table.
	u, err := tx.User.Query().
		// We add a predicate limiting the lock to the user we want to update.
		Where(user.ID(id)).
		// We use the ForUpdate method to tell ent to ask our DB to lock
		// the returned records for update.
		ForUpdate(
			// We specify that the query should not wait for the lock to be
			// released and instead fail immediately if the record is locked.
			sql.WithLockAction(sql.NoWait),
		).
		Only(ctx)
	// If we failed to acquire the lock we do not proceed to update the record.
	if err != nil {
		return nil, err
	}
	// Finally, we set the online field to the desired value.
	return u.Update().SetOnline(online).Save(ctx)
}

Now, let's write a test that verifies that if two processes try to edit the same record, only one will succeed:

func TestPessimistic(t *testing.T) {
	ctx := context.Background()
	client := enttest.Open(t, dialect.MySQL, "root:pass@tcp(localhost:3306)/test?parseTime=True")
	// Create the user for the first time.
	orig := client.User.Create().SetOnline(true).SaveX(ctx)
	// Open a new transaction. This transaction will acquire the lock on our user record.
	tx, err := client.Tx(ctx)
	if err != nil {
		log.Fatalf("failed creating transaction: %v", err)
	}
	defer tx.Commit()
	// Open a second transaction. This transaction is expected to fail at
	// acquiring the lock on our user record.
	tx2, err := client.Tx(ctx)
	if err != nil {
		log.Fatalf("failed creating transaction: %v", err)
	}
	defer tx2.Commit()
	// The first update is expected to succeed.
	if _, err := pessimisticUpdate(tx, orig.ID, true); err != nil {
		log.Fatalf("unexpected error: %s", err)
	}
	// Because we did not run tx.Commit yet, the row is still locked when
	// we try to update it a second time. This operation is expected to
	// fail.
	_, err = pessimisticUpdate(tx2, orig.ID, true)
	if err == nil {
		log.Fatal("expected second update to fail")
	}
	fmt.Println(err)
}

A few things are worth mentioning in this example:

  • Notice that we use a real MySQL instance to run this test against, as SQLite does not support SELECT ... FOR UPDATE.
  • For the simplicity of the example, we used the sql.NoWait option to tell the database to return an error if the lock cannot be acquired. This means that the calling application needs to retry the write after receiving the error. If we don't specify this option, we can create flows where our application blocks until the lock is released and then proceeds without retrying (see the sketch after this list). This is not always desirable, but it opens up some interesting design options.
  • We must always commit our transaction. Forgetting to do so can result in some serious issues. Remember that while the lock is maintained, no one can read or update this record.
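For completeness, here is a sketch of what the blocking variant of pessimisticUpdate could look like. Calling ForUpdate without a lock action makes the query wait for the lock (subject to the database's lock wait timeout) instead of failing fast; this is an illustrative variation on the function above, not additional generated API:

func pessimisticUpdateBlocking(tx *ent.Tx, id int, online bool) (*ent.User, error) {
	ctx := context.Background()
	u, err := tx.User.Query().
		Where(user.ID(id)).
		// Without sql.WithLockAction, the query blocks until the lock
		// is released (or the DB's lock wait timeout expires).
		ForUpdate().
		Only(ctx)
	if err != nil {
		return nil, err
	}
	return u.Update().SetOnline(online).Save(ctx)
}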

Running our test:

=== RUN TestPessimistic
Error 3572: Statement aborted because lock(s) could not be acquired immediately and NOWAIT is set.
--- PASS: TestPessimistic (0.08s)

Great! We have used MySQL's "locking reads" capabilities and Ent's new support for it to implement a locking mechanism that provides real mutual exclusion guarantees.

Conclusion#

We began this post by presenting the type of business requirements that lead application developers to reach out for locking techniques when working with databases. We continued by presenting two different approaches to achieving mutual exclusion when updating database records and demonstrated how to employ these techniques using Ent.

Have questions? Need help with getting started? Feel free to join our Slack channel.


Automatic GraphQL Filter Generation

TL;DR#

We added a new integration to the Ent GraphQL extension that generates type-safe GraphQL filters (i.e. Where predicates) from an ent/schema, and allows users to seamlessly map GraphQL queries to Ent queries.

For example, to get all COMPLETED todo items, we can execute the following:

query QueryAllCompletedTodos {
  todos(
    where: {
      status: COMPLETED,
    },
  ) {
    edges {
      node {
        id
      }
    }
  }
}

The generated GraphQL filters follow the Ent syntax. This means, the following query is also valid:

query FilterTodos {
  todos(
    where: {
      or: [
        {
          hasParent: false,
          status: COMPLETED,
        },
        {
          status: IN_PROGRESS,
          hasParentWith: {
            priorityLT: 1,
            statusNEQ: COMPLETED,
          },
        }
      ]
    },
  ) {
    edges {
      node {
        id
      }
    }
  }
}

Background#

Many libraries that deal with data in Go choose the path of passing around empty interface instances (interface{}) and use reflection at runtime to figure out how to map data to struct fields. Aside from the performance penalty of using reflection everywhere, the big negative impact on teams is the loss of type-safety.

When APIs are explicit, known at compile-time (or even as we type), the feedback a developer receives around a large class of errors is almost immediate. Many defects are found early, and development is also much more fun!

Ent was designed to provide an excellent developer experience for teams working on applications with large data-models. To facilitate this, we decided early on that one of the core design principles of Ent is "statically typed and explicit API using code generation". This means that for every entity a developer defines in their ent/schema, explicit, type-safe code is generated for the developer to efficiently interact with their data. For example, in the Filesystem Example in the ent repository, you will find a schema named File:

// File holds the schema definition for the File entity.
type File struct {
	ent.Schema
}

// Fields of the File.
func (File) Fields() []ent.Field {
	return []ent.Field{
		field.String("name"),
		field.Bool("deleted").
			Default(false),
		field.Int("parent_id").
			Optional(),
	}
}

When the Ent code-gen runs, it will generate many predicate functions. For example, the following function which can be used to filter Files by their name field:

package file

// .. truncated ..

// Name applies the EQ predicate on the "name" field.
func Name(v string) predicate.File {
	return predicate.File(func(s *sql.Selector) {
		s.Where(sql.EQ(s.C(FieldName), v))
	})
}
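These generated predicates compose directly into queries. For example, assuming the client generated from the schema above, a query for non-deleted files with a given name might look like this:

// Fetch all non-deleted files named "config.yaml".
files, err := client.File.Query().
	Where(
		file.Name("config.yaml"),
		file.Deleted(false),
	).
	All(ctx)
if err != nil {
	log.Fatalf("querying files: %v", err)
}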

GraphQL is a query language for APIs originally created at Facebook. Similar to Ent, GraphQL models data in graph concepts and facilitates type-safe queries. Around a year ago, we released an integration between Ent and GraphQL. Similar to the gRPC Integration, the goal for this integration is to allow developers to easily create API servers that map to Ent, to mutate and query data in their databases.

Automatic GraphQL Filters Generation#

In a recent community survey, the Ent + GraphQL integration was mentioned as one of the most loved features of the Ent project. Until today, the integration allowed users to perform useful, albeit basic queries against their data. Today, we announce the release of a feature that we think will open up many interesting new use cases for Ent users: "Automatic GraphQL Filters Generation".

As we have seen above, the Ent code-gen maintains for us a suite of predicate functions in our Go codebase that allow us to easily and explicitly filter data from our database tables. This power was, until recently, not available (at least not automatically) to users of the Ent + GraphQL integration. With automatic GraphQL filter generation, by making a single-line configuration change, developers can now add to their GraphQL schema a complete set of "Filter Input Types" that can be used as predicates in their GraphQL queries. In addition, the implementation provides runtime code that parses these predicates and maps them into Ent queries. Let's see this in action:

Generating Filter Input Types#

In order to generate input filters (e.g. TodoWhereInput) for each type in your ent/schema package, edit the ent/entc.go configuration file as follows:

// +build ignore

package main

import (
	"log"

	"entgo.io/contrib/entgql"
	"entgo.io/ent/entc"
	"entgo.io/ent/entc/gen"
)

func main() {
	ex, err := entgql.NewExtension(
		entgql.WithWhereFilters(true),
		entgql.WithConfigPath("../gqlgen.yml"),
		entgql.WithSchemaPath("<PATH-TO-GRAPHQL-SCHEMA>"),
	)
	if err != nil {
		log.Fatalf("creating entgql extension: %v", err)
	}
	err = entc.Generate("./schema", &gen.Config{}, entc.Extensions(ex))
	if err != nil {
		log.Fatalf("running ent codegen: %v", err)
	}
}

If you're new to Ent and GraphQL, please follow the Getting Started Tutorial.

Next, run go generate ./ent/... and observe that Ent has generated a <T>WhereInput type for each type in your schema. Ent updates the GraphQL schema as well, so you don't need to autobind the new types to gqlgen manually. For example:

ent/where_input.go

// TodoWhereInput represents a where input for filtering Todo queries.
type TodoWhereInput struct {
	Not *TodoWhereInput   `json:"not,omitempty"`
	Or  []*TodoWhereInput `json:"or,omitempty"`
	And []*TodoWhereInput `json:"and,omitempty"`

	// "created_at" field predicates.
	CreatedAt      *time.Time  `json:"createdAt,omitempty"`
	CreatedAtNEQ   *time.Time  `json:"createdAtNEQ,omitempty"`
	CreatedAtIn    []time.Time `json:"createdAtIn,omitempty"`
	CreatedAtNotIn []time.Time `json:"createdAtNotIn,omitempty"`
	CreatedAtGT    *time.Time  `json:"createdAtGT,omitempty"`
	CreatedAtGTE   *time.Time  `json:"createdAtGTE,omitempty"`
	CreatedAtLT    *time.Time  `json:"createdAtLT,omitempty"`
	CreatedAtLTE   *time.Time  `json:"createdAtLTE,omitempty"`

	// "status" field predicates.
	Status      *todo.Status  `json:"status,omitempty"`
	StatusNEQ   *todo.Status  `json:"statusNEQ,omitempty"`
	StatusIn    []todo.Status `json:"statusIn,omitempty"`
	StatusNotIn []todo.Status `json:"statusNotIn,omitempty"`

	// .. truncated ..
}

todo.graphql

"""
TodoWhereInput is used for filtering Todo objects.
Input was generated by ent.
"""
input TodoWhereInput {
  not: TodoWhereInput
  and: [TodoWhereInput!]
  or: [TodoWhereInput!]

  """created_at field predicates"""
  createdAt: Time
  createdAtNEQ: Time
  createdAtIn: [Time!]
  createdAtNotIn: [Time!]
  createdAtGT: Time
  createdAtGTE: Time
  createdAtLT: Time
  createdAtLTE: Time

  """status field predicates"""
  status: Status
  statusNEQ: Status
  statusIn: [Status!]
  statusNotIn: [Status!]

  # .. truncated ..
}

Next, to complete the integration we need to make two more changes:

1. Edit the GraphQL schema to accept the new filter types:

type Query {
  todos(
    after: Cursor,
    first: Int,
    before: Cursor,
    last: Int,
    orderBy: TodoOrder,
    where: TodoWhereInput,
  ): TodoConnection
}

2. Use the new filter types in GraphQL resolvers:

func (r *queryResolver) Todos(ctx context.Context, after *ent.Cursor, first *int, before *ent.Cursor, last *int, orderBy *ent.TodoOrder, where *ent.TodoWhereInput) (*ent.TodoConnection, error) {
	return r.client.Todo.Query().
		Paginate(ctx, after, first, before, last,
			ent.WithTodoOrder(orderBy),
			ent.WithTodoFilter(where.Filter),
		)
}

Filter Specification#

As mentioned above, with the new GraphQL filter types, you can express the same Ent filters you use in your Go code.

Conjunction, disjunction and negation#

The Not, And and Or operators can be added using the not, and and or fields. For example:

{
  or: [
    {
      status: COMPLETED,
    },
    {
      not: {
        hasParent: true,
        status: IN_PROGRESS,
      }
    }
  ]
}

When multiple filter fields are provided, Ent implicitly adds the And operator.

{
  status: COMPLETED,
  textHasPrefix: "GraphQL",
}

The above query will produce the following Ent query:

client.Todo.
	Query().
	Where(
		todo.And(
			todo.StatusEQ(todo.StatusCompleted),
			todo.TextHasPrefix("GraphQL"),
		),
	).
	All(ctx)

Edge/Relation filters#

Edge (relation) predicates can be expressed in the same Ent syntax:

{
  hasParent: true,
  hasChildrenWith: {
    status: IN_PROGRESS,
  }
}

The above query will produce the following Ent query:

client.Todo.
	Query().
	Where(
		todo.HasParent(),
		todo.HasChildrenWith(
			todo.StatusEQ(todo.StatusInProgress),
		),
	).
	All(ctx)

Implementation Example#

A working example exists in github.com/a8m/ent-graphql-example.

Wrapping Up#

As we've discussed earlier, Ent has set creating a "statically typed and explicit API using code generation" as a core design principle. With automatic GraphQL filter generation, we are doubling down on this idea to provide developers with the same explicit, type-safe development experience on the RPC layer as well.

Have questions? Need help with getting started? Feel free to join our Slack channel.


Ent + gRPC is Ready for Usage

A few months ago, we announced the experimental support for generating gRPC services from Ent Schema definitions. The implementation was not complete yet but we wanted to get it out the door for the community to experiment with and provide us with feedback.

Today, after much feedback from the community, we are happy to announce that the Ent + gRPC integration is "Ready for Usage"; this means all of the basic features are complete and we anticipate that most Ent applications can utilize this integration.

What have we added since our initial announcement?

  • Support for "Optional Fields" - A common issue with Protobufs is the way nil values are represented: a zero-valued primitive field isn't encoded into the binary representation, so applications cannot distinguish between zero and not-set for primitive fields. To support this, the Protobuf project provides "Well-Known Types" called "wrapper types" that wrap the primitive value in a struct. Previously this wasn't supported, but now, when entproto generates a Protobuf message definition, it uses these wrapper types to represent "Optional" ent fields:

    // Code generated by entproto. DO NOT EDIT.
    syntax = "proto3";

    package entpb;

    import "google/protobuf/wrappers.proto";

    message User {
      int32 id = 1;
      string name = 2;
      string email_address = 3;
      google.protobuf.StringValue alias = 4;
    }
  • Multi-edge support - when we released the initial version of protoc-gen-entgrpc, we only supported generating gRPC service implementations for "Unique" edges (i.e., edges that reference at most one entity). Since a recent version, the plugin supports the generation of gRPC methods to read and write entities with O2M and M2M relationships.

  • Partial responses - By default, edge information is not returned by the Get method of the service. This is done deliberately, because the number of entities related to an entity is unbounded.

    To allow the caller to specify whether or not to return edge information, the generated service adheres to Google AIP-157 (Partial Responses). In short, the Get<T>Request message includes an enum named View; this enum lets the caller control whether this information should be retrieved from the database.

    message GetUserRequest {
      int32 id = 1;

      View view = 2;

      enum View {
        VIEW_UNSPECIFIED = 0;
        BASIC = 1;
        WITH_EDGE_IDS = 2;
      }
    }
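    On the Go side, a caller that wants the edge IDs included could set the view on the request. This is a rough sketch assuming the generated entpb package from the tutorial below; the enum and field names follow the standard protoc-gen-go naming rules:

    // Request the user together with the IDs of its related entities.
    u, err := client.Get(ctx, &entpb.GetUserRequest{
        Id:   1,
        View: entpb.GetUserRequest_WITH_EDGE_IDS,
    })
    if err != nil {
        log.Fatalf("failed retrieving user: %v", err)
    }
    log.Println(u)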

Getting Started#

  • To help everyone get started with the Ent + gRPC integration, we have published an official Ent + gRPC Tutorial (and a complimentary GitHub repo).
  • Do you need help getting started with the integration or have some other question? Join us on Slack.

Announcing the "Schema Import Initiative" and protoc-gen-ent

Migrating to a new ORM is not an easy process, and the transition cost can be prohibitive to many organizations. As much as we developers are enamoured by "Shiny New Things", the truth is that we rarely get a chance to work on a truly "green-field" project. Most of our careers, we operate in contexts where many technical and business constraints (a.k.a. legacy systems) dictate and limit our options for moving forward. Developers of new technologies that want to succeed must offer interoperability capabilities and integration paths to help organizations seamlessly transition to a new way of solving an existing problem.

To help lower the cost of transitioning to Ent (or simply experimenting with it), we have started the "Schema Import Initiative" to help support many use cases for generating Ent schemas from external resources. The centrepiece of this effort is the schemast package (source code, docs) which enables developers to easily write programs that generate and manipulate Ent schemas. Using this package, developers can program in a high-level API, relieving them from worrying about code parsing and AST manipulations.
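To give a feel for the programming model, here is a rough sketch of a program that uses schemast to load a schema directory, upsert a type, and print the result back to disk. The exact API is best confirmed against the schemast docs; this is only an illustration:

package main

import (
	"log"

	"entgo.io/contrib/schemast"
	"entgo.io/ent"
	"entgo.io/ent/schema/field"
)

func main() {
	// Load an existing schema directory into a mutable context.
	ctx, err := schemast.Load("./ent/schema")
	if err != nil {
		log.Fatalf("loading schema: %v", err)
	}
	// Create (or update) a User type with a single "name" field.
	err = schemast.Mutate(ctx, &schemast.UpsertSchema{
		Name:   "User",
		Fields: []ent.Field{field.String("name")},
	})
	if err != nil {
		log.Fatalf("mutating schema: %v", err)
	}
	// Write the modified schema files back to disk.
	if err := ctx.Print("./ent/schema"); err != nil {
		log.Fatalf("printing schema: %v", err)
	}
}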

Protobuf Import Support#

The first project to use this new API, is protoc-gen-ent, a protoc plugin to generate Ent schemas from .proto files (docs). Organizations that have existing schemas defined in Protobuf can use this tool to generate Ent code automatically. For example, taking a simple message definition:

syntax = "proto3";

package entpb;

option go_package = "github.com/yourorg/project/ent/proto/entpb";

message User {
  string name = 1;
  string email_address = 2;
}

And setting the ent.schema.gen option to true:

syntax = "proto3";

package entpb;

+import "options/opts.proto";

option go_package = "github.com/yourorg/project/ent/proto/entpb";

message User {
+  option (ent.schema).gen = true; // <-- tell protoc-gen-ent you want to generate a schema from this message
  string name = 1;
  string email_address = 2;
}

Developers can invoke the standard protoc (protobuf compiler) command to use this plugin:

protoc -I=proto/ --ent_out=. --ent_opt=schemadir=./schema proto/entpb/user.proto

To generate Ent schemas from these definitions:

package schema

import (
	"entgo.io/ent"
	"entgo.io/ent/schema/field"
)

type User struct {
	ent.Schema
}

func (User) Fields() []ent.Field {
	return []ent.Field{field.String("name"), field.String("email_address")}
}

func (User) Edges() []ent.Edge {
	return nil
}

To start using protoc-gen-ent today, and read about all of the different configuration options, head over to the documentation!

Join the Schema Import Initiative#

Do you have schemas defined elsewhere that you would like to automatically import in to Ent? With the schemast package, it is easier than ever to write the tool that you need to do that. Not sure how to start? Want to collaborate with the community in planning and building out your idea? Reach out to our great community via our Slack channel or start a discussion on GitHub!


Generate a fully-working Go gRPC server in two minutes with Ent


Introduction#

Having entity schemas defined in a central, language-neutral format has many benefits as the scale of software engineering organizations increases. To do this, many organizations use Protocol Buffers as their interface definition language (IDL). In addition, gRPC, a Protobuf-based RPC framework modeled after Google's internal Stubby, is becoming increasingly popular due to its efficiency and code-generation capabilities.

Being an IDL, gRPC does not prescribe any specific guidelines on implementing the data access layer, so implementations vary greatly. Ent is a natural candidate for building the data access layer in any Go application, and so there is great potential in integrating the two technologies.

Today we announce an experimental version of entproto, a Go package, and a command-line tool to add Protobuf and gRPC support for ent users. With entproto, developers can set up a fully working CRUD gRPC server in a few minutes. In this post, we will show exactly how to do just that.

Setting Up#

The final version of this tutorial is available on GitHub, you can clone it if you prefer following along that way.

Let's start by initializing a new Go module for our project:

mkdir ent-grpc-example
cd ent-grpc-example
go mod init ent-grpc-example

Next we use go run to invoke the ent code generator to initialize a schema:

go run -mod=mod entgo.io/ent/cmd/ent init User

Our directory should now look like:

.
├── ent
│   ├── generate.go
│   └── schema
│       └── user.go
├── go.mod
└── go.sum

Next, let's add the entproto package to our project:

go get -u entgo.io/contrib/entproto

Next, we will define the schema for the User entity. Open ent/schema/user.go and edit:

package schema
import (
"entgo.io/ent"
"entgo.io/ent/schema"
)
// User holds the schema definition for the User entity.
type User struct {
ent.Schema
}
// Fields of the User.
func (User) Fields() []ent.Field {
return []ent.Field{
field.String("name").
Unique(),
field.String("email_address").
Unique(),
}
}

In this step, we added two unique fields to our User entity: name and email_address. The ent.Schema is just the definition of the schema; to create usable production code from it, we need to run Ent's code generation tool on it. Run:

go generate ./...

Notice that a bunch of new files were created from our schema definition:

├── ent
│   ├── client.go
│   ├── config.go
// .... many more
│   ├── user
│   ├── user.go
│   ├── user_create.go
│   ├── user_delete.go
│   ├── user_query.go
│   └── user_update.go
├── go.mod
└── go.sum

At this point, we can open a connection to a database, run a migration to create the users table, and start reading and writing data to it. This is covered in the Setup Tutorial, so let's cut to the chase and learn about generating Protobuf definitions and gRPC servers from our schema.

Generating Go Protobufs with entproto#

As ent and Protobuf schemas are not identical, we must supply some annotations on our schema to help entproto figure out exactly how to generate Protobuf definitions (called "Messages" in protobuf lingo).

The first thing we need to do is add an entproto.Message() annotation. This is our opt-in to Protobuf schema generation; we don't necessarily want to generate proto messages or gRPC service definitions from all of our schema entities, and this annotation gives us that control. To add it, append to ent/schema/user.go:

func (User) Annotations() []schema.Annotation {
	return []schema.Annotation{
		entproto.Message(),
	}
}

Next, we need to annotate each field and assign it a field number. Recall that when defining a protobuf message type, each field must be assigned a unique number. To do that, we add an entproto.Field annotation on each field. Update the Fields in ent/schema/user.go:

// Fields of the User.
func (User) Fields() []ent.Field {
	return []ent.Field{
		field.String("name").
			Unique().
			Annotations(
				entproto.Field(2),
			),
		field.String("email_address").
			Unique().
			Annotations(
				entproto.Field(3),
			),
	}
}

Notice that we did not start our field numbers from 1; this is because ent implicitly creates the ID field for the entity, and that field is automatically assigned the number 1. We can now generate our protobuf message type definitions. To do that, we will add to ent/generate.go a go:generate directive that invokes the entproto command-line tool. It should now look like this:

package ent

//go:generate go run -mod=mod entgo.io/ent/cmd/ent generate ./schema
//go:generate go run -mod=mod entgo.io/contrib/entproto/cmd/entproto -path ./schema

Let's re-generate our code:

go generate ./...

Observe that a new directory, ent/proto, was created; it will contain all protobuf-related generated code. It now contains:

ent/proto
└── entpb
    ├── entpb.proto
    └── generate.go

Two files were created. Let's look at their contents:

// Code generated by entproto. DO NOT EDIT.
syntax = "proto3";

package entpb;

option go_package = "ent-grpc-example/ent/proto/entpb";

message User {
  int32 id = 1;
  string name = 2;
  string email_address = 3;
}

Nice! A new .proto file containing a message type definition that maps to our User schema was created!

package entpb

//go:generate protoc -I=.. --go_out=.. --go-grpc_out=.. --go_opt=paths=source_relative --go-grpc_opt=paths=source_relative --entgrpc_out=.. --entgrpc_opt=paths=source_relative,schema_path=../../schema entpb/entpb.proto

A new generate.go file was created with an invocation of protoc, the protobuf code generator, instructing it how to generate Go code from our .proto file. For this command to work, we must first install protoc as well as 3 protobuf plugins: protoc-gen-go (which generates Go Protobuf structs), protoc-gen-go-grpc (which generates Go gRPC service interfaces and clients), and protoc-gen-entgrpc (which generates an implementation of the service interface). If you do not have these installed, please follow the installation instructions for each of these tools.

After installing these dependencies, we can re-run code-generation:

go generate ./...

Observe that a new file named ent/proto/entpb/entpb.pb.go was created which contains the generated Go structs for our entities.

Let's write a test that uses it to make sure everything is wired correctly. Create a new file named pb_test.go and write:

package main

import (
	"testing"

	"ent-grpc-example/ent/proto/entpb"
)

func TestUserProto(t *testing.T) {
	user := entpb.User{
		Name:         "rotemtam",
		EmailAddress: "rotemtam@example.com",
	}
	if user.GetName() != "rotemtam" {
		t.Fatal("expected user name to be rotemtam")
	}
	if user.GetEmailAddress() != "rotemtam@example.com" {
		t.Fatal("expected email address to be rotemtam@example.com")
	}
}

To run it:

go get -u ./... # install deps of the generated package
go test ./...

Hooray! The test passes. We have successfully generated working Go Protobuf structs from our Ent schema. Next, let's see how to automatically generate a working CRUD gRPC server from our schema.

Generating a Fully Working gRPC Server from our Schema#

Having Protobuf structs generated from our ent.Schema can be useful, but what we're really interested in is getting an actual server that can create, read, update, and delete entities from an actual database. To do that, we need to update just one line of code! When we annotate a schema with entproto.Service, we tell the entproto code-gen that we are interested in generating a gRPC service definition, which protoc-gen-entgrpc will read in order to generate a service implementation. Edit ent/schema/user.go and modify the schema's Annotations:

func (User) Annotations() []schema.Annotation {
	return []schema.Annotation{
		entproto.Message(),
+		entproto.Service(), // <-- add this
	}
}

Now re-run code-generation:

go generate ./...

Observe some interesting changes in ent/proto/entpb:

ent/proto/entpb
├── entpb.pb.go
├── entpb.proto
├── entpb_grpc.pb.go
├── entpb_user_service.go
└── generate.go

First, entproto added a service definition to entpb.proto:

service UserService {
  rpc Create ( CreateUserRequest ) returns ( User );
  rpc Get ( GetUserRequest ) returns ( User );
  rpc Update ( UpdateUserRequest ) returns ( User );
  rpc Delete ( DeleteUserRequest ) returns ( google.protobuf.Empty );
}

In addition, two new files were created. The first, entpb_grpc.pb.go, contains the gRPC client stub and the interface definition. If you open the file, you will find in it (among many other things):

// UserServiceClient is the client API for UserService service.
//
// For semantics around ctx use and closing/ending streaming RPCs, please refer to https://pkg.go.dev/google.golang.org/grpc/?tab=doc#ClientConn.NewStream.
type UserServiceClient interface {
	Create(ctx context.Context, in *CreateUserRequest, opts ...grpc.CallOption) (*User, error)
	Get(ctx context.Context, in *GetUserRequest, opts ...grpc.CallOption) (*User, error)
	Update(ctx context.Context, in *UpdateUserRequest, opts ...grpc.CallOption) (*User, error)
	Delete(ctx context.Context, in *DeleteUserRequest, opts ...grpc.CallOption) (*emptypb.Empty, error)
}

The second file, entpb_user_service.go, contains a generated implementation for this interface. For example, here is the implementation of the Get method:

// Get implements UserServiceServer.Get
func (svc *UserService) Get(ctx context.Context, req *GetUserRequest) (*User, error) {
	get, err := svc.client.User.Get(ctx, int(req.GetId()))
	switch {
	case err == nil:
		return toProtoUser(get), nil
	case ent.IsNotFound(err):
		return nil, status.Errorf(codes.NotFound, "not found: %s", err)
	default:
		return nil, status.Errorf(codes.Internal, "internal error: %s", err)
	}
}

Not bad! Next, let's create a gRPC server that can serve requests to our service.

Creating the Server#

Create a new file cmd/server/main.go and write:

package main

import (
	"context"
	"log"
	"net"

	_ "github.com/mattn/go-sqlite3"

	"ent-grpc-example/ent"
	"ent-grpc-example/ent/proto/entpb"
	"google.golang.org/grpc"
)

func main() {
	// Initialize an ent client.
	client, err := ent.Open("sqlite3", "file:ent?mode=memory&cache=shared&_fk=1")
	if err != nil {
		log.Fatalf("failed opening connection to sqlite: %v", err)
	}
	defer client.Close()
	// Run the migration tool (creating tables, etc).
	if err := client.Schema.Create(context.Background()); err != nil {
		log.Fatalf("failed creating schema resources: %v", err)
	}
	// Initialize the generated User service.
	svc := entpb.NewUserService(client)
	// Create a new gRPC server (you can wire multiple services to a single server).
	server := grpc.NewServer()
	// Register the User service with the server.
	entpb.RegisterUserServiceServer(server, svc)
	// Open port 5000 for listening to traffic.
	lis, err := net.Listen("tcp", ":5000")
	if err != nil {
		log.Fatalf("failed listening: %s", err)
	}
	// Listen for traffic indefinitely.
	if err := server.Serve(lis); err != nil {
		log.Fatalf("server ended: %s", err)
	}
}

Notice that we added an import of github.com/mattn/go-sqlite3, so we need to add it to our module:

go get -u github.com/mattn/go-sqlite3

Next, let's run the server, while we write a client that will communicate with it:

go run -mod=mod ./cmd/server

Creating the Client#

Let's create a simple client that will make some calls to our server. Create a new file named cmd/client/main.go and write:

package main

import (
	"context"
	"fmt"
	"log"
	"math/rand"
	"time"

	"ent-grpc-example/ent/proto/entpb"
	"google.golang.org/grpc"
	"google.golang.org/grpc/status"
)

func main() {
	rand.Seed(time.Now().UnixNano())
	// Open a connection to the server.
	conn, err := grpc.Dial(":5000", grpc.WithInsecure())
	if err != nil {
		log.Fatalf("failed connecting to server: %s", err)
	}
	defer conn.Close()
	// Create a User service Client on the connection.
	client := entpb.NewUserServiceClient(conn)
	// Ask the server to create a random User.
	ctx := context.Background()
	user := randomUser()
	created, err := client.Create(ctx, &entpb.CreateUserRequest{
		User: user,
	})
	if err != nil {
		se, _ := status.FromError(err)
		log.Fatalf("failed creating user: status=%s message=%s", se.Code(), se.Message())
	}
	log.Printf("user created with id: %d", created.Id)
	// On a separate RPC invocation, retrieve the user we saved previously.
	get, err := client.Get(ctx, &entpb.GetUserRequest{
		Id: created.Id,
	})
	if err != nil {
		se, _ := status.FromError(err)
		log.Fatalf("failed retrieving user: status=%s message=%s", se.Code(), se.Message())
	}
	log.Printf("retrieved user with id=%d: %v", get.Id, get)
}

func randomUser() *entpb.User {
	return &entpb.User{
		Name:         fmt.Sprintf("user_%d", rand.Int()),
		EmailAddress: fmt.Sprintf("user_%d@example.com", rand.Int()),
	}
}

Our client creates a connection to port 5000, where our server is listening, then issues a Create request to create a new user, and then issues a second Get request to retrieve it from the database. Let's run our client code:

go run ./cmd/client

Observe the output:

2021/03/18 10:42:58 user created with id: 1
2021/03/18 10:42:58 retrieved user with id=1: id:1 name:"user_730811260095307266" email_address:"user_7338662242574055998@example.com"

Amazing! With a few annotations on our schema, we used the super-powers of code generation to create a working gRPC server in no time!

Caveats and Limitations#

entproto is still at an experimental stage and lacks some basic functionality. For example, many applications will probably want a List or Find method on their service, but these are not yet supported. In addition, here are some other issues we plan to tackle in the near future:

  • Currently only "unique" edges are supported (O2O, O2M).
  • The generated "mutating" methods (Create/Update) currently set all fields, disregarding zero/null values and field nullability.
  • All fields are copied from the gRPC request to the ent client; support for configuring some fields to be unsettable via the service, by adding a field/edge annotation, is also planned.

Next Steps#

We believe that ent + gRPC can be a great way to build server applications in Go. For example, to set granular access control to the entities managed by our application, developers can already use Privacy Policies that work out-of-the-box with the gRPC integration. To run any arbitrary Go code on the different lifecycle events of entities, developers can utilize custom Hooks.
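For instance, a schema-level hook that logs every mutation on the User entity might look like the following sketch (a generic illustration of the Hooks API, not code from the gRPC integration):

// Hooks of the User.
func (User) Hooks() []ent.Hook {
	return []ent.Hook{
		func(next ent.Mutator) ent.Mutator {
			return ent.MutateFunc(func(ctx context.Context, m ent.Mutation) (ent.Value, error) {
				// Runs before every mutation; next.Mutate performs the actual operation.
				log.Printf("mutation: op=%s type=%s", m.Op(), m.Type())
				return next.Mutate(ctx, m)
			})
		},
	}
}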

Do you want to build gRPC servers with ent? If you want some help setting up or want the integration to support your use case, please reach out to us via our Discussions Page on GitHub or in the #ent channel on the Gophers Slack.


Announcing Edge-field Support in v0.7.0

Over the past few months, there has been much discussion in the Ent project issues about adding support for the retrieval of the foreign key field when retrieving entities with One-to-One or One-to-Many edges. We are happy to announce that as of v0.7.0 ent supports this feature.

Before Edge-field Support#

Prior to merging this branch, a user that wanted to retrieve the foreign-key field for an entity needed to use eager-loading. Suppose our schema looked like this:

// ent/schema/user.go:

// User holds the schema definition for the User entity.
type User struct {
	ent.Schema
}

// Fields of the User.
func (User) Fields() []ent.Field {
	return []ent.Field{
		field.String("name").
			Unique().
			NotEmpty(),
	}
}

// Edges of the User.
func (User) Edges() []ent.Edge {
	return []ent.Edge{
		edge.From("pets", Pet.Type).
			Ref("owner"),
	}
}

// ent/schema/pet.go

// Pet holds the schema definition for the Pet entity.
type Pet struct {
	ent.Schema
}

// Fields of the Pet.
func (Pet) Fields() []ent.Field {
	return []ent.Field{
		field.String("name").
			NotEmpty(),
	}
}

// Edges of the Pet.
func (Pet) Edges() []ent.Edge {
	return []ent.Edge{
		edge.To("owner", User.Type).
			Unique().
			Required(),
	}
}

The schema describes two related entities: User and Pet, with a One-to-Many edge between them: a user can own many pets and a pet can have one owner.

When retrieving pets from the data storage, it is common for developers to want to access the foreign-key field on the pet. However, because this field is created implicitly from the owner edge, it was not automatically accessible when retrieving an entity. To retrieve this information from the storage, a developer needed to do something like:

func Test(t *testing.T) {
	ctx := context.Background()
	c := enttest.Open(t, dialect.SQLite, "file:ent?mode=memory&cache=shared&_fk=1")
	defer c.Close()
	// Create the User
	u := c.User.Create().
		SetName("rotem").
		SaveX(ctx)
	// Create the Pet
	p := c.Pet.
		Create().
		SetOwner(u). // Associate with the user
		SetName("donut").
		SaveX(ctx)
	petWithOwnerId := c.Pet.Query().
		Where(pet.ID(p.ID)).
		WithOwner(func(query *ent.UserQuery) {
			query.Select(user.FieldID)
		}).
		OnlyX(ctx)
	fmt.Println(petWithOwnerId.Edges.Owner.ID)
	// Output: 1
}

Aside from being very verbose, retrieving the pet with its owner this way was inefficient in terms of database queries. If we execute the query with .Debug(), we can see the DB queries ent generates to satisfy this call:

SELECT DISTINCT `pets`.`id`, `pets`.`name`, `pets`.`pet_owner` FROM `pets` WHERE `pets`.`id` = ? LIMIT 2
SELECT DISTINCT `users`.`id` FROM `users` WHERE `users`.`id` IN (?)

In this example, Ent first retrieves the Pet with an ID of 1, then redundantly fetches the id field from the users table for users with an ID of 1.

With Edge-field Support#

Edge-field support greatly simplifies and improves the efficiency of this flow. With this feature, developers can define the foreign-key field as part of the schema's Fields(), and, by using the .Field(..) modifier on the edge definition, instruct Ent to expose and map the foreign column to this field. So, in our example schema, we would modify it to be:

// user.go stays the same

// pet.go

// Fields of the Pet.
func (Pet) Fields() []ent.Field {
	return []ent.Field{
		field.String("name").
			NotEmpty(),
		field.Int("owner_id"), // <-- explicitly add the field we want to contain the FK
	}
}

// Edges of the Pet.
func (Pet) Edges() []ent.Edge {
	return []ent.Edge{
		edge.To("owner", User.Type).
			Field("owner_id"). // <-- tell ent which field holds the reference to the owner
			Unique().
			Required(),
	}
}

In order to update our client code we need to re-run code generation:

go generate ./...

We can now modify our query to be much simpler:

func Test(t *testing.T) {
	ctx := context.Background()
	c := enttest.Open(t, dialect.SQLite, "file:ent?mode=memory&cache=shared&_fk=1")
	defer c.Close()
	u := c.User.Create().
		SetName("rotem").
		SaveX(ctx)
	p := c.Pet.Create().
		SetOwner(u).
		SetName("donut").
		SaveX(ctx)
	petWithOwnerId := c.Pet.GetX(ctx, p.ID) // <-- Simply retrieve the Pet
	fmt.Println(petWithOwnerId.OwnerID)
	// Output: 1
}

Running with the .Debug() modifier we can see that the DB queries make more sense now:

SELECT DISTINCT `pets`.`id`, `pets`.`name`, `pets`.`owner_id` FROM `pets` WHERE `pets`.`id` = ? LIMIT 2

Hooray 🎉!

Migrating Existing Schemas to Edge Fields#

If you are already using Ent with an existing schema, you may have O2M relations whose foreign-key columns already exist in your database. Depending on how you configured your schema, chances are that they are stored in a column with a different name than the field you are now adding. For instance, you may want to create an owner_id field, but Ent auto-created the foreign-key column as pet_owner.

To check what column name Ent is using for this field you can look in the ./ent/migrate/schema.go file:

PetsColumns = []*schema.Column{
	{Name: "id", Type: field.TypeInt, Increment: true},
	{Name: "name", Type: field.TypeString},
	{Name: "pet_owner", Type: field.TypeInt, Nullable: true}, // <-- this is our FK
}

To allow for a smooth migration, you must explicitly tell Ent to keep using the existing column name. You can do this by using the StorageKey modifier (either on the field or on the edge). For example:

// In schema/pet.go:

// Fields of the Pet.
func (Pet) Fields() []ent.Field {
	return []ent.Field{
		field.String("name").
			NotEmpty(),
		field.Int("owner_id").
			StorageKey("pet_owner"), // <-- explicitly set the column name
	}
}

In the near future we plan to implement Schema Versioning, which will store the history of schema changes alongside the code. Having this information will allow ent to support such migrations in an automatic and predictable way.

Wrapping Up#

Edge-field support is readily available and can be installed by go get -u entgo.io/ent@v0.7.0.

Many thanks 🙏 to all the good people who took the time to give feedback and helped design this feature properly: Alex Snast, Ruben de Vries, Marwan Sulaiman, Andy Day, Sebastian Fekete and Joe Harvey.


Introducing ent

The state of Go in Facebook Connectivity Tel Aviv#

20 months ago, I joined the Facebook Connectivity (FBC) team in Tel Aviv after ~5 years of programming in Go and embedding it in a few companies. I joined a team that was working on a new project, and we needed to choose a language for this mission. We compared a few languages and decided to go with Go.

Since then, Go continued to spread across other FBC projects and became a big success with around 15 Go engineers in Tel Aviv alone. New services are now written in Go.

The motivation for writing a new ORM in Go#

Most of my work in the 5 years before Facebook was on infra tooling and micro-services, without too much data-model work. A service that needed to do a small amount of work with an SQL database used one of the existing open-source solutions, but one that worked with a complicated data model was written in a different language with a robust ORM, for example Python with SQLAlchemy.

At Facebook we like to think about our data-model in graph concepts. We've had a good experience with this model internally.
The lack of a proper Graph-based ORM for Go led us to write one here with the following principles:

  • Schema As Code - defining types, relations and constraints should be in Go code (not struct tags), and should be validated using a CLI tool. We have good experience with a similar tool internally at Facebook.
  • Statically typed and explicit API using codegen - API with interface{}s everywhere affects developers efficiency; especially project newbies.
  • Queries, aggregations and graph traversals should be simple - developers don't want to deal with raw SQL queries nor SQL terms.
  • Predicates should be statically typed. No strings everywhere.
  • Full support for context.Context - This helps us to get full visibility in our traces and logs systems, and it's important for other features like cancellation.
  • Storage agnostic - we tried to keep the storage layer dynamic using codegen templates, since the development initially started on Gremlin (AWS Neptune) and switched later to MySQL.

Open-sourcing ent#

ent is an entity framework (ORM) for Go, built with the principles described above. ent makes it possible to easily define any data model or graph-structure in Go code; the schema configuration is verified by entc (the ent codegen), which generates an idiomatic and statically-typed API that keeps Go developers productive and happy. It supports MySQL, MariaDB, PostgreSQL, SQLite, and Gremlin-based graph databases.

We're open-sourcing ent today, and invite you to get started → entgo.io/docs/getting-started.