Skip to content

Tenancy

Sometimes you might want to query a different database or schema because might be convenient to get data from two different sources.

Querying two different databases or schemas its not so uncommon as you might think and to this we can sometimes refer to multi-tenancy or simply two different data sources.

What is multi-tenancy

When implementing multi tenancy architecture there are many factors to consider and there are three common ways:

  1. Shared schemas - The data of all users are shared within the same schema and filtered by common IDs or whatever that is unique to the platform. This is not so great for GDPR (Europe) or similar in different countries.
  2. Shared database, different Schemas - The user's data is split by different schemas but live on the same database.
  3. Different databases - The user's data or any data live on different databases.

Edgy has three different ways of achieving this in a simple and clean fashion.

  1. Using the using in the queryset.
  2. Using the using_with_db in the queryset.
  3. Using the set_tenant as global.

You can also use the Edgy helpers for schemas if you need to use it.

The schema creation

Edgy by default provides a functionality that allows the creation of schemas in an easy way by calling create_schema in the tenancy module.

from edgy.core.tenancy.utils import create_schema

This function is very powerful and yet simple to use and allows the creation of schemas based on a given registry.

This means, if you have different registries pointing to different databases, you only need to provide the registry instance and some extra parameters to generate a brand new schema.

Parameters

  • registry - An instance of a Registry.
  • schema_name - The name for the new schema.
  • models - Optional dictionary containing the name of the Edgy Model as key and the model class as value. The reason for being optional its because if not provided, it will generate the tables from the models of the registry object. This allows the flexibility of you manipulating what tables should be passed into the new schema.
  • if_not_exists - If True, the schema will be created only if it does not already exist. Defaults to False.
  • should_create_tables - If True, tables will be created within the new schema. Defaults to False.

Example

import edgy
from edgy.core.tenancy.utils import create_schema

database = edgy.Database("sqlite:///db.sqlite")
registry = edgy.Registry(database=database)

# Create the schema
await create_schema(
    registry=registry,
    schema_name="edgy",
    if_not_exists=True,
    should_create_tables=True
)

Using

This is probably the one that is more commonly used and probably the one you will be using more often when querying different schemas or databases.

The using is simply an instruction telling to use this schema to query the data instead of the default set in the registry.

Parameters:

  • schema - A string parameter with the name of the schema to query.

The syntax is quite simple.

<Model>.query.using(schema=<SCHEMA-NAME>).all()

This is not limited to the all() at all, you can use any of the available query types as well.

Example

Let us assume we have two different schemas inside the same database and those schemas have a table called User.

  • The schema default - The one that is automatically used if no schema is specified in the registry.
  • The schema other - The one we also want to query.
import edgy
from edgy import Database, Registry

database = Database("<YOUR-CONNECTION-STRING>")
models = Registry(database=database)


class User(edgy.Model):
    id: int = edgy.IntegerField(primary_key=True)
    is_active: bool = edgy.BooleanField(default=False)

    class Meta:
        registry = models

Now we want to query the users from each schema.

Querying the default

As per normal approach, the query looks like this.

User.query.all()

Querying the main

Query the users table from the main schema.

User.query.using(schema='main').all()

And that is it, really. Using its a simple shortcut that allows querying different schemas without a lot of boilerplate.

Using with database

Now here it is where the things get interesting. What if you need/want to query a schema but from a different database instead? Well, that is possible with the use of the using_with_db.

This is the part that makes a whole difference if you are thinking about querying a specific database using a diffent connection.

What does that even mean? Imagine you have a main database public (default) and a database copy somewhere else called alternative (or whatever name you choose) and both have the model User.

You now want to query the alternative to gather some user data that was specifically stored in that database where the connection string is different.

The way Edgy operates is by checking if that alternative connection exists in the extra parameter of the registry and then uses that connection to connect and query to the desired database.

Warning

To use the alternative database, the connection must be declared in the registry of the model or else it will raise an AssertationError.

The way of doing that is by using the using_with_db of the queryset. This is particularly useful if you want to do some tenant applications or simply connecting to a different database to gather your data.

Simple right?

Nothing like a good example to simplify those possible confusing thoughts.

Let us assume we want to bulk_create some users in the alternative database instead of the default.

import edgy
from edgy.core.db import fields
from edgy.testclient import DatabaseTestClient as Database

database = Database("<YOUR-CONNECTION-STRING>")
alternative = Database("<YOUR-ALTERNATIVE-CONNECTION-STRING>")
models = edgy.Registry(database=database, extra={"alternative": alternative})


class User(edgy.Model):
    id: int = fields.IntegerField(primary_key=True)
    name: str = fields.CharField(max_length=255)
    email: str = fields.CharField(max_length=255)

    class Meta:
        registry = models

As you can see, the alternative was declared in the extra parameter of the registry of the model as required.

Now we can simply use that connection and create the data in the alternative database.

import edgy
from edgy.core.db import fields
from edgy.testclient import DatabaseTestClient as Database

database = Database("<YOUR-CONNECTION-STRING>")
alternative = Database("<YOUR-ALTERNATIVE-CONNECTION-STRING>")
models = edgy.Registry(database=database, extra={"alternative": alternative})


class User(edgy.Model):
    id: int = fields.IntegerField(primary_key=True)
    name: str = fields.CharField(max_length=255)
    email: str = fields.CharField(max_length=255)

    class Meta:
        registry = models


async def bulk_create_users() -> None:
    """
    Bulk creates some users.
    """
    await User.query.using(database="alternative").bulk_create(
        [
            {"name": "Edgy", "email": "edgy@example.com"},
            {"name": "Edgy Alternative", "email": "edgy.alternative@example.com"},
        ]
    )

Did you notice the alternative name in the using_with_db? Well, that should match the name given in the extra declaration of the registry.

You can have as many connections declared in the extra as you want, there are no limits.

Using with activate_schema

Warning

This feature is experimental and might be inconsistent with the intended results. Use it at your own discretion.

This is an alternative to using and serves solely as the purpose of avoiding writing all the time Model.query.using(...).

You can use activate_schema(...) and deactivate_schema() to tell the query to always query a specific tenant, in other words, using the activate_schema() you don't need to constantly write using(...).

Importing is as simple as this:

from edgy.core.db.querysets.mixins import activate_schema, deactivate_schema

Let us see an example:

With the classic .using()

# Using the 'main' schema

User.query.using(schema='main').all()
User.query.using(schema='main').filter(email__icontains="user@example.com")
User.query.using(schema='main').get(pk=1)

Using the activate_schema

# Using the 'main' schema
activate_schema("main")

# Query the 'User' from the 'main' schema
User.query.all()
User.query.filter(email__icontains="user@example.com")
User.query.get(pk=1)

# Deactivate the schema and default to the public
deactivate_schema("main")

Set tenant

This is another way to create a global tenant for your application. Instead if using or using_with_db you simply want to make sure that in your application you want every request for a specific tenant to always hit their corresponding tenant data.

This is specially useful for multi-tenant applications where your tenant users will only see their own data.

To use the set_tenant you can import it via:

from edgy.core.db import set_tenant

Tip

Use the set_tenant in things like application middlewares or interceptors, right before reaching the API.

Practical case

The set_tenant can be somehow confusing without a proper example so let us run one 😁.

As usual, for this example Esmerald will be used. This can be applied to any framework of your choice of course.

What are we building:

  • Models - Some models that will help us out mapping a user with a tenant.
  • Middleware - Intercept the request and set the corresponding tenant.
  • API - The API that returns the data for a given tenant.
Models

Let us start with some models where we have a Tenant, a User model as well as a Product where we will be adding some data for different tenants.

The TenantUser model will serve as the link between a database schema (tenant) and the User.

We will want to exclude some models from being created in every schema. The Tenant on save it will generate the schema for a user in the database and it will automatically generate the database models.

Warning

This is for explanation purposes, just do in the way you see fit.

from typing import Any, Dict, List, Type, Union

import sqlalchemy
from loguru import logger

import edgy
from edgy.core.db.models.model import Model
from edgy.core.db.models.utils import get_model
from edgy.testclient import DatabaseTestClient as Database

database = Database("<YOUR-CONNECTION-STRING>")
registry = edgy.Registry(database=database)


class Tenant(edgy.Model):
    schema_name: str = edgy.CharField(max_length=63, unique=True, index=True)
    tenant_name: str = edgy.CharField(max_length=100, unique=True, null=False)

    class Meta:
        registry = registry

    def table_schema(
        self,
        model_class: Type["Model"],
        schema: str,
    ) -> sqlalchemy.Table:
        return model_class.table_schema(schema)

    async def create_tables(
        self,
        registry: "edgy.Registry",
        tenant_models: Dict[str, Type["Model"]],
        schema: str,
        exclude: Union[List[str], None],
    ) -> None:
        for name, model in tenant_models.items():
            if name in exclude:
                continue

            table = self.table_schema(model, schema)

            logger.info(f"Creating table '{model.meta.tablename}' for schema: '{schema}'")
            try:
                async with registry.engine.begin() as connection:
                    await connection.run_sync(table.create)
                await registry.engine.dispose()
            except Exception as e:
                logger.error(str(e))

    async def create_schema(self) -> None:
        await registry.schema.create_schema(self.schema_name)

    async def save(
        self,
        force_insert: bool = False,
        values: Union[dict[str, Any], set[str], None] = None,
    ) -> Type[Model]:
        tenant = await super().save(force_insert, values)
        try:
            await self.create_schema(schema=tenant.schema_name, if_not_exists=True)
            await self.create_tables(
                registry, registry.models, tenant.schema_name, exclude=["Tenant", "TenantUser"]
            )
        except Exception as e:
            message = f"Rolling back... {str(e)}"
            logger.error(message)
            await self.delete()
        return tenant


class User(edgy.Model):
    id: int = edgy.IntegerField(primary_key=True)
    name: str = edgy.CharField(max_length=255)

    class Meta:
        registry = registry


class Product(edgy.Model):
    id: int = edgy.IntegerField(primary_key=True)
    name: str = edgy.CharField(max_length=255)
    user: User = edgy.ForeignKey(User, null=True)

    class Meta:
        registry = registry

This is a lot to unwrap is it? Well, that was explained before at the top and this is just the declaration of the models for some general purposes.

Generate example data

Now it is time to generate some example data and populate the tables previously created.

async def create_data():
    """
    Creates mock data.
    """
    # Create some users in the main users table
    esmerald = await User.query.create(name="esmerald")

    # Create a tenant for Edgy (only)
    tenant = await Tenant.query.create(
        schema_name="edgy",
        tenant_name="edgy",
    )

    # Create a user in the `User` table inside the `edgy` tenant.
    edgy = await User.query.using(schema=tenant.schema_name).create(
        name="Edgy schema user",
    )

    # Products for Edgy (inside edgy schema)
    for i in range(10):
        await Product.query.using(schema=tenant.schema_name).create(
            name=f"Product-{i}",
            user=edgy,
        )

    # Products for Esmerald (no schema associated, defaulting to the public schema or "shared")
    for i in range(25):
        await Product.query.create(name=f"Product-{i}", user=esmerald)

We now have models and mock data for those. You will realise that we created a user inside the shared database (no schema associated) and one specifically inside the newly edgy schema.

Middleware

It is time to create a middleware that will take advantage of our new models and tenants and set the tenant automatically.

The middleware will receive some headers with the tenant information and it will lookup if the tenant exist.

Danger

Do not use this example in production, the way it is done it is not safe. A real lookup example would need more validations besides a direct headers check.

from typing import Any, Coroutine

from esmerald import Request
from esmerald.protocols.middleware import MiddlewareProtocol
from lilya.types import ASGIApp, Receive, Scope, Send

from edgy.core.db import set_tenant
from edgy.exceptions import ObjectNotFound


class TenantMiddleware(MiddlewareProtocol):
    def __init__(self, app: "ASGIApp"):
        super().__init__(app)
        self.app = app

    async def __call__(
        self, scope: Scope, receive: Receive, send: Send
    ) -> Coroutine[Any, Any, None]:
        """
        Receives a header with the tenant information and lookup in
        the database if exists.

        Sets the tenant if true, or none otherwise.
        """
        request = Request(scope=scope, receive=receive, send=send)
        tenant_header = request.headers.get("tenant", None)

        try:
            user = await Tenant.query.get(schema_name=tenant_header)
            tenant = user.schema_name
        except ObjectNotFound:
            tenant = None

        set_tenant(tenant)
        await self.app(scope, receive, send)

Now this is getting somewhere! As you could now see, this is where we take advantage of the set_tenant.

In the middleware, the tenant is intercepted and all the calls in the API will now query only the tenant data, which means that there is no need for using or using_with_db anymore.

API

Now it is time to simply create the API that will read the created products from the database and assemble everything.

This will create an Esmerald application, assemble the routes and add the middleware created in the previous step.

from typing import List

from esmerald import Esmerald, Gateway, get

import edgy

database = edgy.Database("<TOUR-CONNECTION-STRING>")
models = edgy.Registry(database=database)


@get("/products")
async def products() -> List[Product]:
    """
    Returns the products associated to a tenant or
    all the "shared" products if tenant is None.
    """
    products = await Product.query.all()
    return products


app = models.asgi(
    Esmerald(
        routes=[Gateway(handler=products)],
        middleware=[TenantMiddleware],
    )
)
Query the API

If you query the API, you should have similar results to this:

import httpx


async def query():
    response = await httpx.get("/products", headers={"tenant": "edgy"})

    # Total products created for `edgy` schema
    assert len(response.json()) == 10

    # Response for the "shared", no tenant associated.
    response = await httpx.get("/products")
    assert len(response.json()) == 25

The data generated for each schema (shared and edgy) should match the response total returned.

Tenant only models

By default a table will be generated for the non-tenancy db schema. If you don't want this, you can set the flag in the meta of the tenant model register_default to False.

Notes

As you could see in the previous step-by=step example, using the set_tenant can be extremely useful mostrly for those large scale applications where multi-tenancy is a must so you can actually take advantage of this.