Skip to main content

Transformer

Overview

Transformer is a dbt project. Each model is a simple SQL query with minor syntactic sugar (meant to capture dependencies and help build the resulting tables), and does a small part of the task of turning raw and decoded records into interpretable blockchain data.

Source code: https://github.com/untangledfinance/dbt-warehouse-transform

Technical: DBT (please read carefully document of DBT to understand how are the models created?)

Language: SQL

Features: Transform source data (include raw data, decoded data) to transform table to serve business by using SQL language

Execution

  • Run on docker as docker container in local
  • Run by scheduler of Airflow

How to use dbt to create transformer

There are a couple of new concepts to consider when making spells in dbt. The most common ones wizards will encounter are refs, sources, freshness, and tests.

In the body of each query, tables are referred to either as refs, e.g., {{ ref('1inch_ethereum') }}, or sources, e.g., {{ source('ethereum', 'traces') }}. Refs refer to other dbt models and should refer to the file name, like 1inch_ethereum.sql, even if the model itself is aliased. Sources refer to "raw" data or tables/views not generated by dbt. Using refs and sources allows us to automatically build dependency trees.

Sources and models are defined in schema.yml files where tests and other attributes are also defined.

The best practice is to add tests unique and non_null tests to the primary key for every new model. Similarly, a freshness check should be added to every new source (although we will try not to re-test freshness if the source is used elsewhere).

Adding descriptions to tables and columns will help people find and use your tables.

models:
- name: 1inch_ethereum
description: "Trades on 1inch, a DEX aggregator"
columns:
- name: tx_hash
description: "Table primary key: a transaction hash (tx_hash) is a unique identifier for a transaction."
tests:
- unique
- not_null

sources:
- name: ethereum
freshness:
warn_after: { count: 12, period: hour }
error_after: { count: 24, period: hour }
tables:
- name: traces
loaded_at_field: block_time

Generating and serving documentation:

To generate documentation and view it as a website, run the following commands:

dbt docs generate dbt docs serve

As a preview, you can do things like:

  • Write simple one or many line descriptions of models or columns.
  • Write longer descriptions as code blocks using markdown.
  • Link to other models in your descriptions.
  • Add images / project logos from the repo into descriptions.
  • Use HTML in your description.

Example for simple transformer

Step 1: Build sql model that including config and query

Step 2: Check source base of this sql model

Step 3: Ensure that tables in Node have been decoded and are available in the data warehouse

  • If they are not exist, we need decode first

Step 4: Ensure that tables in Macros are located in the correct directory, as specified for Dune.

Step 5: Run model

  • dbt run --select [model]