SQL Lab

SQL Lab is a modern, feature-rich SQL IDE written in React.


[Screenshot: the SQL Lab interface]

Feature Overview

  • Connects to just about any database backend

  • A multi-tab environment to work on multiple queries at a time

  • A smooth flow to visualize your query results using Superset’s rich visualization capabilities

  • Browse database metadata: tables, columns, indexes, partitions

  • Support for long-running queries

    • uses the Celery distributed queue to dispatch query handling to workers

    • supports defining a “results backend” to persist query results

  • A search engine to find queries executed in the past

  • Supports templating using the Jinja templating language, which allows for the use of macros in your SQL code

Extra features

  • Hit alt + enter as a keyboard shortcut to run your query

Templating with Jinja

SELECT *
FROM some_table
WHERE partition_key = '{{ presto.first_latest_partition('some_table') }}'

Templating unleashes the power and capabilities of a programming language within your SQL code.

Templates can also be used to write generic queries that are parameterized so they can be re-used easily.
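
For instance, a minimal sketch of a reusable query (the sales table, its columns, and the metric name are hypothetical) can bind a value once with Jinja’s {% set %} tag and reference it throughout the statement:

{% set metric = 'revenue' %}
SELECT country, SUM({{ metric }}) AS total_{{ metric }}
FROM sales
GROUP BY country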

Available macros

We expose certain modules from Python’s standard library in Superset’s Jinja context:

  • time: time

  • datetime: datetime.datetime

  • uuid: uuid

  • random: random

  • relativedelta: dateutil.relativedelta.relativedelta

Jinja’s builtin filters can also be applied where needed.
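
For instance, a sketch of a query that computes a relative date at render time (the logs table and dt column are hypothetical); datetime here refers to datetime.datetime, as listed above:

SELECT *
FROM logs
WHERE dt >= '{{ (datetime.now() - relativedelta(days=7)).strftime('%Y-%m-%d') }}'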

class superset.jinja_context.ExtraCache(extra_cache_keys: Optional[List[Any]] = None)[source]

Dummy class that exposes a method for storing additional values used in the calculation of query object cache keys.

cache_key_wrapper(key: Any) → Any[source]

Adds values to a list that is added to the query object used for calculating a cache key.

This is needed if the following applies:
  • Caching is enabled

  • The query is dynamically generated using a jinja template

  • A JINJA_CONTEXT_ADDONS or similar is used as a filter in the query

Parameters

key – Any value that should be considered when calculating the cache key

Returns

the original value key passed to the function
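
A hedged sketch of how this might look in a template, assuming a hypothetical current_team() callable exposed through JINJA_CONTEXT_ADDONS; wrapping its result makes the value part of the cache key:

SELECT *
FROM events
WHERE team = '{{ cache_key_wrapper(current_team()) }}'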

current_user_id(add_to_cache_keys: bool = True) → Optional[int][source]

Return the user ID of the user who is currently logged in.

Parameters

add_to_cache_keys – Whether the value should be included in the cache key

Returns

The user ID

current_username(add_to_cache_keys: bool = True) → Optional[str][source]

Return the username of the user who is currently logged in.

Parameters

add_to_cache_keys – Whether the value should be included in the cache key

Returns

The username
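
A minimal sketch of using these two macros for per-user filtering (the audit_log table and its columns are hypothetical):

SELECT action, dttm
FROM audit_log
WHERE user_id = {{ current_user_id() }}
  AND username = '{{ current_username() }}'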

url_param(param: str, default: Optional[str] = None, add_to_cache_keys: bool = True) → Optional[Any][source]

Read a URL or POST parameter and use it in your SQL Lab query.

When in SQL Lab, it’s possible to add arbitrary URL “query string” parameters and use those in your SQL code. For instance, you can alter your URL and add ?foo=bar, as in {domain}/superset/sqllab?foo=bar. Then if your query is something like SELECT * FROM some_table WHERE some_column = '{{ url_param('foo') }}', it will be parsed at runtime and the macro replaced by the value in the URL.

As you create a visualization from this SQL Lab query, you can pass parameters in the explore view as well as from the dashboard, and they should carry through to your queries.

Default values for URL parameters can be defined in chart metadata by adding the key-value pair url_params: {'foo': 'bar'}.

Parameters
  • param – the parameter to lookup

  • default – the value to return in the absence of the parameter

  • add_to_cache_keys – Whether the value should be included in the cache key

Returns

The value of the URL parameter, or the default if the parameter is not present
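
For instance, a sketch that falls back to a default value when the parameter is absent (the orders table, region column, and parameter name are illustrative):

SELECT *
FROM orders
WHERE region = '{{ url_param('region', 'us-east') }}'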

superset.jinja_context.filter_values(column: str, default: Optional[str] = None) → List[str][source]

Gets the values for a particular filter as a list

This is useful if:
  • you want to use a filter box to filter a query where the name of the filter box column doesn’t match the one in the SELECT statement

  • you want to have the ability to filter inside the main query for performance reasons

Usage example:

SELECT action, count(*) as times
FROM logs
WHERE action in ( {{ "'" + "','".join(filter_values('action_type')) + "'" }} )
GROUP BY action

Parameters
  • column – column/filter name to lookup

  • default – default value to return if there are no matching columns

Returns

a list of filter values

class superset.jinja_context.PrestoTemplateProcessor(database: Database, query: Optional[Query] = None, table: Optional[SqlaTable] = None, extra_cache_keys: Optional[List[Any]] = None, **kwargs: Any)[source]

Presto Jinja context

The methods described here are namespaced under presto in the Jinja context, as in SELECT '{{ presto.some_macro_call() }}'

first_latest_partition(table_name: str) → Optional[str][source]

Gets the first value in the array of all latest partitions

Parameters

table_name – table name in the format schema.table

Returns

the first (or only) value in the latest partition array

Raises

IndexError – If no partition exists

latest_partition(table_name: str) → Optional[str]

Gets the first value in the array of all latest partitions

Parameters

table_name – table name in the format schema.table

Returns

the first (or only) value in the latest partition array

Raises

IndexError – If no partition exists
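
Usage mirrors the example at the top of this page, for instance (placeholder names):

SELECT *
FROM some_schema.some_table
WHERE ds = '{{ presto.latest_partition('some_schema.some_table') }}'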

latest_partitions(table_name: str) → Optional[List[str]][source]

Gets the array of all latest partitions

Parameters

table_name – table name in the format schema.table

Returns

the latest partition array
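
Because this macro returns an array, a sketch along the lines of the filter_values example above can expand it into an IN clause (placeholder names):

SELECT *
FROM some_schema.some_table
WHERE ds IN ( {{ "'" + "','".join(presto.latest_partitions('some_schema.some_table')) + "'" }} )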

class superset.jinja_context.HiveTemplateProcessor(database: Database, query: Optional[Query] = None, table: Optional[SqlaTable] = None, extra_cache_keys: Optional[List[Any]] = None, **kwargs: Any)[source]

Extending macros

As mentioned in the Installation & Configuration documentation, it’s possible for administrators to expose more macros in their environment using the configuration variable JINJA_CONTEXT_ADDONS. All objects referenced in this dictionary become available for users to use in their queries in SQL Lab.
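
For instance, a minimal superset_config.py sketch (the macro name and logic are hypothetical):

JINJA_CONTEXT_ADDONS = {
    "my_macro": lambda x: x * 2,
}

With this in place, SELECT '{{ my_macro(21) }}' in SQL Lab would render as SELECT '42'.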

Customize templating

As mentioned in the Installation & Configuration documentation, it’s possible for administrators to override the default Jinja templating with customized template processors using the configuration variable CUSTOM_TEMPLATE_PROCESSORS. The template processors referenced in this dictionary override the default Jinja template processors for the specified database engines.
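
A sketch of what such a configuration might look like in superset_config.py, assuming a hypothetical subclass of the PrestoTemplateProcessor documented above:

from superset.jinja_context import PrestoTemplateProcessor

class CustomPrestoTemplateProcessor(PrestoTemplateProcessor):
    """Hypothetical custom Presto template processor."""

    engine = "presto"

    def process_template(self, sql: str, **kwargs) -> str:
        # Apply the default Jinja rendering first, then add any custom rewriting here.
        return super().process_template(sql, **kwargs)

CUSTOM_TEMPLATE_PROCESSORS = {
    CustomPrestoTemplateProcessor.engine: CustomPrestoTemplateProcessor,
}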

Query cost estimation

Some databases support EXPLAIN queries that allow users to estimate the cost of queries before executing them. Currently, Presto is supported in SQL Lab. To enable query cost estimation, add the following keys to the “Extra” field in the database configuration:

{
    "version": "0.319",
    "cost_estimate_enabled": true
    ...
}

Here, “version” should be the version of your Presto cluster. Support for this functionality was introduced in Presto 0.319.

You also need to enable the feature flag in your superset_config.py, and you can optionally specify a custom formatter. For example:

from typing import Dict, List


def presto_query_cost_formatter(cost_estimate: List[Dict[str, float]]) -> List[Dict[str, str]]:
    """
    Format cost estimate returned by Presto.

    :param cost_estimate: JSON estimate from Presto
    :return: Human readable cost estimate
    """
    # Convert cost to dollars based on CPU and network cost. These coefficients are just
    # examples, they need to be estimated based on your infrastructure.
    cpu_coefficient = 2e-12
    network_coefficient = 1e-12

    cost = 0
    for row in cost_estimate:
        cost += row.get("cpuCost", 0) * cpu_coefficient
        cost += row.get("networkCost", 0) * network_coefficient

    return [{"Cost": f"US$ {cost:.2f}"}]


DEFAULT_FEATURE_FLAGS = {
    "ESTIMATE_QUERY_COST": True,
    "QUERY_COST_FORMATTERS_BY_ENGINE": {"presto": presto_query_cost_formatter},
}

Create Table As (CTAS)

You can use CREATE TABLE AS SELECT ... statements in SQL Lab. This feature can be toggled on and off at the database configuration level.

Note that CREATE TABLE .. statements belong to the SQL DDL category. On PostgreSQL specifically, DDL is transactional, which means that to properly use this feature you have to set autocommit to true in your engine parameters:

{
    ...
    "engine_params": {"isolation_level":"AUTOCOMMIT"},
    ...
}
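
With autocommit enabled, a statement like the following (all names are placeholders) can then be run directly from SQL Lab:

CREATE TABLE my_schema.my_summary_table AS
SELECT country, COUNT(*) AS total
FROM some_table
GROUP BY country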