Metabase API
The Metabase API can preform any task that can be performed in the GUI, but the documentation is not that great and they don’t provide much in the way of examples. After providing some very brief examples in Python, R, and Node.js they tell you that the best way to learn about the API is to open Chrome developer tools and look at the request and response fields from the Network tab.
The first step is to get the session id from Metabase using the
username and password shared with you in 1Password. Replace the
xxx
and yyy
with the username and password.
I have this section commented out since I already have a session
id. Metabase recommends
storing the session id and using it for the full 14 day period it is
valid rather than getting a new one each time to prevent rate limiting
from too many logins.
import requests
import pandas as pd
# The next step is to get a session token to authenticate all future requests.
# (The example uses my credentials—you can replace them with your username and password,
# but be sure that you don’t commit those values to a version control repository.)
# To get the token out of the result, we convert the latter to JSON and look up the ID.
# We store this in a dictionary with the right key for future use:
= 'https://test.metabase.data.paritytech.io/'
baseurl # response = requests.post(f'{baseurl}api/session',
# json={'username': xxx,
# 'password': yyy})
= '3ae7b1e2-fc51-44a9-8b4c-348e4e3e292c' #response.json()['id']
session_id = {'X-Metabase-Session': session_id} headers
Looking at a question using the API
Jumping right in, let’s look at a question that already exists before we try to create a new one. All interactions with the API are made using post, get, and put commands. In the api questions are called cards, so here is how we look at question #2:
id = 2
= requests.get(f'{baseurl}api/card/{id}', headers=headers).json() response
Really long JSON strings are hard for me to read, but the API docs do provide the parameters you need to create a question, so lets use those parameters to read the response from question #2:
= ['visualization_settings','parameters','description','collection_position','result_metadata','collection_id','name','cache_ttl','dataset_query','parameter_mappings','display']
params for p in params:
print("")
print(f"__{p}__: {response[p]}")
##
## __visualization_settings__: {'graph.show_values': True, 'graph.y_axis.title_text': 'Transfer Total Amount KSM', 'graph.x_axis.axis_enabled': False, 'graph.label_value_formatting': 'compact', 'graph.dimensions': ['month', 'chain'], 'column_settings': {'["name","sum"]': {'number_style': 'currency'}}, 'graph.metrics': ['sum']}
##
## __parameters__: []
##
## __description__: API test
##
## __collection_position__: None
##
## __result_metadata__: [{'description': None, 'semantic_type': None, 'coercion_strategy': None, 'unit': 'day', 'name': 'month', 'settings': None, 'field_ref': ['field', 148, {'temporal-unit': 'day'}], 'effective_type': 'type/Date', 'id': 148, 'visibility_type': 'normal', 'display_name': 'Month', 'fingerprint': {'global': {'distinct-count': 17, 'nil%': 0.0}, 'type': {'type/DateTime': {'earliest': '2021-12-31T00:00:00Z', 'latest': '2023-04-30T00:00:00Z'}}}, 'base_type': 'type/Date'}, {'description': None, 'semantic_type': None, 'coercion_strategy': None, 'name': 'chain', 'settings': None, 'field_ref': ['field', 144, None], 'effective_type': 'type/Text', 'id': 144, 'visibility_type': 'normal', 'display_name': 'Chain', 'fingerprint': {'global': {'distinct-count': 1, 'nil%': 0.0}, 'type': {'type/Text': {'percent-json': 0.0, 'percent-url': 0.0, 'percent-email': 0.0, 'percent-state': 0.0, 'average-length': 5.0}}}, 'base_type': 'type/Text'}, {'display_name': 'Sum of Transfer Total Amount Usd', 'semantic_type': None, 'settings': None, 'field_ref': ['aggregation', 0], 'name': 'sum', 'base_type': 'type/Float', 'effective_type': 'type/Float', 'fingerprint': {'global': {'distinct-count': 17, 'nil%': 0.0}, 'type': {'type/Number': {'min': 397379.29242924, 'q1': 147617725.47246593, 'q3': 850366889.0496614, 'max': 3788999852.92558, 'sd': 1045910391.165927, 'avg': 736720053.6049192}}}}]
##
## __collection_id__: None
##
## __name__: API question
##
## __cache_ttl__: None
##
## __dataset_query__: {'database': 2, 'query': {'source-table': 18, 'aggregation': [['sum', ['field', 145, None]]], 'breakout': [['field', 148, {'temporal-unit': 'day'}], ['field', 144, None]], 'filter': ['=', ['field', 144, None], 'astar']}, 'type': 'query'}
##
## __parameter_mappings__: []
##
## __display__: bar
Here we see that the JSON keys we are most interested in are:
- visualization_settings
- result_metadata
- dataset_query
All the other keys are simple text blocks like name and description, or can be left blank.
Create a new question
Lets use the same data from question #2 to make a new question with a different name and description:
= {'visualization_settings': response['visualization_settings'],
data 'parameters': None,
'description': 'API test to archive',
'collection_position': None,
'result_metadata': response['result_metadata'],
'collection_id': None,
'name': 'API question to archive',
'cache_ttl': None,
'dataset_query': response['dataset_query'],
'parameter_mappings': None,
'display': 'bar'}
= requests.post(f'{baseurl}api/card', json = data, headers=headers)
test test
## <Response [200]>
The response code of 200 means that the new question was created and you should see it when you refresh Metabase (you are playing along, right?).
Modify a question
Next I wanted to learn how to modify a question and since I was going
to be creating a bunch of questions for testing purposes, I also wanted
to know how to delete them. Turns out these two tasks are accomplished
very similarly. In the example below we use most of the settings for
visualization_settings from question #2 above and only
change what we need to change (in this case the
graph.y_axis.title_text which I change to
KSM instead of the correct value of
USD. In the data argument to request, I only pass the
changed information. I also pass the id
. I am not sure if
this is necessary since the id
is also referenced in the
URL. Here is the example:
id = 2
= {'graph.show_values': True, 'graph.y_axis.title_text': 'Transfer Total Amount KSM', 'graph.x_axis.axis_enabled': False, 'graph.label_value_formatting': 'compact', 'graph.dimensions': ['month', 'chain'], 'column_settings': {'["name","sum"]': {'number_style': 'currency'}}, 'graph.metrics': ['sum']}
visualization_settings = {'visualization_settings': visualization_settings,
data 'id': id}
= requests.put(f'{baseurl}api/card/{id}', json = data, headers=headers)
test test
## <Response [200]>
Delete a question
The API has an endpoint to delete
a card, but it is deprecated. The docs say to archive the card
instead, but it doesn’t give any example. I looked around and couldn’t
find an example. As a lark I tried updating archived
field
to true and it worked. Here is the example:
id = 8
= {'archived': True,
data 'id': id}
= requests.put(f'{baseurl}api/card/{id}', json = data, headers=headers)
test test
## <Response [200]>
The way forward
Looking at the JSON returned from the first card, we see that the
database, tables, fields, and questions are all referenced using
id
numbers, so the main task is to find all the appropriate
id
numbers we need to build new questions. To that end, I
have written some helper functions that return the id
numbers for each target. Here are the functions:
def metabase_questions(id=None):
if id == None:
= requests.get(f'{baseurl}api/card', headers=headers).json()
response = pd.DataFrame(response)
df print(df[['id', 'name']])
return df
else:
print(requests.get(f'{baseurl}api/card/{id}', headers=headers).json())
def metabase_databases(id=None):
if id == None:
= requests.get(f'{baseurl}api/database/', headers=headers).json()
response = pd.DataFrame(response['data'])
df print(df[['id', 'name']])
return df
else:
print(requests.get(f'{baseurl}api/database/{id}', headers=headers).json())
def metabase_fields(id):
= requests.get(f'{baseurl}api/database/{id}/fields', headers=headers).json()
response = pd.DataFrame(response)
df print(df[['id', 'table_name', 'name', 'display_name']])
return df
def metabase_tables():
= requests.get(f'{baseurl}api/table', headers=headers).json()
response = pd.DataFrame(response)
df print(df[['id', 'name', 'display_name']])
return df
def metabase_dashboards(id=None):
if id == None:
= requests.get(f'{baseurl}api/dashboard', headers=headers).json()
response = pd.DataFrame(response)
df print(df[['id', 'name']])
return df
else:
print(requests.get(f'{baseurl}api/dashboard/{id}', headers=headers).json())
Some of the functions accept an id
as a parameter and
some do not. If you call any of these helper functions without the
id
parameter you will get a DataFrame with all the
id
s listed. If you supply the id
number the
function will print out the JSON object. Some examples follow.
Dashboards
This function will list all the dashboards. Currently there is only one so it is a short list.
= metabase_dashboards() tmp
## id name
## 0 2 Test Dashboard
Databases
This function will list all the datasets available. Currently we have two.
= metabase_databases() tmp
## id name
## 0 2 Analytics PGSQL POC
## 1 3 parity-data-infra-evaluation
Questions
This function lists all the questions.
= metabase_questions() tmp
## id name
## 0 2 API question
## 1 10 API question to archive
## 2 9 API question to archive
## 3 11 API question to archive
## 4 12 API question to archive
## 5 16 API question to archive
## 6 13 API question to archive
## 7 14 API question to archive
## 8 15 API question to archive
## 9 1 Test question
Tables
This function will list all the tables available in Metabase.
= metabase_tables() tmp
## id name display_name
## 0 835 acala Acala
## 1 617 acala Acala
## 2 2039 acala Acala
## 3 1204 acala_blocks Acala Blocks
## 4 1033 acala_blocks_debug Acala Blocks Debug
## ... ... ... ...
## 2211 711 xcm_moonbeam_accounts Xcm Moonbeam Accounts
## 2212 1391 xcm_polkadot_accounts Xcm Polkadot Accounts
## 2213 181 xcm_sent_to_moonbeam Xcm Sent To Moonbeam
## 2214 1662 xcm_weekly Xcm Weekly
## 2215 938 xcm_weekly_view Xcm Weekly View
##
## [2216 rows x 3 columns]
Fields
If you already know the table number and just want to associated fields, use this function and pass in the table id.
= metabase_fields(2) tmp
## id ... display_name
## 0 72 ... Chain
## 1 73 ... ID
## 2 74 ... Date
## 3 75 ... NumTransactions
## 4 76 ... Extract
## .. ... ... ...
## 243 51250 ... Fee Usd Signed
## 244 51251 ... Numtransactions Signed
## 245 51252 ... ID
## 246 51253 ... Extract
## 247 51254 ... Chain
##
## [248 rows x 4 columns]
This is as far as I have gotten on this journey, but now that we know how to get all the table ids and field ids we need, we can start making some questions programmatically and see how they work. Then we can make some by hand and then read the JSON to learn more about how to make new questions. Next time to need to change the label on a dashboard with 20 questions, maybe we can write code to do it for us.
TODO: Learn how to use Chrome developer tools to get more insights into the API commands (if needed).