Looker 101 Chapter 4

How to manually add measures and make your own SQL queries in Looker

In this chapter, we are going to learn how you can create and add measures and use them in your visualizations making.                      We’ll learn how to add simple aggregation measures based on a single field right away from the tile making fields menu as a custom field, and then we will dive into the LookML code and write more complex measures that will be added directly as a field to your view.

Finally, we’ll learn how to use Looker’s SQL Runner in order to write and run your own SQL queries in extreme cases –  
For instance, when you want to do something and don’t seem to work it out using Looker’s features and LookML in the process of a dashboard’s tile making. 
For that, we will go over a nice use case of using SQL to make a derived table from an existing one and adding it to its project.
I recommend only using SQL when there is no other choice, since in terms of better performance and simplicity you should stick to LookML. 

*For our use cases we are going to use an example dataset of Worldometer Coronavirus Daily Data. You can learn from the following examples and try it out and implement with your data in your projects later on.

Adding simple measures

To start off, we’ll learn how you can simply add basic measures of aggregations like sum, average, min and max, as a custom field to your explore through the process of making a tile for a dashboard, without even writing LookML. 

I prepared in advance a dashboard to work with, created a connection to BigQuery where our dataset is, and created a project with this connection – all these things we learned to do in the previous chapters.

 

Let’s say we want to present the amount of total cases for each country.

  • we’ll start by adding a new visualization tile to the dashboard (Add->Visualization).
  • Now we need to select the explore we want (in this example – Worldometer Coronavirus Daily Data) under the project we created for it – in this example “Yuval Project 2”. 

 

Looker allows us to create simple measures for our numerical data fields by clicking the “three dot” More button next to the desired field, then clicking Aggregate and selecting the calculation we want to make. 

  • In this case, we want to create a measure of the sum of active cases so for the Active Cases dimension we’ll select Sum. 

  

 

The measure Sum of Active Cases is now added under Custom Fields.
* Notice that for non-numeric data fields such as Country which its type is string, Looker will only let us List and Count Distinct. 

  • Now, all we have to do is add the Country dimension by clicking it, and then Run.

Writing measures in LookML

In many cases, we’ll want measures of more complex calculations than the basic parameters of sum, min, average, etc – we’ll have to manually write those measures in the LookML.

  • To do so, going back to our coronavirus data example use case, when we add a tile to the dashboard and select the explore we want, this time we need to select Go to LookML.  
  • Now we need to go to the view where we want to add our measure field, by clicking on views and selecting the desired view (in this example, worldometer_coronavirus_daily_data)

 

As we talked about in the second chapter of this Looker tutorial series, this is where you see the LookML of your view’s fields – dimensions and measures. 

  • By scrolling down, you’ll get to the default measure of count and to all other measures if there are any.

 

As an example, let’s say we want to create a measure that finds the max alphabet country.

  • In order to create the measure, you can copy-paste the default one as a starting point and work from there – first we give our measure a name that defines what it is about (in this example max_alphabet_country), then, we define its type, which in this case is string. You can remove the drill_fields parameter as it is related to dates and irrelevant for now.

*Notice that even though string is a measure type that does not perform an aggregation, it is still a measure – a calculation is still being made here.

 

  • Now, we need to add the sql statement – where we write the SQL expressions based on the type that will define the measure. The line of code for it starts with “sql:” and then the statement itself. In this example we want to find the max country alphabetically (for that the type is string) so we need to write MAX() and in the brackets, we put the field we want to apply the calculation on in curly braces (country in this case) and a dollar sign before it.

*Remember to end the statement with double semicolons. 

  • When we are done writing the measure we click on Save Changes and should see the button changes to “Saved”. 

Now that our measure is added as a field to the view, we can go back to our dashboard, refresh the page, and now when we add a new tile and select the explore and view we added the measure to, the new measure should appear under MEASURES.

 

  • You can try it out by selecting it and pressing Run. In this case we get Zimbabwe.

 

Let’s go back to our LookML code and walk through another example. This time we’ll see how you can create a measure that its calculation is based on more than one field – for instance for the coronavirus data use case, let’s say we want to get the death ratio. That means we need to divide the number of dead by the number of sick, so we apply the measure on a calculation between two different dimensions.

*The data is inaccurate for our example because not all new deaths are only from new cases but this is only to illustrate the use case.

Later on, we’ll try to use the measure to make a visualization tile with the use of our new death ratio measure, grouped by another dimension and see what happens…

  • Once again we define a measure and give it a name (death_ratio in this case),
    we define the type, here it’s number because we deal with numbers on which we will do the calculation, and in the sql statement we state the dimensions – the one that we want to divide by the other one, the division sign and the other dimension:
    ${“the field’s name”} / ${“the field’s name”}

We now have one important last thing to do to avoid possible errors since it’s a division operation, and that is to make sure we don’t divide by zero!

  • In that case, we need to use the SQL NULLIF function and add it to our sql statement for the dimension in the denominator in the following way: NULLIF(${“the dimension’s name”}, 0)
    It means if the denominator is zero, treat it like NULL instead.

Don’t forget to click Save Changes when done!

Like before, if we go back to adding a tile to our dashboard, after refreshing we should see in the explore the measure we added under MEASURES.

Now, suppose we want to use this measure and make a visualization with it, like getting the death ratio of each country, if we try to select the country field and our measure, and Run, we would get an error.  

The measure performs its calculation on dimensions so Looker get confused… 
In order to make it work we need to help Looker by adding and selecting the fields that our measure is made of (in this example – Daily New Cases and Daily New Deaths).

  • If we try to Run now, we’ll see that the problem is solved and the data appears.
  • We only want the death ratio per country data to be presented visually so as we learned in the previous chapter, we just need to Hide From Visualization the fields we only added for the query to run, that the measure consists of (in this example, Daily New Cases and Daily New Deaths).

 

Working with SQL in Looker

In some cases you’ll might want or have to use and work with SQL in Looker. 
Let’s take as an example, the tile that we made for the first use case we talked about in this chapter of the sum of active cases per country for the Coronavirus data dashboard. To remind you, we started off by selecting the country dimension and creating the measure custom field of sum for the Active Cases dimension.  

Now, suppose you want to partition this data by Rank, and you can’t or don’t know how to do it with LookML, however you do know how to do it with SQL. In that scenario you can take the generated SQL of the data that is already presented and query it to make a Rank partition field.

  • You can do so by clicking SQL on the Data tab and selecting Open in SQL Runner

In here, we’ve got a SQL IDE where we can query the table we already generated by selecting the dimension and measure of countries and sum of cases.

It will be our subquery from which we will be able to make a rank function on the data.

Here is a simple main query to create a basic rank field for the demonstration. When you are done writing your query, you can see the results by pressing Run

Now that we managed to get the wanted results with SQL – in this example the rank field, if we want to use it in a dashboard we need to import this derived table to our project.
To do so, press the cogwheel button and select Add to Project…

  • Now, select the project you want to add this to – in this example the project for the coronavirus data is called “yuval_project_2”.
  • Since it is a new table – based on the view that we queried from but still a table by itself, it will be added to the project as a new view, so we can give it a name and enter it in the View Name pane.
  • When done, click Add.

Now all you need to do is add the view to your model.

  • First you need to drag the view to be under the views folder.
  • Then, select your project’s model and add the view as a new explore to your project by writing the code line as the following:
    explore: “the view’s name” {}
  • After that save by clicking Save Changes

 

That’s all for this chapter.
Hoping to see you in the next one.

If you have Questions regarding looker, feel free to contact us through the website or via Linkedin

Leave a Reply

Discover more from Big Data Demystified

Subscribe now to keep reading and get access to the full archive.

Continue reading