Looker 101 Chapter 5

Join tables in Looker

In this chapter, we are going to learn how to join tables in Looker using the LookML in the data model. We’ll see how by joining views we can make tables that have a connection and relationship with each other, be under the same explore so we could make dashboard tiles that can combine fields from each of them and ask cross-table business questions.

First, we will get to know a use case dataset that we are going to work with for our examples and demonstration of the subject.

  

Commercial magical pony farming use case overview

In order to demonstrate joins in Looker, we are going to go through a really nice use case of a commercial magical pony farm.

Here is the ERD table that presents all of the relevant tables of this use case, their fields and field types, the relationships and connections between the different tables:

Let’s go over the tables themselves in the BigQuery, briefly:
pony_fact is our main table as it contains the personal information of each pony of our 15,000 ponies – its name, height, weight, color… and, of course, its ID. 

pony_treatment_hist contains the treatment history data of the ponies – the pony ID that tells us which pony it is about, what treatment he received and on what date it was. 

pony_rltn contains the data of relationships between the various ponies – what feelings the pony with the SOURCE_PONY_ID has towards the pony with the TARGET_PONY_ID, described under the RELATIONSHIP_TYPE, and what’s the INTENSITY of those feelings.

pony_power contains the data about powers the ponies have – the ID of the pony to recognize which pony it is about, what power it has and the ID of this kind of power.

Joins

Now that we got familiar with our magical pony farm dataset, lets go over to Looker.
At first, when you create the connection between Looker and the dataset, each table is standalone. 
In this section, we are going to see how to connect the tables with joins to be under the same Looker explore so we will be able to make dashboard tiles that combine the data from all of the tables.

I already created a project with a connection to the magical pony farm dataset in BigQuery, and created a new dashboard to work with that which I named-  Joins_Pony_Farm.

*We learned to do all this in previous chapters.

If we add a tile (by clicking Add and then Visualization), we can see that under the project that contains our pony farm dataset, (that I named “Commercial Magical Pony Farm”) each table is built in as a separated explore.  

Let’s say we want to make a graph of the amount of ponies that received a treatment for each color. If we look at the tables, we can see that the colors information and the treatment data, each of them, is in a different table – pony_fact and pony_treatment_hist. It means that we can’t reach and select both fields trying to make such tile, because they are not in the same explore – we need to join the views.

In here, we can see that if we try to select the Pony Fact explore to make a tile, we only have the Color dimension and don’t have access to the treatment type dimension.

(Selecting the Pony Treatment Hist explore would get us a similar problem…)

  • In order to make joins, we need to go to the LookML of our project, what we can do from the tile making after selecting one of the relevant explores, by clicking on Go to LookML.

In here we can edit our project’s model, where we can see that as we figured, every view is defined as an explore.


In our magical pony farm use case, as you could see in the ERD table, the main table is pony_fact – we need to join the pony_treatment_hist to it.

  • To do so, for the pony_fact explore, inside its curly brackets, we need to declare a join by entering “join: {}”, and then following the colon with the name of the table we want to join, so in our case it’s going to be pony_treatment_hist.
  • Now, inside the joined table’s curly brackets (pony_treatment_hist in this example), we need to describe the join with parameters- 
    first, the type of the join, similar to SQL, by writing “type: “ and then the join type you want. In our example we want to left join so we select left_outer. 

Then, we declare the join relationship – it describe the join relationship between joined views based on the primary key.  

  • I highly recommend you to go over the Looker documentation to make sure you understand what kind of relationship fits the use case.
  • Once again, we declare the parameter (“relationship: “) and enter the type of relationship we want – In our example, we’ll go with many_to_one.

Finally and most importantly, we declare an sql_on statement in which we define the primary key for the join to be based on.

  • To do so, we enter the following:
    sql_on: ${first view name.primary key} = ${second view name.primary key} ;;”
    *Don’t forget to add the double semicolons at the end of the statement!

In our example, the primary key would be pony_id, since both of the pony_fact and pony_treatment_hist tables contain it. 

  • When done writing the join, make sure to click Save Changes.

Now, if we return to our dashboard, and add a new tile, after we select the Pony fact explore, we can see that besides the Pony Fact fields, the Pony Treatment Hist view and its fields are added and accessible!

  • So, now we can make the tile of the amount of pony treatments per pony color by selecting the Color dimension from the Pony Fact view, and making a count measure on the Pony ID field of the Pony Treatment Hist view as we learned to do in the previous chapter.

After running and saving, we will have ourselves the desired chart in our dashboard. 

Another thing we should do, since the Pony Fact is now an explore that contains two views including the Pony Fact view itself, is change the explore’s name to avoid confusion.

  • To do so, first, you need to go back to the LookML of the project where the explore is, which you can do just the way we did before when we created the join, or you can do it in another way by returning to the Looker homepage and from the menu selecting Develop, and then clicking on the relevant project.
  • Now, all we need to do is in the model, add a line inside the explore’s curly braces, in which we will enter the parameter – “label: ” and then the new name we want to give the explore inside apostrophes. When done, we click Save Changes

(In this example, our explore will be now named as “Pony Facts Explore”).

If we go to the explores list, we will see the originally named Pony Fact explore is now shown with the name of Pony Facts Explore, and The Pony Fact view’s name will remain the same.

In our Magical Pony Farm use case, as we can see in the ERD table, all of the tables should be joined to the main one which is PONY_FACT so they will all be under the same explore. We can simply make another joins by declaring another joins in the LookML of the project’s model – let’s go back to it.

  • Looking again on the ERD table, we see that the PONY_POWER table contains the PONY_ID field so we can use it as the primary key again.
  • For the PONY_RLTN we can use the SOURCE_PONY_ID as the primary key since it fits to the PONY_ID of the PONY_FACT table.  

To add more joins, we can copy-paste the one we created before as a starting point and work from there.

  • All we have to do now for each join, since the type and relationship in our case stays the same, is change to the table name of the table that we want to join, and in the sql_on statement, set to the name of that table and to its primary key name. When done, we click Save Changes.
  • As remembered, in the beginning when you import the dataset to Looker, each table is standalone and defined as a separate explore in the project’s model. Now that all of our Pony use case tables are joined to the main one and under the same explore, we can delete all of the other explores that are one view each, that we already have under the Pony Facts Explore. Again, Save Changes ,when done.

If we go back to our explores menu, we can see that now we have only one – the Pony Facts Explore, that contains all of the views.

  • Let’s finish up with another dashboard tile example – Suppose we want to see if there is a connection between the food we give our ponies to their having powers. Now that the pony_power and pony_fact are under the same explore, we can make such chart! We’ll add a tile to the dashboard and select Pony Facts Explore.
  • We need to select the Food dimension of the Pony Fact view and make a count measure on the Pony ID of the Pony Power view.

After running and saving, we will have ourselves the chart in our dashboard.

This sums up this chapter and our Looker 101 Introduction series.
Hope you enjoyed and found it helpful!

If you have Questions regarding looker, feel free to contact us through the website or via Linkedin

Leave a Reply

Discover more from Big Data Demystified

Subscribe now to keep reading and get access to the full archive.

Continue reading