Data Science

Boost Your AI With Quality Data

Boost Your AI With Quality Data

Lecturer: Magdalena Konkiewicz 20.9.2022

All AI projects start with data — no matter how simple your idea is, you cannot develop machine learning algorithms without examples to train them on. And after the first prototype, when the chase for metrics improvement begins, you find out that the amount and quality of your data matters. That is when a good data labeling pipeline will probably help you a lot.
In this talk we give an introduction to building data labeling pipelines and present real life use cases from different areas such as search relevance, content moderation, voice assistants and self-driving cars. We will explain how to fight concept drift in machine learning, how to build complex products using human-in-loop model and how to remove people management from the data labeling process. 

Lecturer: Magdalena Konkiewicz, a Data Evangelist at Toloka which is a global data labeling company servicing the needs of approximately 2,000 large and small businesses worldwide.
Toloka helps its customers generate machine learning data at scale by harnessing the wisdom of the crowd from around the world.
Toloka is used by organizations in e-commerce, R&D, banking, autonomous vehicles, web services, and more.
Toloka relies on a geographically diverse crowd of several million registered users – 200,000 of which are active monthly, on average. The company is incorporated in Switzerland and has its global headquarters in the USA. Magdalena prior to joining Toloka has worked in many different sectors in technical roles such as NLP Engineer, Developer, and Data Scientist. She has also been involved in teaching and mentoring Data Scientists. Additionally, she contributes to one of the biggest Medium publications Towards Data Science writing about Machine Learning tools and best practices.

Video

Slides


——————————————————————————————————————————
I put a lot of thoughts into these blogs, so I could share the information in a clear and useful way.
If you have any comments, thoughts, questions, or you need someone to consult with,

feel free to contact me via LinkedIn – Omid Vahdaty:

Logo Jutomate
airflow

Airflow-distinguish between environments

Airflow - distinguish between environments

Authors: Omid Vahdaty and Amir Miller 7.9.2022

our use case (in 1min):

We have Airflow (> 2.0) in two environments – PROD and DEVELOPMENT. Let’s distinguish between the two:
in airflow.cfg under [webserver] –> insert new parameter: 
instance_name = <choose the env name> (e.g instance_name = DevEnv) 

Want to make it better? change the navbar_color parameter (e.g navbar_color = #50C878)


——————————————————————————————————————————
I put a lot of thoughts into these blogs, so I could share the information in a clear and useful way.
If you have any comments, thoughts, questions, or you need someone to consult with,

feel free to contact me via LinkedIn – Omid Vahdaty:

Logo Jutomate
BI

Join tables in Looker

Looker 101 Chapter 5

Join tables in Looker

In this chapter, we are going to learn how to join tables in Looker using the LookML in the data model. We’ll see how by joining views we can make tables that have a connection and relationship with each other, be under the same explore so we could make dashboard tiles that can combine fields from each of them and ask cross-table business questions.

First, we will get to know a use case dataset that we are going to work with for our examples and demonstration of the subject.

  

Commercial magical pony farming use case overview

In order to demonstrate joins in Looker, we are going to go through a really nice use case of a commercial magical pony farm.

Here is the ERD table that presents all of the relevant tables of this use case, their fields and field types, the relationships and connections between the different tables:

Let’s go over the tables themselves in the BigQuery, briefly:
pony_fact is our main table as it contains the personal information of each pony of our 15,000 ponies – its name, height, weight, color… and, of course, its ID. 

pony_treatment_hist contains the treatment history data of the ponies – the pony ID that tells us which pony it is about, what treatment he received and on what date it was. 

pony_rltn contains the data of relationships between the various ponies – what feelings the pony with the SOURCE_PONY_ID has towards the pony with the TARGET_PONY_ID, described under the RELATIONSHIP_TYPE, and what’s the INTENSITY of those feelings.

pony_power contains the data about powers the ponies have – the ID of the pony to recognize which pony it is about, what power it has and the ID of this kind of power.

Joins

Now that we got familiar with our magical pony farm dataset, lets go over to Looker.
At first, when you create the connection between Looker and the dataset, each table is standalone. 
In this section, we are going to see how to connect the tables with joins to be under the same Looker explore so we will be able to make dashboard tiles that combine the data from all of the tables.

I already created a project with a connection to the magical pony farm dataset in BigQuery, and created a new dashboard to work with that which I named-  Joins_Pony_Farm.

*We learned to do all this in previous chapters.

If we add a tile (by clicking Add and then Visualization), we can see that under the project that contains our pony farm dataset, (that I named “Commercial Magical Pony Farm”) each table is built in as a separated explore.  

Let’s say we want to make a graph of the amount of ponies that received a treatment for each color. If we look at the tables, we can see that the colors information and the treatment data, each of them, is in a different table – pony_fact and pony_treatment_hist. It means that we can’t reach and select both fields trying to make such tile, because they are not in the same explore – we need to join the views.

In here, we can see that if we try to select the Pony Fact explore to make a tile, we only have the Color dimension and don’t have access to the treatment type dimension.

(Selecting the Pony Treatment Hist explore would get us a similar problem…)

  • In order to make joins, we need to go to the LookML of our project, what we can do from the tile making after selecting one of the relevant explores, by clicking on Go to LookML.

In here we can edit our project’s model, where we can see that as we figured, every view is defined as an explore.


In our magical pony farm use case, as you could see in the ERD table, the main table is pony_fact – we need to join the pony_treatment_hist to it.

  • To do so, for the pony_fact explore, inside its curly brackets, we need to declare a join by entering “join: {}”, and then following the colon with the name of the table we want to join, so in our case it’s going to be pony_treatment_hist.
  • Now, inside the joined table’s curly brackets (pony_treatment_hist in this example), we need to describe the join with parameters- 
    first, the type of the join, similar to SQL, by writing “type: “ and then the join type you want. In our example we want to left join so we select left_outer. 

Then, we declare the join relationship – it describe the join relationship between joined views based on the primary key.  

  • I highly recommend you to go over the Looker documentation to make sure you understand what kind of relationship fits the use case.
  • Once again, we declare the parameter (“relationship: “) and enter the type of relationship we want – In our example, we’ll go with many_to_one.

Finally and most importantly, we declare an sql_on statement in which we define the primary key for the join to be based on.

  • To do so, we enter the following:
    sql_on: ${first view name.primary key} = ${second view name.primary key} ;;”
    *Don’t forget to add the double semicolons at the end of the statement!

In our example, the primary key would be pony_id, since both of the pony_fact and pony_treatment_hist tables contain it. 

  • When done writing the join, make sure to click Save Changes.

Now, if we return to our dashboard, and add a new tile, after we select the Pony fact explore, we can see that besides the Pony Fact fields, the Pony Treatment Hist view and its fields are added and accessible!

  • So, now we can make the tile of the amount of pony treatments per pony color by selecting the Color dimension from the Pony Fact view, and making a count measure on the Pony ID field of the Pony Treatment Hist view as we learned to do in the previous chapter.

After running and saving, we will have ourselves the desired chart in our dashboard. 

Another thing we should do, since the Pony Fact is now an explore that contains two views including the Pony Fact view itself, is change the explore’s name to avoid confusion.

  • To do so, first, you need to go back to the LookML of the project where the explore is, which you can do just the way we did before when we created the join, or you can do it in another way by returning to the Looker homepage and from the menu selecting Develop, and then clicking on the relevant project.
  • Now, all we need to do is in the model, add a line inside the explore’s curly braces, in which we will enter the parameter – “label: ” and then the new name we want to give the explore inside apostrophes. When done, we click Save Changes

(In this example, our explore will be now named as “Pony Facts Explore”).

If we go to the explores list, we will see the originally named Pony Fact explore is now shown with the name of Pony Facts Explore, and The Pony Fact view’s name will remain the same.

In our Magical Pony Farm use case, as we can see in the ERD table, all of the tables should be joined to the main one which is PONY_FACT so they will all be under the same explore. We can simply make another joins by declaring another joins in the LookML of the project’s model – let’s go back to it.

  • Looking again on the ERD table, we see that the PONY_POWER table contains the PONY_ID field so we can use it as the primary key again.
  • For the PONY_RLTN we can use the SOURCE_PONY_ID as the primary key since it fits to the PONY_ID of the PONY_FACT table.  

To add more joins, we can copy-paste the one we created before as a starting point and work from there.

  • All we have to do now for each join, since the type and relationship in our case stays the same, is change to the table name of the table that we want to join, and in the sql_on statement, set to the name of that table and to its primary key name. When done, we click Save Changes.
  • As remembered, in the beginning when you import the dataset to Looker, each table is standalone and defined as a separate explore in the project’s model. Now that all of our Pony use case tables are joined to the main one and under the same explore, we can delete all of the other explores that are one view each, that we already have under the Pony Facts Explore. Again, Save Changes ,when done.

If we go back to our explores menu, we can see that now we have only one – the Pony Facts Explore, that contains all of the views.

  • Let’s finish up with another dashboard tile example – Suppose we want to see if there is a connection between the food we give our ponies to their having powers. Now that the pony_power and pony_fact are under the same explore, we can make such chart! We’ll add a tile to the dashboard and select Pony Facts Explore.
  • We need to select the Food dimension of the Pony Fact view and make a count measure on the Pony ID of the Pony Power view.

After running and saving, we will have ourselves the chart in our dashboard.

This sums up this chapter and our Looker 101 Introduction series.
Hope you enjoyed and found it helpful!

If you have Questions regarding looker, feel free to contact us through the website or via Linkedin

Logo Jutomate
BI

How to manually add measures and make your own SQL queries in Looker

Looker 101 Chapter 4

How to manually add measures and make your own SQL queries in Looker

In this chapter, we are going to learn how you can create and add measures and use them in your visualizations making.                      We’ll learn how to add simple aggregation measures based on a single field right away from the tile making fields menu as a custom field, and then we will dive into the LookML code and write more complex measures that will be added directly as a field to your view.

Finally, we’ll learn how to use Looker’s SQL Runner in order to write and run your own SQL queries in extreme cases –  
For instance, when you want to do something and don’t seem to work it out using Looker’s features and LookML in the process of a dashboard’s tile making. 
For that, we will go over a nice use case of using SQL to make a derived table from an existing one and adding it to its project.
I recommend only using SQL when there is no other choice, since in terms of better performance and simplicity you should stick to LookML. 

*For our use cases we are going to use an example dataset of Worldometer Coronavirus Daily Data. You can learn from the following examples and try it out and implement with your data in your projects later on.

Adding simple measures

To start off, we’ll learn how you can simply add basic measures of aggregations like sum, average, min and max, as a custom field to your explore through the process of making a tile for a dashboard, without even writing LookML. 

I prepared in advance a dashboard to work with, created a connection to BigQuery where our dataset is, and created a project with this connection – all these things we learned to do in the previous chapters.

 

Let’s say we want to present the amount of total cases for each country.

  • we’ll start by adding a new visualization tile to the dashboard (Add->Visualization).
  • Now we need to select the explore we want (in this example – Worldometer Coronavirus Daily Data) under the project we created for it – in this example “Yuval Project 2”. 

 

Looker allows us to create simple measures for our numerical data fields by clicking the “three dot” More button next to the desired field, then clicking Aggregate and selecting the calculation we want to make. 

  • In this case, we want to create a measure of the sum of active cases so for the Active Cases dimension we’ll select Sum. 

  

 

The measure Sum of Active Cases is now added under Custom Fields.
* Notice that for non-numeric data fields such as Country which its type is string, Looker will only let us List and Count Distinct. 

  • Now, all we have to do is add the Country dimension by clicking it, and then Run.

Writing measures in LookML

In many cases, we’ll want measures of more complex calculations than the basic parameters of sum, min, average, etc – we’ll have to manually write those measures in the LookML.

  • To do so, going back to our coronavirus data example use case, when we add a tile to the dashboard and select the explore we want, this time we need to select Go to LookML.  
  • Now we need to go to the view where we want to add our measure field, by clicking on views and selecting the desired view (in this example, worldometer_coronavirus_daily_data)

 

As we talked about in the second chapter of this Looker tutorial series, this is where you see the LookML of your view’s fields – dimensions and measures. 

  • By scrolling down, you’ll get to the default measure of count and to all other measures if there are any.

 

As an example, let’s say we want to create a measure that finds the max alphabet country.

  • In order to create the measure, you can copy-paste the default one as a starting point and work from there – first we give our measure a name that defines what it is about (in this example max_alphabet_country), then, we define its type, which in this case is string. You can remove the drill_fields parameter as it is related to dates and irrelevant for now.

*Notice that even though string is a measure type that does not perform an aggregation, it is still a measure – a calculation is still being made here.

 

  • Now, we need to add the sql statement – where we write the SQL expressions based on the type that will define the measure. The line of code for it starts with “sql:” and then the statement itself. In this example we want to find the max country alphabetically (for that the type is string) so we need to write MAX() and in the brackets, we put the field we want to apply the calculation on in curly braces (country in this case) and a dollar sign before it.

*Remember to end the statement with double semicolons. 

  • When we are done writing the measure we click on Save Changes and should see the button changes to “Saved”. 

Now that our measure is added as a field to the view, we can go back to our dashboard, refresh the page, and now when we add a new tile and select the explore and view we added the measure to, the new measure should appear under MEASURES.

 

  • You can try it out by selecting it and pressing Run. In this case we get Zimbabwe.

 

Let’s go back to our LookML code and walk through another example. This time we’ll see how you can create a measure that its calculation is based on more than one field – for instance for the coronavirus data use case, let’s say we want to get the death ratio. That means we need to divide the number of dead by the number of sick, so we apply the measure on a calculation between two different dimensions.

*The data is inaccurate for our example because not all new deaths are only from new cases but this is only to illustrate the use case.

Later on, we’ll try to use the measure to make a visualization tile with the use of our new death ratio measure, grouped by another dimension and see what happens…

  • Once again we define a measure and give it a name (death_ratio in this case),
    we define the type, here it’s number because we deal with numbers on which we will do the calculation, and in the sql statement we state the dimensions – the one that we want to divide by the other one, the division sign and the other dimension:
    ${“the field’s name”} / ${“the field’s name”}

We now have one important last thing to do to avoid possible errors since it’s a division operation, and that is to make sure we don’t divide by zero!

  • In that case, we need to use the SQL NULLIF function and add it to our sql statement for the dimension in the denominator in the following way: NULLIF(${“the dimension’s name”}, 0)
    It means if the denominator is zero, treat it like NULL instead.

Don’t forget to click Save Changes when done!

Like before, if we go back to adding a tile to our dashboard, after refreshing we should see in the explore the measure we added under MEASURES.

Now, suppose we want to use this measure and make a visualization with it, like getting the death ratio of each country, if we try to select the country field and our measure, and Run, we would get an error.  

The measure performs its calculation on dimensions so Looker get confused… 
In order to make it work we need to help Looker by adding and selecting the fields that our measure is made of (in this example – Daily New Cases and Daily New Deaths).

  • If we try to Run now, we’ll see that the problem is solved and the data appears.
  • We only want the death ratio per country data to be presented visually so as we learned in the previous chapter, we just need to Hide From Visualization the fields we only added for the query to run, that the measure consists of (in this example, Daily New Cases and Daily New Deaths).

 

Working with SQL in Looker

In some cases you’ll might want or have to use and work with SQL in Looker. 
Let’s take as an example, the tile that we made for the first use case we talked about in this chapter of the sum of active cases per country for the Coronavirus data dashboard. To remind you, we started off by selecting the country dimension and creating the measure custom field of sum for the Active Cases dimension.  

Now, suppose you want to partition this data by Rank, and you can’t or don’t know how to do it with LookML, however you do know how to do it with SQL. In that scenario you can take the generated SQL of the data that is already presented and query it to make a Rank partition field.

  • You can do so by clicking SQL on the Data tab and selecting Open in SQL Runner

In here, we’ve got a SQL IDE where we can query the table we already generated by selecting the dimension and measure of countries and sum of cases.

It will be our subquery from which we will be able to make a rank function on the data.

Here is a simple main query to create a basic rank field for the demonstration. When you are done writing your query, you can see the results by pressing Run

Now that we managed to get the wanted results with SQL – in this example the rank field, if we want to use it in a dashboard we need to import this derived table to our project.
To do so, press the cogwheel button and select Add to Project…

  • Now, select the project you want to add this to – in this example the project for the coronavirus data is called “yuval_project_2”.
  • Since it is a new table – based on the view that we queried from but still a table by itself, it will be added to the project as a new view, so we can give it a name and enter it in the View Name pane.
  • When done, click Add.

Now all you need to do is add the view to your model.

  • First you need to drag the view to be under the views folder.
  • Then, select your project’s model and add the view as a new explore to your project by writing the code line as the following:
    explore: “the view’s name” {}
  • After that save by clicking Save Changes

 

That’s all for this chapter.
Hoping to see you in the next one.

If you have Questions regarding looker, feel free to contact us through the website or via Linkedin

BI

Dashboards and Visualizations

Looker 101 Chapter 3

Dashboards and Visualizations

In this chapter, we are going to build a dashboard based on Looker’s demo product data. We’ll learn how to create a dashboard, what Tiles are, what you can do with them and how to add them to your dashboard. 
In addition, we will make some example graphs while covering many useful features and use cases you can implement in Looker. 
Finally, we will go over some adjustments you can make to your dashboard such as adding filters to it, and scheduling query run times for your graphs to make sure they are presenting up to date data.

Creating a Dashboard

Dashboards are visual displays of  our data.

  • To create one, First, Login and Go to the Looker Homepage.
  • Press on the New button that appears on the top right corner of your screen and Select Dashboard.
  • Give a name to the dashboard, and after that Press Create Dashboard

Adding Tiles and Visualization Use Cases

At first, our dashboard is empty – we haven’t add anything to it yet…

  • Click on Edit Dashboard

This is where you edit your dashboard and add graphs and visual data to it.
In Looker, a dashboard is made of “Tiles”, blocks, that each is a component in the dashboard – a Visualization: Graph, Numerical  data, or a Text Tile: notes, subtitles… 

  • Let’s add a tile – Click on Add, then Select Visualization
  • We will choose a Looker preloaded Explore to work with – you can ask Looker to add it to your looker if you don’t have it.
  • Press on Orders Items and Users.
  • For this example, Give your Tile the title of “Total Customers”.
  • Select the Users view.
  • From the Users list of fields, scroll down to MEASURES and select Count.
  • Press Run (appears on the top right corner of your screen).

You should now get the number of Users.

  • Notice that there are three different sections: Filters,Visualization and Data:
    Filters – the “Filters” section is where you can create filters for your queries.
    Visualization – the “Visualization” section is where you can see how it will be added to your Tile.
    Data –  the “Data” section is where you can see the data in its tabular form under Results, and the actual SQL behind it under SQL.

 

  • Click the Save button on the top right corner of your screen. 

The Tile of total customers is now added to your dashboard.
We want the dashboard to be neat and conveniently designed for our clients – You can improve the design by adding notes and texts to your dashboard .

  • Press the Add button and select Text.
  • Inside the Title and Subtitle panes, copy and enter the following text:
    Title Brand Overview
    Subtitle – What are the high level revenue metrics for this brand?
    Click Save.
  • Now that this Text Tile is added to your dashboard, drag the Tile of total customers below the Text Tile by pressing and holding the dragging button on the top left corner of the Tile:
  • Use the “black triangle” button on the bottom right corner of the Text Tile to widen it and center the text.
  • For each new Tile you create, or change you make in your dashboard make sure to save it right away by pressing the Save button on the top right corner of your screen!
    Looker does not have an auto-save function… 
  • After saving you will be back to the visualization mode so go back to editing mode by clicking on the “three dots menu” button on the top right corner of your screen and selecting Edit dashboard, or by pressing ctrl+shift+e

Let’s add another Numerical data bullet to our dashboard.

  • Press Add and select Visualization
  • Select the Orders Items and Users explore.

    This time the number we want to display is the average price at which a item is sold

  • Click on Order Items

  • From the list, select the measure Average Sale Price and press Run

You should now get the average sale price.

  • Don’t forget to give your Tile a title from which it will be possible for anyone to understand what the Tile is about – in this example name it “Average Order Value”.

Now, let’s add a graph to our dashboard- 
We are going to make a graph of two trend lines – one for the average sell price and one for the total sell price by dates, so there will be two different Y-axes, and the X-axis will be based on Dates.

  • Press Add and select Visualization
  • Once again Choose Orders Items and Users, then Press Order Items, then from Created Date, Select Date.

Now for the vertical axes – one will be the amount of items sold, and one will be the average sale price.

  • Scroll down to MEASURES and Select Average Sale Price.
  • For our second Y-axis Click on Total Sale Price.
  • Notice that you can drag and switch position between fields to change the visual order, for example, Total Sale Price will now be the left Y-axis.
  • Don’t forget to give the graph a suiting title that expresses what it is about .
  • Press Run, and you should get the following graph:



We got a graph with a horizontal axis of dates as wanted.
However, it’s inconvenient for the user – we got a single Y-axis of prices in the scale of the Total Sale Price field prices caused by the fact that both Total Sale and Average Sale are fields of prices and there are big gaps between the prices of the fields as well so you can barely see the fluctuations in the Average Sale line…

We need to manually change the graph so it will have separated Y-axes for each field of prices.

  • To do so, on the visualization tab, press the Edit button and select Y.
  • You can see that both of the fields are under Left 1, so simply drag one of them below the Right Axes section.

Now that the graph has two y-axes as we wanted, we can make it even more convenient by removing the unnecessary decimal format of the numbers of the y-axes, since we are not dealing with percent in our graph.

  • You can change the format by scrolling down to Y Axis Format and entering the format you want. In our case enter: “$##,###” 
  • You need to edit each of the axes separately – we applied this format change only on our left one, so under the Configure Axes select Right 1 and do the same to change its format as well.

The graph now looks much better than it did at the beginning and meets our needs!

  • Click Save, then move and/or resize the Tiles to design your dashboard if you’d like and then click Save again – this time to save the dashboard itself and come back to editing mode (Dashboard actions button->Edit dashboard)

For the next example,  let’s say we are a clothing store and we want to see how much is the total sell price of each product categories, for each department – men and women. 
For that, we will make another graph, of total sell price by category and department.

  • Just like before, press Add and select Visualization, then choose Orders Items and Users. (Give the tile a suiting title such as “Most Popular Categories”) 
  • From Order Items select Total Sale Price, and from Products select Category  .                         
  • Pressing Run will get us the following bar chart.
    * Notice that you are able to change the visualization type by choosing a different kind on the visualization tab – stick with the bar format for this graph, though.

In order to make the total price be displayed for each department, we need to add a secondary dimension.

  • From Products press the Pivot data button of the Department dimension to make each value in the dimension become a column in the graph.
  • Press Run to see the result.

We got the desired graph, however we can make it look even better.

  • Click the visualization section’s Edit button, then from the Plot tab, under Series Positioning, select Stacked.
  • The graph now presents the total price of each category divided by men and women, ordered descending by the total sale price of men for each category. In order to make it more convenient and organized, Let’s make it ordered by the general total sale price of each category. Doing so requires us to add another field that will be this general total – a scheme of the total amount for women and men.
  • Check the Row Totals box that appears on the Data section and press Run.

 

A Total field is now added under the Data section with the scheme of total sale price for men and women values.

  • Now we can make the graph ordered descending by this total value by clicking under its name on the Order Items Total Sale Price field in the Data section, so an arrow pointing down will appear next to it.
  • Our graph is ready – make sure to save it and then save the dashboard itself.
    Afterwards, go back to editing mode (Edit dashboard).

For our next use case we’ll make a line chart and add a filter to it.

  • As you must know how to do it by now, add a new Visualization Tile (Add->Visualization), but this time select a different explore – Share of Wallet Analysis.

Give it the title “Brand Share of Wallet over Customer Lifetime” – that is what our graph is going to be about.

Let’s say we only want to present data of users that signed up a year and a half or less ago. We can do so by using filters.

  • From Orders press the “Filter by field” button of the Months Since Signup dimension.

  • Set the filter to “is less than or equal to” and enter the number of months that the condition is relative to – 18 in our case.
    * After you filtered it, click the dimension itself to make sure it’s added as a field under the Data section.

      Now let’s add the measures – 

  • From the Share of Wallet (Brand Level) view, select Brand Share of Wallet Within Company and Total Sales – This Brand 
  • Press Run.
  • Change the type of the graph to Line on the Visualization tab.

* Don’t forget to save the tile and then the dashboard itself when you are done. Then go back to editing mode.

  • For our last use case, once again add a visualization tile and this time select the Affinity Analysis explore .
  • Give it the title of “Purchasers of This Brand Also Bought (Brand Affinity)”.
    * Notice that this explore has a required filter set by default for the
    Affinity Product B ID dimension – whoever wrote the LookML code for this explore defined it that way.
  •  Add another filter to the tile for the Affinity Score (by Order Basket) measure that is under the Affinity field, by clicking on the “Filter by field” button next to it, and just like the other filter set it to “is not null”.

 

  • Now, click on this measure itself to add it as a parameter to the graph and additionally, select the Affinity Score (by User History) and Combined Affinity measures as well.
  • Select the Brand dimension under the Product A Details and Product B Details fields.
  • Press Run, and change the visualization type to Table.

Let’s say we don’t want to present the Product A brand in our visual data, but we still want to make calculations with it for other fields like measures that depend on this field, so it will still remain in the Results tab of the Data section and only visually not be shown.

  • To do so, press the cogwheel button next to the “Product A Details Brand” fields under the Data section, and select Hide from visualization.

Order by the Combined Affinity measure by clicking it, under the Data section.

We only added it so we can order the data by it, however we don’t want it as a column in our visual table in the tile itself.

Just like we learned how to do before, press the cogwheel button next to this measure and click on Hide from visualization.

You can add colors beyond the numbers to present them, to the Affinity Score (by User History) column in the table – just like the  Affinity Score (by Order Basket) column has.

  • Press the Edit button on the visualization toolbar, select Series, click on the desired column name (Affinity Affinity Score (by User History)), and turn on Cell Visualization

You can choose the set of colors you’d like to use and even custom an original set.

To sum it up, in the table we got, in the visual layer we don’t present the Brand of Product A details, and the Combined Affinity measure, but we do base our data and calculations on those fields and that’s why we add them to the Data section’s Results – the measures are grouped by both brands A and B, and the data shown is ordered by the Combined Affinity measure.



  • Save the tile.
  • Play around with the tiles’ positions and sizes to design the dashboard, and don’t forget to save it when you are done.

Dashboard Filters

Dashboard filters let us narrow a dashboard’s results to only the we are interested in – affecting all 
the tiles (it is possible to apply the filter to only one dashboard tile).

  • To add one, in editing mode, click Filters and then select Add Filter.

You will now see the list of Explores that are used in your dashboard.   
Let’s make a Generic date-based filter. 

  • In the search pane, enter “date”.
  • From Orders, Items and Users, under Order Items, under Created Date select Created Date.
  • Change the filter’s title to “Date”.
    Now we need to set a default dates range that the data will be filtered on – later on a viewer of our dashboard will be able to temporarily change the range value for his needs (without affecting the dashboard for other viewers and users).
  • For Control select Advanced – it will give us and the future viewers of the dashboard lots of options and freedom to filter.
  • For our example, under Configure Default Value select is in the last, enter 90, and select days – only the data from the last three months will be queried into our tiles.



Since we are basing our filter on a field related to the explore of “Orders, Items and Users”, and the last two tiles we created (Brand Share of Wallet over Customer Lifetime and Purchasers of This Brand Also Bought (Brand Affinity)) has nothing to do with this explore as they are based on different explores, in the current state the filter won’t affect those tiles – we have to manually make adjustments for them.
For all of the other tiles the filter will work, though – they are all based on the same explore from where the field that the filter is based on is.  

Go to Tiles to Update, and from the list of our visualization tiles, for the Brand Share of Wallet over Customer Lifetime tile, change the Field to Filter –   under Share of Wallet Analysis, below Order Items, select Created Date.

* Make sure that it’s the same field and it has the same data type – of date and not string for example, otherwise, there will be errors.

  • For the “Purchasers of This Brand Also Bought (Brand Affinity)” tile, there isn’t a Created Date filed in its explore so in the Field to filter select Do Not Filter.
  • The filter is ready – click Add.
  • Press on Update and save the dashboard.

You can change the filter date range even in visualization mode by pressing and editing it and then clicking on the update button – See how the data changes according to the changes you make to the filter.

Dynamic Dashboard

Looker allows you to schedule your dashboard tiles data to be updated.
*Notice that any such update means querying the data, so take into consideration the cost matter before applying such thing.

  • In order to schedule your dashboard to update, go to editing mode (Edit Dashboard) and press on Settings.
  • In General, turn on Automatically refresh dashboard, and then enter the period of time you want for it to update. 
  • You are able to separately choose different period of times for each tile in the dashboard to update by changing the Refresh frequency for the tiles you want.
  • If you want to apply this, press Save, otherwise just cancel.

This wraps up this chapter.
Hoping to see you in the next one- How to manually add measures and make your own SQL queries in Looker.

If you have Questions regarding looker, feel free to contact us through the website or via Linkedin