Blog

architecture, BigQuery, cost reduction, GCP Big Data Demystified, superQuery

80% Cost Reduction in Google Cloud BigQuery

The second in a series of GCP Big Data Demystified lectures. In this lecture I share how I cut 80% off the monthly BigQuery bill of investing.com. Lecture slides:

Videos from the meetup:

Link to the previous lecture: GCP Big Data Demystified #1
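
A recurring lever for BigQuery cost reduction is knowing how many bytes a query will scan before you pay for it. Below is a minimal sketch using the dry-run mode of the google-cloud-bigquery Python client; the project, dataset, and query are placeholders, and this illustrates the general technique rather than the exact method from the lecture.

# Estimate what a BigQuery query would cost with a dry run:
# nothing is scanned and nothing is billed.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project ID

query = """
SELECT event_date, COUNT(*) AS events
FROM `my-project.my_dataset.events`  -- placeholder table
GROUP BY event_date
"""

job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
job = client.query(query, job_config=job_config)

# On-demand pricing is per byte scanned, so total_bytes_processed is a
# direct proxy for the query's cost.
print(f"This query would scan {job.total_bytes_processed / 2**30:.2f} GiB")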

——————————————————————————————————————————

I put a lot of thought into these blogs so that I can share the information in a clear and useful way. If you have any comments, thoughts, or questions, or you need someone to consult with, feel free to contact me:

https://www.linkedin.com/in/omid-vahdaty/

BI

Data Analysis from raw data to dashboarding

Lecturer: David Zenesh 2.2.2021

In the world of big data, data visualization tools and technologies are essential to analyze massive amounts of information and make data-driven decisions. Today’s culture is visual, including everything from art and advertisements to TV and movies, and our eyes are drawn to colors and patterns. Our interaction with data should reflect this reality.

Video


——————————————————————————————————————————

NoSQL

Extreme Data Streaming using Reactive and Aerospike NoSQL

Lecturer: Rony Keren 6.1.2020

How reactive programming enables massive data streaming and event pooling using an optimal, limited number of threads. Combined with rapid disk access and I/O, it reduces the use of expensive cache and simplifies and speeds up access to data, which is especially relevant for Big Data.
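
The talk itself is about the JVM reactive stack, but the core idea is language-agnostic: multiplex many concurrent streams onto a small, fixed number of threads instead of dedicating a thread to each stream. Here is a toy sketch of that idea using Python's asyncio, where a single-threaded event loop serves many streams:

# A single thread serving many concurrent "streams" via an event loop.
# A blocking, thread-per-stream design would need 1,000 threads here.
import asyncio

async def consume(stream_id: int, events: int) -> int:
    processed = 0
    for _ in range(events):
        await asyncio.sleep(0.01)  # stand-in for non-blocking I/O
        processed += 1
    return processed

async def main() -> None:
    totals = await asyncio.gather(*(consume(i, 10) for i in range(1000)))
    print(f"{len(totals)} streams, {sum(totals)} events, one thread")

asyncio.run(main())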

Video and slides

Next-Generation Real-time NoSQL database

Lecturer: Zohar Elkayam 6.1.2020

How the Aerospike distributed NoSQL database, with its patented Hybrid Memory Architecture™, can be used to build sub-millisecond, high-throughput real-time database applications that handle any data at any scale, in the fastest and easiest way and at the lowest possible cost.
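
For a concrete feel of the key-value access pattern this enables, here is a minimal sketch with the official Aerospike Python client; the host address, namespace, and set names are placeholders.

# Minimal read/write against an Aerospike cluster using the official
# Python client (pip install aerospike). The host, namespace ("test"),
# and set ("demo") are placeholders and assume a running server.
import aerospike

config = {"hosts": [("127.0.0.1", 3000)]}
client = aerospike.client(config).connect()

key = ("test", "demo", "user:42")  # (namespace, set, primary key)
client.put(key, {"name": "Ada", "visits": 1})

# Single-record reads like this are the sub-millisecond hot path.
_, meta, record = client.get(key)
print(record)  # {'name': 'Ada', 'visits': 1}

client.close()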

Video and slides


——————————————————————————————————————————

BI

How to improve Tableau Dashboard Performance

5 Tips on how to optimize your dashboard performance in Tableau

Author: Delphine Elfassi 6.1.2021

  1. Avoid calculated fields and custom queries.
  2. Extract your data.
  3. Add filters to context.
  4. Avoid quick filters.
  5. Customize your filters.

1 | Avoid Calculated Fields and Custom Queries

If possible, avoid calculated fields and custom queries; perform the calculations in your database instead.
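
As an illustration of pushing a calculation upstream, here is a hedged sketch that precomputes a profit-margin column with pandas before the data reaches Tableau; the file and column names are hypothetical, and the same logic could equally live in a database view.

# Instead of a Tableau calculated field such as [Profit] / [Sales],
# precompute the column upstream so Tableau reads a ready-made value.
# File and column names are hypothetical.
import pandas as pd

df = pd.read_csv("orders.csv")  # raw source data
df["profit_margin"] = df["profit"] / df["sales"]

# Tableau connects to this output (or to a database view with the same
# logic) and never evaluates the formula row by row at render time.
df.to_csv("orders_for_tableau.csv", index=False)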

2 | Extract Your Data

Use an Extract connection instead of a live connection to your data source.
For more details: https://help.tableau.com/current/pro/desktop/en-us/extracting_data.htm

3 | Choose “Add to Context” for Your Filters

By default, all filters are calculated independently. That means that for each filter, Tableau runs through the entire data source, figures out the rows that pass that filter on its own, and then returns the rows that pass every filter (the intersection of the results). For example, say you’ve filtered a view to only show the Western region and you’ve added another filter to only show sales from the first quarter of 2009. Tableau first looks at all the records and pulls out the ones from the Western region. Then it goes back through all of the records again and pulls out all sales from Q1 of 2009. Finally, the result is the intersection of these two independent filters.

Sometimes you’ll want to run one filter first and then apply the remaining filters only to its results. These are called context filters. When a filter is added to the context, it becomes the independent filter, and all other filters are computed only on the rows that pass the context filter.

NOTE: If your table involves complex aggregations and is more than a simple view, be very careful when using this option!

For more details: https://help.tableau.com/current/pro/desktop/en-us/filtering_context.htm

Go to the Filters area → right-click the filter field → Add to Context.
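
To make the computation order concrete, here is a small pandas sketch of the difference described above: independent filters each scan the full dataset, while a context filter narrows the data that later filters see. The data and column names are hypothetical.

# Toy model of Tableau's filter semantics using pandas.
import pandas as pd

sales = pd.DataFrame({
    "region":  ["West", "West", "East", "West"],
    "quarter": ["2009Q1", "2009Q3", "2009Q1", "2009Q1"],
    "amount":  [100, 200, 300, 400],
})

# Independent filters: each predicate scans the full table, and the
# final view keeps the rows that pass every filter (the intersection).
independent = sales[(sales["region"] == "West") & (sales["quarter"] == "2009Q1")]

# Context filter: the region filter runs first, and the quarter filter
# only scans the rows that survived it.
context = sales[sales["region"] == "West"]
result = context[context["quarter"] == "2009Q1"]

assert independent.equals(result)  # same rows, less work for later filters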


4 | Avoid Quick Filters

Use hierarchical/cascading quick filters only if your filters do not contain too many values.
The disadvantage of quick filters is that they increase loading time, so it is recommended to keep only the necessary quick filters and avoid adding more to your worksheet or dashboard.

Go to the shown filter → right-click → All Values in Database

5 | Customize Your Filters

Customize your filter to show an “Apply” button; the view then refreshes only once, after the selection is confirmed, rather than on every click.

Right-click the shown filter → Customize → “Show Apply Button”


——————————————————————————————————————————

AWS Athena

AWS Athena Worst Practices Demystified

Author: Omid Vahdaty 7.12.2020

What are the top three worst use cases for AWS Athena, and why?
What would be a best practice for each of these use cases?
Stability and performance issues when using AWS Athena, and how to avoid them.

Video and slides


——————————————————————————————————————————

Data Science

Superhero team-up! Data scientists and Data engineers – stronger together

Authors: Erez Schandier and Dr. Ariel Biller 2.12.2020

As data superheroes, our job is to be alert at all times for any problems.
However, an individual’s role in the organization means she meets the data at a different stage of its lifecycle.
What happens to the data before she receives it, or after she is done with it?
This webinar examines the touchpoints between two key figures: the Data Engineer and the Data Scientist. We show how to reduce friction and eliminate duplicate work by using a unified platform for managing the lifecycle of machine learning and deep learning models.

Video and slides

Hebrew

English

