architecture, AWS Athena, AWS EMR, cost reduction

When should we use EMR and when should we use Redshift? EMR vs. Redshift

Use Redshift when: you need a traditional data warehouse; you need the data relatively hot for analytics such as BI; there is no data engineering team; your queries require joins; you need a cluster 24x7; your data types are simple, i.e. not arrays or structs; the data has no nested JSON; you have petabyte…

AWS Athena Blogs

Converting TPCH data from row based to columnar Via Hive or SparkSQL and run ad hoc queries via Athena on columnar data
Big Data in 200KM/h | Big Data Demystified
How to ignore quoted fields inside a CSV via AWS Athena?
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
Serverless Data Pipelines…

Architecture Blogs

16 Tips to reduce costs on AWS SQL Athena
DFP Data Transfer Files Use Case | BigQuery 93% Cost Reduction demystified
80% Cost Reduction in Google Cloud BigQuery | Tips and Tricks | Big Query Demystified | GCP Big Data Demystified #2
Big Data in 200KM/h | Big Data Demystified
AWS Big Data Demystified #1.2…

AWS EMR Blogs

200KM/h overview on Big Data in AWS | Part 1
200KM/h overview on Big Data in AWS | Part 2
Cherry pick source files in Hive external table example
AWS EMR Presto Demystified | Everything you wanted to know about Presto
Questions and answers on AWS EMR Jupiter
How to work with maximize resource allocation…