How to transform data (TXT, CSV, TSV, JSON) into Parquet? Which technology should we use to model the data: EMR, Athena, Redshift, Spectrum, Glue, Spark, or SparkSQL? How to handle streaming? How to manage costs? Plus performance tips, security tips, and cloud best practices.