This blog will be updated from time to time. It's a "cut the bullshit, give me what I need" blog.

Load CSV to BigQuery (skip header, allow quoted newlines, truncate before loading) example:
bq --location US load --source_format CSV --replace=true --skip_leading_rows 1 --allow_quoted_newlines --quote "" MyDataset.myTable gs://myBucket/*
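For context, --allow_quoted_newlines is what lets BigQuery parse a CSV field whose quoted value spans multiple physical lines. A minimal local sample (the file name is just for illustration):

```shell
# Write a CSV where the quoted "notes" field contains an
# embedded newline -- the kind of row that fails to load
# without --allow_quoted_newlines.
cat > sample.csv <<'EOF'
id,notes
1,"first line
second line"
EOF

# One logical record, spread over two physical lines.
wc -l sample.csv
```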
Load to BigQuery with schema definition example:
bq --location=us load --replace=true --skip_leading_rows 1 --source_format=CSV myProject:DATA_LAKE.tablr gs://bucket/* col3:STRING,col2:STRING,col1:STRING
Note: don't combine --autodetect with an explicit schema; bq rejects the load if you pass both.
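Instead of the inline col:TYPE list, the schema can live in a JSON file passed as the last argument. A sketch (schema.json is an assumed file name; bucket and table names follow the example above):

```shell
# Same three STRING columns as the inline schema, as a JSON file.
cat > schema.json <<'EOF'
[
  {"name": "col3", "type": "STRING"},
  {"name": "col2", "type": "STRING"},
  {"name": "col1", "type": "STRING"}
]
EOF

# Then point bq load at the file instead of the inline list:
# bq --location=us load --replace=true --skip_leading_rows 1 \
#   --source_format=CSV myProject:DATA_LAKE.tablr gs://bucket/* ./schema.json
```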
Load JSON to BigQuery example:
bq --location=us load --replace=true --autodetect --source_format=NEWLINE_DELIMITED_JSON myProject:DATA_LAKE.tablr gs://bucket/*
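NEWLINE_DELIMITED_JSON means exactly that: one complete JSON object per line, not a JSON array. A quick sample (field names are made up):

```shell
# Valid NDJSON: each line is a standalone JSON object.
# No wrapping [ ], no commas between rows.
cat > data.json <<'EOF'
{"sku": "A1", "price": "9.99"}
{"sku": "B2", "price": "19.99"}
EOF
```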
Load to BigQuery from GCS in Hive partition format:
bq load --source_format=NEWLINE_DELIMITED_JSON --hive_partitioning_mode=AUTO --hive_partitioning_source_uri_prefix=gs://data/ --autodetect DL_TEMP.t2 gs://data/*
Note: --hive_partitioning_source_uri_prefix must actually be a prefix of the source URI.
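For AUTO mode to work, the files under the prefix must sit in Hive-style key=value directories; BigQuery turns each key into a column. A local mock of the expected layout (bucket, keys, and fields are placeholders):

```shell
# Mimic the GCS layout gs://data/dt=.../branch=.../file.json
# The dt and branch path keys become columns in the loaded table.
mkdir -p 'data/dt=2024-01-01/branch=1'
printf '%s\n' '{"sku": "A1"}' > 'data/dt=2024-01-01/branch=1/part-0000.json'
```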
Load to BQ with Hive partitioning and an explicit schema (in addition to the Hive partition columns):
bq load --source_format=NEWLINE_DELIMITED_JSON --hive_partitioning_mode=AUTO --hive_partitioning_source_uri_prefix=gs://datalake/price/ nexite-qa:DL_TEMP.t10 gs://datalake/price/dt=1/branch=1/ts=1* SKU:string,price:string,discount:string,currency:string
The declared schema covers the data columns; dt, branch, and ts are inferred from the path. (--autodetect is omitted because it can't be combined with an explicit schema.)
——————————————————————————————————————————
I put a lot of thought into these blogs so I can share the information in a clear and useful way. If you have any comments, thoughts, or questions, or if you need someone to consult with, feel free to contact me: