The problems with querying google BigQuery:
- The table is sharded not partitioned – this means each table has it owen table name e.g ga_sessions_20190202
- The GA sharded table also has some temporary data on a table called ga_sessions_intraday_20190505
- so when you select * from ga_sessions_* you will also get some unexpected ga_sessions_intraday_* shards
Below is example of querying the last 4 days in any GA_sessions table
select * frommyProject.MyDataset.ga_sessions_*
where TABLE_SUFFIX BETWEEN FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 4 DAY)) AND FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)) OR (_TABLE_SUFFIX = CONCAT('intraday',CAST(FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 0 DAY)) AS string) ) OR TABLE_SUFFIX = CONCAT('intraday',CAST(FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)) AS string) ))
——————————————————————————————————————————
I put a lot of thoughts into these blogs, so I could share the information in a clear and useful way. If you have any comments, thoughts, questions, or you need someone to consult with, feel free to contact me: