Load CSV from AWS S3 into AWS RDS Aurora

Author: Omid Vahdaty, 23.04.2019

If you want to connect to your Aurora instance from your EC2 instance, this is the blog for you.

To achieve this, first make sure the MySQL service is running, using the following command:

sudo service mysqld status

Alternatively, install MySQL client directly from the MySQL repositories (https://dev.mysql.com/downloads/). Please note that these are MySQL community repositories and are not maintained/supported by AWS.  
https://dev.mysql.com/doc/refman/5.7/en/linux-installation-yum-repo.html

Once the service is running, you can use the following connection string:

mysql -h <cluster-endpoint> -P 3306 -u <your-user> -p

For further information, see the AWS documentation on connecting to an Amazon Aurora MySQL DB cluster.
Shortcut to install the MySQL CLI:

sudo yum install mysql

Connect to RDS after confirming your security group allows access from your EC2 instance to your RDS instance, for example:

mysql -u MyUser -pMyPassword -h MyClusterEndpoint

(Note there is no space after -p; with a space, mysql treats the next argument as the database name.)

Create an IAM role that allows Aurora to access S3.

Set your cluster to use the role (with access to S3).

Assuming your DB is in a private subnet, set up a VPC endpoint for S3 (it is always good practice to set up an endpoint anyway).

Set the aurora_load_from_s3_role parameter in the Aurora cluster parameter group to the ARN of the role with S3 access, and reboot the Aurora DB.
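The role attachment and parameter change above can be sketched with the AWS CLI; the cluster identifier, parameter group name, role name, and account ID below are placeholder assumptions:

```shell
# Attach the S3-access role to the Aurora cluster
# (cluster identifier and role ARN are placeholders).
aws rds add-role-to-db-cluster \
    --db-cluster-identifier my-aurora-cluster \
    --role-arn arn:aws:iam::123456789012:role/aurora-s3-access

# Point the cluster parameter group at the role's ARN;
# pending-reboot means it takes effect after the reboot mentioned above.
aws rds modify-db-cluster-parameter-group \
    --db-cluster-parameter-group-name my-aurora-params \
    --parameters "ParameterName=aurora_load_from_s3_role,ParameterValue=arn:aws:iam::123456789012:role/aurora-s3-access,ApplyMethod=pending-reboot"
```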

Grant your MySQL user S3 load privileges via:

GRANT LOAD FROM S3 ON *.* TO 'user'@'domain-or-ip-address';

Create a table in MySQL, for example:

use myDatabase;

CREATE TABLE engine_triggers (
    today_date DATE,
    event_prob DOUBLE
);
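For testing, a sample CSV matching this schema can be generated locally; the file name and values below are illustrative assumptions:

```python
import csv
import random
from datetime import date, timedelta

# Generate rows matching the engine_triggers schema:
# today_date (DATE), event_prob (DOUBLE).
rows = [
    (date(2019, 4, 1) + timedelta(days=i), round(random.random(), 4))
    for i in range(5)
]

with open("sample_triggers.csv", "w", newline="") as f:
    writer = csv.writer(f)
    # Header row — skipped later by IGNORE 1 LINES.
    writer.writerow(["today_date", "event_prob"])
    for d, p in rows:
        writer.writerow([d.isoformat(), p])
```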

LOAD DATA examples (ignoring the header or using quoted fields):

LOAD DATA FROM S3 's3://bucket/sample_triggers.csv' INTO TABLE engine_triggers FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' IGNORE 1 LINES;

LOAD DATA FROM S3 's3://bucket/sample_triggers.csv' INTO TABLE engine_triggers FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY '"';

Ignoring the header and using quoted fields:

LOAD DATA FROM S3 's3://bucket/sample_triggers.csv'
INTO TABLE engine_triggers
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;


You can automate the load hourly via crontab. It runs as follows (note the extra escape character before \n and inside the ENCLOSED BY quotes):

0 * * * * mysql -u User -pPassword -hClusterDomainName -e "use myDatabase; truncate myDatabase.engine_triggers; LOAD DATA FROM S3 's3://bucket/file.csv' INTO TABLE engine_triggers FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\\n' IGNORE 1 LINES"
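To sidestep the crontab quoting pitfalls, one option (a sketch; the file path, database names, and credentials are placeholder assumptions) is to keep the statement in a .sql file and have cron pipe it to mysql:

```shell
# Write the load statement to a SQL file (names are placeholders).
# The quoted 'EOF' keeps \n and the double quotes literal.
cat > load_triggers.sql <<'EOF'
USE myDatabase;
TRUNCATE myDatabase.engine_triggers;
LOAD DATA FROM S3 's3://bucket/file.csv'
INTO TABLE engine_triggers
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;
EOF

# Crontab entry: run hourly, no inline escaping needed.
# 0 * * * * mysql -u User -pPassword -hClusterDomainName < /path/to/load_triggers.sql
```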

Detailed manual:

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/AuroraMySQL.Integrating.LoadFromS3.html

Some documentation about MySQL data types:

https://www.w3schools.com/sql/sql_datatypes.asp

——————————————————————————————————————————
I put a lot of thought into these blogs, so I could share the information in a clear and useful way.
If you have any comments, thoughts, or questions, or you need someone to consult with,

feel free to contact me via LinkedIn:
