To secure the Thrift connection, enable SSL encryption and restart the hive-server2 and Thrift services on the EMR master instance.
The steps are as follows:
1. Create the self-signed certificate and add it to a keystore file using:
$ keytool -genkey -alias public-dnshostname -keyalg RSA -keystore keystore.jks -keysize 2048
Make sure the name used in the self-signed certificate matches the hostname where the Thrift server will run (use the public DNS name, since you are connecting from outside the VPC).
2. List the keystore entries to verify that the certificate was added. Note that a keystore can contain multiple such certificates:
$ keytool -list -keystore keystore.jks
3. Export this certificate from keystore.jks to a certificate file:
$ keytool -export -alias public-dnshostname -file example.com.crt -keystore keystore.jks
4. Add this certificate to the client's truststore to establish trust on the machine you want to connect from. Since you are connecting from a local instance, copy the certificate "example.com.crt" from the EMR master node to your local instance and then import it:
$ keytool -import -trustcacerts -alias public-dnshostname -file example.com.crt -keystore truststore.jks
5. Verify that the certificate exists in truststore.jks:
$ keytool -list -keystore truststore.jks
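Steps 1 to 5 can also be run without the interactive prompts. The following is a minimal sketch, not a definitive script: the host name, the store password, and the function name make_stores are placeholder assumptions, and for brevity it creates both stores in one directory, whereas in practice the truststore lives on the client machine.

```shell
# Sketch only: HOST and STOREPASS are hypothetical placeholders, not values from this post.
HOST="${HOST:-emr-master.example.com}"   # should be the master's public DNS name
STOREPASS="${STOREPASS:-changeit}"       # keytool requires at least 6 characters

# make_stores DIR: create keystore.jks, example.com.crt and truststore.jks in DIR.
make_stores() {
  dir=$1
  # Step 1: self-signed key pair; -dname sets the CN so no questions are asked
  keytool -genkeypair -alias "$HOST" -keyalg RSA -keysize 2048 \
    -dname "CN=$HOST" -validity 365 \
    -keystore "$dir/keystore.jks" -storepass "$STOREPASS" -keypass "$STOREPASS" || return 1
  # Step 3: export the certificate to a file
  keytool -exportcert -alias "$HOST" -file "$dir/example.com.crt" \
    -keystore "$dir/keystore.jks" -storepass "$STOREPASS" || return 1
  # Step 4: import into the truststore; -noprompt skips the trust confirmation
  keytool -importcert -noprompt -alias "$HOST" -file "$dir/example.com.crt" \
    -keystore "$dir/truststore.jks" -storepass "$STOREPASS" || return 1
  # Step 5: list the truststore entries to confirm the import
  keytool -list -keystore "$dir/truststore.jks" -storepass "$STOREPASS"
}
```

Calling make_stores /tmp/ssl would leave the three files in /tmp/ssl. Passing passwords on the command line exposes them in the process list, so the interactive form may be preferable on shared machines.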
Once the certificate is imported, make the following changes in /etc/hive/conf/hive-site.xml:
+++
hive.server2.transport.mode : http
hive.server2.use.SSL : true
hive.server2.keystore.path : path/to/your/keystore/jks
hive.server2.keystore.password : keystorepassword
+++
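hive-site.xml is an XML file, so each name/value pair above goes into its own <property> element. As a rough sketch, the helper below (hive_property is a hypothetical name, not part of Hive) prints the elements so they can be pasted inside the <configuration> section. It does not escape XML special characters, so a password containing & or < would need escaping by hand.

```shell
# Print one hive-site.xml <property> element for a name/value pair.
# Note: values are emitted as-is, with no XML escaping.
hive_property() {
  printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

# The four settings from this post (path and password are placeholders).
hive_property hive.server2.transport.mode http
hive_property hive.server2.use.SSL true
hive_property hive.server2.keystore.path path/to/your/keystore/jks
hive_property hive.server2.keystore.password keystorepassword
```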
Restart hive-server2 and the Thrift server:
$ sudo stop hive-server2 && sudo start hive-server2
$ sudo -u spark /usr/lib/spark/sbin/stop-thriftserver.sh && sudo -u spark /usr/lib/spark/sbin/start-thriftserver.sh
Check that the services started successfully and verify that the master instance is listening on port 10001:
+++
$ sudo netstat -tulpan |grep 10001
tcp 0 0 :::10001 :::* LISTEN 12494/java
+++
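If you want to script this check, for example to wait for the restart to finish, a small filter over the netstat output is enough. This is a sketch under the assumption that the output looks like the sample above; is_listening is a hypothetical helper name.

```shell
# Succeeds if stdin (netstat output) shows a listening socket on the given port.
is_listening() {
  grep -Eq ":$1[[:space:]].*LISTEN"
}

# Usage on the master node:
# sudo netstat -tulpan | is_listening 10001 && echo "port 10001 is listening"
```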
Once the service has started, you can connect using the JDBC driver.
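A connection string for this setup would look roughly like the one built below. Treat it as a sketch: the host, truststore path and password are placeholders, hive_jdbc_url is a hypothetical helper, the database name default is an assumption, and httpPath=cliservice assumes hive.server2.thrift.http.path was left at its Hive default.

```shell
# Build a HiveServer2 JDBC URL for HTTP transport over SSL on port 10001.
# Arguments: host, truststore path, truststore password (all placeholders).
hive_jdbc_url() {
  printf 'jdbc:hive2://%s:10001/default;ssl=true;sslTrustStore=%s;trustStorePassword=%s;transportMode=http;httpPath=cliservice\n' \
    "$1" "$2" "$3"
}

# Example, run on the client machine that holds truststore.jks:
# beeline -u "$(hive_jdbc_url public-dnshostname /path/to/truststore.jks truststorepassword)"
```

The same URL can be passed to any JDBC client that uses the Hive driver. The host in the URL should match the name in the certificate from step 1.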
Need to learn more about AWS Big Data (Demystified)?
- Contact me via LinkedIn: Omid Vahdaty
- website: https://amazon-aws-big-data-demystified.ninja/
- Join our meetup, FB group and youtube channel
- Join our meetup : https://www.meetup.com/AWS-Big-Data-Demystified/
- Join our facebook group https://www.facebook.com/groups/amazon.aws.big.data.demystified/
- subscribe to our youtube channel https://www.youtube.com/channel/UCzeGqhZIWU-hIDczWa8GtgQ?view_as=subscriber
——————————————————————————————————————————
I put a lot of thought into these blogs so that I can share the information in a clear and useful way. If you have any comments, thoughts, or questions, or you need someone to consult with, feel free to contact me:
https://www.linkedin.com/in/omid-vahdaty/