
EMR Zeppelin Security

Apache Shiro is installed with Zeppelin on EMR and enables the options below:

  • Basic authentication (via Apache Shiro): user management (user, password, groups), including LDAP

  • Notebook permission management: read/write/share

  • Data source authorization (e.g. a 3rd-party DB)


Adding HTTPS/SSL to the EMR Zeppelin GUI (3 options)


Walkthrough to add HTTPS to Zeppelin:

1) Generating the PKCS12 keystore file: log into the master instance of the EMR cluster and run the following commands:

openssl req -newkey rsa:2048 -nodes -keyout key.pem -x509 -days 365 -out certificate.pem

openssl x509 -text -noout -in certificate.pem

openssl pkcs12 -inkey key.pem -in certificate.pem -export -out certificate.p12

openssl pkcs12 -in certificate.p12 -noout -info

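When prompted for the Common Name (the hostname), enter the public DNS name of the master node. The export password you choose becomes the keystore password. Run from the hadoop user's home directory, the above commands create a file named /home/hadoop/certificate.p12. This file is your certificate keystore.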
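The commands above prompt interactively. As a rough sketch, the same keystore could be generated without prompts by passing the Common Name and the export password on the command line. The host name, password and output directory below are placeholders I made up for illustration; replace them with your own values.

```shell
#!/bin/sh
# Sketch: non-interactive variant of the keystore commands above.
set -eu

# Assumptions (not from the original post): placeholder host name,
# placeholder password, and a scratch output directory.
MASTER_DNS="${MASTER_DNS:-master.example.com}"
KEYSTORE_PASS="${KEYSTORE_PASS:-changeit}"
OUT_DIR="${OUT_DIR:-zeppelin-ssl-sketch}"

mkdir -p "$OUT_DIR"

# Self-signed certificate; -subj supplies the Common Name instead of the prompt.
openssl req -newkey rsa:2048 -nodes -keyout "$OUT_DIR/key.pem" \
  -x509 -days 365 -subj "/CN=${MASTER_DNS}" -out "$OUT_DIR/certificate.pem"

# Bundle key and certificate into a PKCS12 keystore protected by KEYSTORE_PASS.
openssl pkcs12 -export -inkey "$OUT_DIR/key.pem" -in "$OUT_DIR/certificate.pem" \
  -out "$OUT_DIR/certificate.p12" -passout "pass:${KEYSTORE_PASS}"

# Print the certificate subject so the Common Name can be checked by eye.
openssl x509 -noout -subject -in "$OUT_DIR/certificate.pem"
```

Passing a password as a command-line argument exposes it to other users on the machine through the process list, so this form is better suited to throwaway test clusters than to shared hosts.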


2) Change the properties below in the zeppelin-site.xml file located at /etc/zeppelin/conf.dist/zeppelin-site.xml (if it is not present, copy the /etc/zeppelin/conf.dist/zeppelin-site.xml.template file and rename it to zeppelin-site.xml):



zeppelin.ssl
Should SSL be used by the servers? (Set this to true.)

zeppelin.ssl.keystore.path
Path to keystore relative to Zeppelin configuration directory.

zeppelin.ssl.keystore.type
The format of the given keystore (e.g. JKS or PKCS12). The file generated in step 1 is PKCS12.

zeppelin.ssl.keystore.password
Keystore password. Can be obfuscated by the Jetty Password tool.

zeppelin.server.ssl.port 8445
Server SSL port (used when the ssl property is set to true).

3) Restart Zeppelin:

sudo stop zeppelin

sudo start zeppelin
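A quick check of the edited file can save a restart cycle. The sketch below writes a sample property block and then reports any SSL property that is missing from a given zeppelin-site.xml. The helper name check_ssl_props, the sample file name, and the sample values (keystore path and password) are assumptions for illustration only; the property names are the ones used in Zeppelin's zeppelin-site.xml template.

```shell
#!/bin/sh
# Sketch: report SSL-related properties missing from a zeppelin-site.xml.
# Assumes each <name>...</name> element sits on one line without inner spaces.
check_ssl_props() {
  site_file="$1"
  missing=0
  for prop in zeppelin.ssl zeppelin.ssl.keystore.path zeppelin.ssl.keystore.type \
              zeppelin.ssl.keystore.password zeppelin.server.ssl.port; do
    # -F matches the literal <name> element, so zeppelin.ssl does not
    # accidentally match zeppelin.ssl.keystore.path.
    if ! grep -qF "<name>${prop}</name>" "$site_file"; then
      echo "missing: ${prop}"
      missing=1
    fi
  done
  return "$missing"
}

# Hypothetical sample values; the keystore path and password are placeholders.
cat > sample-zeppelin-site.xml <<'EOF'
<configuration>
  <property>
    <name>zeppelin.ssl</name>
    <value>true</value>
  </property>
  <property>
    <name>zeppelin.ssl.keystore.path</name>
    <value>certificate.p12</value>
  </property>
  <property>
    <name>zeppelin.ssl.keystore.type</name>
    <value>PKCS12</value>
  </property>
  <property>
    <name>zeppelin.ssl.keystore.password</name>
    <value>changeit</value>
  </property>
  <property>
    <name>zeppelin.server.ssl.port</name>
    <value>8445</value>
  </property>
</configuration>
EOF

check_ssl_props sample-zeppelin-site.xml && echo "all SSL properties present"
```

Against a real cluster, the same function could be pointed at /etc/zeppelin/conf.dist/zeppelin-site.xml. It only checks that the property names exist, not that the values are valid.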

4) You should now be able to access Zeppelin over HTTPS on port 8445: https://<master-public-DNS>:8445/#/

User management via Shiro

To manage groups/roles, create them under the “[roles]” section of the “shiro.ini” file. For example, you could define a set of groups like:



    admin = *

    readonly = *

    poweruser = *

    scientist = *

    engineer = *

Then the “[users]” section could look like the below:


    admin = <password>, admin

    user1 = <password>, scientist, poweruser

    user2 = <password>, engineer, poweruser

    user3 = <password>, readonly



For example, the above means that:


    – user “admin” is in the “admin” group;

    – user “user1” is in the “poweruser” and “scientist” groups;

    – etc.
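A typo in a role name is easy to miss when editing by hand. As a minimal sketch, the awk-based helper below lists any role that a “[users]” entry references but “[roles]” does not define. The function name undefined_roles and the sample file name are hypothetical, and the parsing is deliberately simple: one entry per line, with the first comma-separated field taken as the password.

```shell
#!/bin/sh
# Sketch: print roles used in [users] that are not defined in [roles].
undefined_roles() {
  awk '
    /^[ \t]*\[users\][ \t]*$/ { section = "users"; next }
    /^[ \t]*\[roles\][ \t]*$/ { section = "roles"; next }
    /^[ \t]*\[/               { section = "other"; next }
    /^[ \t]*$/                { next }   # blank line
    /^[ \t]*[#;]/             { next }   # comment line
    {
      eq = index($0, "=")                # split on the first "=" only
      if (eq == 0) next
      key = substr($0, 1, eq - 1); gsub(/[ \t]/, "", key)
      val = substr($0, eq + 1)
    }
    section == "roles" { defined[key] = 1 }
    section == "users" {
      n = split(val, fields, ",")
      # fields[1] is the password; the remaining fields are role names
      for (i = 2; i <= n; i++) {
        role = fields[i]; gsub(/[ \t]/, "", role)
        used[role] = used[role] " " key
      }
    }
    END {
      for (role in used)
        if (!(role in defined)) print role ":" used[role]
    }
  ' "$1"
}

# The groups and users from the example above, assembled into one file.
cat > sample-shiro.ini <<'EOF'
[users]
admin = <password>, admin
user1 = <password>, scientist, poweruser
user2 = <password>, engineer, poweruser
user3 = <password>, readonly

[roles]
admin = *
readonly = *
poweruser = *
scientist = *
engineer = *
EOF

undefined_roles sample-shiro.ini   # no output: every role is defined
```

For a file where user4 is assigned a role named qa that is absent from “[roles]”, the helper would print the role followed by the users that reference it.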


On a notebook's permission page, these groups could then be assigned as follows:

    Owners:  admin

    Writers: scientist, engineer, poweruser

    Readers: readonly


Once the groups/roles are created, notebook authorization is configured in the usual way. For instance, on a notebook's permission page, you can enter a group name instead of individual users.

Good read: the recommendations from Hortonworks.


