While setting up Apache Hive, HiveServer2 and Beeline (using vanilla packages instead of some kind of prepackaged Hadoop distribution), I struggled with some permission/user related problems. The error message I got stuck with was something like this:
org.apache.hadoop.security.authorize.AuthorizationException User: hive is not allowed to impersonate johndoe
While googling around for this, I found some parts of the puzzle, but I didn't really encounter an explanation that connected all the necessary dots to solve the problem for my use case:
- using "simple" Hadoop authentication, with standard Linux users
- HDFS namenode and datanodes are running as user `hdfs` on a bunch of Hadoop cluster machines, let's call them `hadoopXX`
- YARN resource manager and node managers are running as user `yarn` on these same machines
- HiveServer2 is running as user `hive` on a separate machine, let's call it `work01`
- A normal user, e.g. `johndoe`, also working from this separate machine `work01`, wants to use Beeline to run a Hive query
After quite a bit of rabbit hole crawling I got it working without the "impersonation" error above, as follows:
- Configure the `hive` user to be a proxy user so that HiveServer2 (which runs as user `hive`) can impersonate other users (e.g. `johndoe`). I added this to the Hadoop config `core-site.xml`:
      <property>
        <name>hadoop.proxyuser.hive.hosts</name>
        <value>*</value>
      </property>
      <property>
        <name>hadoop.proxyuser.hive.groups</name>
        <value>*</value>
      </property>
  The proxy user feature is only available for superusers, so also make sure this `hive` user belongs to the Linux user group with the name of the HDFS superuser group (usually `supergroup`).
- Make sure the Linux user `hive` exists and belongs to this superuser group, not only on `work01`, but also on the `hadoopXX` machines. Otherwise the HDFS namenode and YARN resource manager won't handle the
- Restart the HDFS namenode and YARN resource manager services, so the new configs in `core-site.xml` are picked up.
- Restart the HiveServer2 service
- If you are running the HiveServer2 service in a non-managed/non-daemon way from an interactive shell, it might be necessary to start a new shell/session before restarting the service so that user `hive`'s supergroup membership is picked up as intended.
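
The wildcard values in the proxy user properties are the most permissive choice. As a rough sketch of a tighter variant: Hadoop's proxy user settings also accept comma-separated lists of hosts (hostnames or IP addresses) and of groups. The fragment below assumes that HiveServer2 only ever connects from `work01`, and that the users to be impersonated share a Linux group, here given the hypothetical name `analysts`. Neither restriction is part of the setup described above, so treat the values as placeholders.

```xml
<!-- core-site.xml: narrower alternative to the "*" wildcards (illustrative values) -->
<property>
  <!-- Hosts from which user hive may impersonate others; work01 runs HiveServer2 -->
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>work01</value>
</property>
<property>
  <!-- Only members of these groups may be impersonated; "analysts" is a hypothetical group name -->
  <name>hadoop.proxyuser.hive.groups</name>
  <value>analysts</value>
</property>
```

As with the wildcard version, the namenode and resource manager need to pick up the changed `core-site.xml` before this takes effect. Depending on how the cluster machines resolve each other, the host entry may have to be an IP address rather than a hostname.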