Enabling Sentry on CDH

Steps for enabling Sentry on a CDH cluster, with tests.

After adding the Sentry service in CDH, follow Cloudera's "Configuring the Sentry Service" guide step by step to configure it.

Before Enabling the Sentry Service

  1. Set the permissions and owner of the directory named by hive.metastore.warehouse.dir (default: /user/hive/warehouse).
    $ hdfs dfs -chmod -R 771 /user/hive/warehouse
    $ hdfs dfs -chown -R hive:hive /user/hive/warehouse

If Kerberos is already enabled, run kinit -k -t hdfs.keytab hdfs first.

  2. Disable impersonation for HiveServer2
    Setting: hive – HiveServer2 Enable Impersonation
  3. Enable the Hive user to submit YARN jobs
    Ensure the Allowed System Users property includes the hive user. If not, add hive.
    Setting: yarn – allowed.system.users

Important: Ensure you have unchecked the Enable Sentry Authorization using Policy Files configuration property for both Hive and Impala under the Policy File Based Sentry category before you proceed.
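
The warehouse permission step above can be sketched as a small script. This is only an illustration: the names run, prepare_warehouse and DRY_RUN are made up for this example, and the script defaults to dry-run mode, printing the commands rather than executing them.

```shell
# Sketch: prepare the Hive warehouse directory before enabling Sentry.
# run, prepare_warehouse and DRY_RUN are illustrative names, not CDH settings.

run() {
  # In dry-run mode (the default) print the command; otherwise execute it.
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

prepare_warehouse() {
  local dir="${1:-/user/hive/warehouse}"
  local keytab="${2:-}"
  # On a Kerberized cluster, authenticate as the hdfs user first.
  if [ -n "$keytab" ]; then
    run kinit -k -t "$keytab" hdfs
  fi
  run hdfs dfs -chmod -R 771 "$dir"
  run hdfs dfs -chown -R hive:hive "$dir"
}

prepare_warehouse /user/hive/warehouse
```

Setting DRY_RUN=0 makes the same function actually run the commands. Pass the keytab path as the second argument when Kerberos is enabled.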

Enabling the Sentry Service for Hive

  1. In the Hive configuration, set the Sentry Service property to "Sentry".
  2. Uncheck hive.server2.enable.impersonation.

Enabling the Sentry Service for Impala

In the Impala configuration, set the Sentry Service property to "Sentry".

Enabling the Sentry Service for Hue

In the Hue configuration, set the Sentry Service property to "Sentry".

Important:

  1. When Sentry is enabled, you must use Beeline to execute Hive queries. Hive CLI is not supported with Sentry and must be disabled as described here.
  2. When Sentry is enabled, a user with no privileges on a database will not be allowed to connect to HiveServer2. This is because the use command is now executed as part of the connection to HiveServer2, which is why the connection fails. See HIVE-4256.

Configuring Hive with Sentry

http://www.cloudera.com/documentation/enterprise/5-4-x/topics/sg_hive_sql.html

If Kerberos is enabled

With Kerberos enabled, use the following commands to start beeline and configure privileges:

$ kinit -k -t hive.keytab hive
$ beeline -u "jdbc:hive2://vlnx107011:10000/default;principal=hive/vlnx107011@HADOOP.COM"

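If Kerberos is not enabled

In the Hive configuration, add the following to the Hive Service Advanced Configuration Snippet (Safety Valve) for sentry-site.xml:

<property>
  <name>sentry.hive.testing.mode</name>
  <value>true</value>
</property>

You can then connect with beeline -u "jdbc:hive2://vlnx107011:10000/" -n <admin_user> to configure privileges. The admin user must belong to a group listed in Sentry's sentry.service.admin.group property.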
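
The two connection styles differ only in the principal suffix of the JDBC URL. A minimal sketch of building that URL follows. The helper name hs2_url is hypothetical, and the host, port and principal are the example values from this post.

```shell
# Build a HiveServer2 JDBC URL. Pass a Kerberos principal as the fourth
# argument on a Kerberized cluster, or omit it otherwise.
# hs2_url is an illustrative helper, not part of beeline.
hs2_url() {
  local host="$1"
  local port="${2:-10000}"
  local db="${3:-default}"
  local principal="${4:-}"
  local url="jdbc:hive2://${host}:${port}/${db}"
  if [ -n "$principal" ]; then
    url="${url};principal=${principal}"
  fi
  printf '%s\n' "$url"
}

hs2_url vlnx107011 10000 default "hive/vlnx107011@HADOOP.COM"
hs2_url vlnx107011
```

The result can be passed to beeline with -u, adding -n <admin_user> in the non-Kerberos case.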

Important: Users and groups are the users and groups of the Linux hosts, whereas roles must be created by yourself.
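
Because group membership comes from the operating system, it is worth checking what groups a user resolves to on the host before granting roles. The sketch below assumes groups are resolved locally with id. The function names has_group and user_in_group are made up for this example.

```shell
# Return 0 if group $2 appears in the space-separated group list $1.
has_group() {
  local g
  for g in $1; do          # intentionally unquoted: split the list on spaces
    [ "$g" = "$2" ] && return 0
  done
  return 1
}

# Return 0 if Linux user $1 is a member of group $2 on this host.
user_in_group() {
  has_group "$(id -Gn "$1" 2>/dev/null)" "$2"
}

# Example (hypothetical user and group names):
#   user_in_group alice analyst && echo "alice is in analyst"
```

If the check fails on the host that resolves groups, roles granted to that group will not apply to the user.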

Configuring HDFS with Sentry

See http://www.cloudera.com/documentation/enterprise/5-4-x/topics/sg_hdfs_sentry_sync.html

For HDFS ACLs, see http://www.cloudera.com/documentation/enterprise/5-4-x/topics/cdh_sg_hdfs_ext_acls.html

  1. Enable HDFS ACLs
  2. Enable Sentry synchronization
  3. Check the HDFS permissions setting, dfs.permissions
  4. Set the Sentry synchronization path prefixes, sentry.hdfs.integration.path.prefixes. Multiple prefixes can be given.

Sentry-HDFS authorization is focused on Hive warehouse data - that is, any data that is part of a table in Hive or Impala. The real objective of this integration is to expand the same authorization checks to Hive warehouse data being accessed from any other components such as Pig, MapReduce or Spark. At this point, this feature does not replace HDFS ACLs. Tables that are not associated with Sentry will retain their old ACLs.

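Known issues:

  1. Changing sentry.hdfs.integration.path.prefixes requires an HDFS restart.
  2. Once synchronization is enabled, HDFS ACLs no longer take effect.
  3. HDFS URIs are not normalized to a canonical form automatically. Sentry treats /facishare-data/, hdfs:///facishare-data/, hdfs://nameservice1/facishare-data/, and hdfs://nameservice1:8020/facishare-data/ as different paths.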
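
One way to work around the third issue is to pick a single URI form and rewrite every path to it before using it in a GRANT ... ON URI statement. The sketch below is an assumption about a reasonable convention (hdfs://<nameservice>/<path>, no port), not something Sentry prescribes. canonical_hdfs_uri is a hypothetical helper.

```shell
# Rewrite an HDFS path or URI to the form hdfs://<nameservice><path>.
# $1 = nameservice to use, $2 = path or hdfs:// URI.
canonical_hdfs_uri() {
  local ns="$1"
  local uri="$2"
  local rest path
  case "$uri" in
    hdfs://*)
      rest="${uri#hdfs://}"            # drop the scheme
      case "$rest" in
        */*) path="/${rest#*/}" ;;     # drop the authority (host[:port]), keep the path
        *)   path="/" ;;               # authority only, no path
      esac
      ;;
    /*) path="$uri" ;;                 # already a bare absolute path
    *)  echo "unsupported path: $uri" >&2; return 1 ;;
  esac
  printf 'hdfs://%s%s\n' "$ns" "$path"
}

# All four forms from the list above map to the same URI.
for u in /facishare-data/ hdfs:///facishare-data/ \
         hdfs://nameservice1/facishare-data/ hdfs://nameservice1:8020/facishare-data/; do
  canonical_hdfs_uri nameservice1 "$u"
done
```

Whatever form is chosen, the point is to use it consistently in every grant.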

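Configuring Sentry in Hue

http://gethue.com/apache-sentry-made-easy-with-the-new-hue-security-app/#howto
A service account was created in LDAP for configuring Sentry from Hue.

  1. Sync this account and its group to all hosts.
  2. In Sentry, add the account's group to the admin groups in sentry.service.admin.group.
  3. In Hue, create a hive group and add this account to it.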

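Appendix:

Hive privileges

Privilege      Description
ALL            All privileges
ALTER          Modify an object's metadata (table information)
UPDATE         Modify an object's physical data (the actual data)
CREATE         Perform CREATE operations
DROP           Perform DROP operations
INDEX          Create indexes (not yet implemented)
LOCK           Perform LOCK and UNLOCK operations when concurrency is in use
SELECT         Perform SELECT operations
SHOW_DATABASE  List the available databases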

Hive SQL Syntax for Use with Sentry

Creating and dropping roles

  1. Create a role: CREATE ROLE ROLE_NAME
  2. Drop a role: DROP ROLE ROLE_NAME

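Granting and revoking roles

GRANT ROLE assigns a role, and with it the role's privileges (creating tables, querying tables, and so on), to one or more groups. REVOKE ROLE does the opposite. The syntax is:

GRANT ROLE role_name [, role_name] TO GROUP <groupName> [,GROUP <groupName>]
REVOKE ROLE role_name [, role_name] FROM GROUP <groupName> [,GROUP <groupName>]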

Granting and revoking privileges

GRANT <PRIVILEGE> [, <PRIVILEGE> ] ON <OBJECT> <object_name> TO ROLE <roleName> [,ROLE <roleName>]
REVOKE <PRIVILEGE> [, <PRIVILEGE> ] ON <OBJECT> <object_name> FROM ROLE <roleName> [,ROLE <roleName>]

Viewing role and group privileges

SHOW ROLES;
SHOW CURRENT ROLES;
SHOW ROLE GRANT GROUP <groupName>;
SHOW GRANT ROLE <roleName>;
SHOW GRANT ROLE <roleName> ON OBJECT <objectName>;

Examples:

  • Grant the role_test1 role to the test group: GRANT ROLE role_test1 TO GROUP test
  • Show the roles granted to the test group: SHOW ROLE GRANT GROUP test
  • Revoke the role_test1 role from the test group: REVOKE ROLE role_test1 FROM GROUP test
  • Grant privileges to analyst_role:

    CREATE ROLE analyst_role;
    GRANT ALL ON DATABASE analyst1 TO ROLE analyst_role;
    GRANT SELECT ON DATABASE jranalyst1 TO ROLE analyst_role;
    GRANT ALL ON URI 'hdfs://ha-nn-uri/landing/analyst1' TO ROLE analyst_role;
  • Grant privileges to junior_analyst_role:

    CREATE ROLE junior_analyst_role;
    GRANT ALL ON DATABASE jranalyst1 TO ROLE junior_analyst_role;
    GRANT ALL ON URI 'hdfs://ha-nn-uri/landing/jranalyst1' TO ROLE junior_analyst_role;
    GRANT ALL ON DATABASE test TO ROLE admin_role WITH GRANT OPTION;
  • Grant privileges to admin_role:

    CREATE ROLE admin_role;
    GRANT ALL ON SERVER server TO ROLE admin_role;
  • Grant roles to groups:

    GRANT ROLE admin_role TO GROUP admin;
    GRANT ROLE analyst_role TO GROUP analyst;
    GRANT ROLE junior_analyst_role TO GROUP jranalyst;
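
When there are many role-to-group mappings, the GRANT ROLE statements can be generated instead of typed by hand. This is a minimal sketch. grant_roles_sql is a hypothetical helper, and the role:group pairs reuse the roles created in the examples above.

```shell
# Emit one GRANT ROLE statement per "role:group" pair given as arguments.
# grant_roles_sql is an illustrative helper, not part of Hive or Sentry.
grant_roles_sql() {
  local pair
  for pair in "$@"; do
    # Text before the first ':' is the role, text after it is the group.
    printf 'GRANT ROLE %s TO GROUP %s;\n' "${pair%%:*}" "${pair#*:}"
  done
}

grant_roles_sql admin_role:admin analyst_role:analyst junior_analyst_role:jranalyst
```

The output can be redirected to a file and executed with beeline's -f option against the HiveServer2 URL used earlier.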

References:

  1. Authorization Privilege Model for Hive and Impala: http://www.cloudera.com/documentation/enterprise/5-4-x/topics/cm_sg_sentry_service.html#concept_cx4_sw2_q4_unique_1
  2. The Sentry Service: http://www.cloudera.com/documentation/enterprise/5-4-x/topics/cm_sg_sentry_service.html
  3. Apache Sentry made easy with the new Hue Security App: http://gethue.com/apache-sentry-made-easy-with-the-new-hue-security-app/#howto
  4. What is missing in Apache Sentry (incubating)?: http://getindata.com/blog/post/what-is-missing-in-apache-sentry-incubating/
  5. Apache Sentry Tutorial: https://cwiki.apache.org/confluence/display/SENTRY/Sentry+Tutorial