Are primary keys and indexes possible in the Hive query language?

Problem description

We are trying to migrate Oracle tables to Hive and process them there. The Oracle tables currently have primary key, foreign key, and unique key constraints.

Can we replicate the same constraints in Hive?

We are analyzing how this could be implemented.

Recommended answer

Hive indexing was introduced in Hive 0.7.0 (HIVE-417) and removed in Hive 3.0 (HIVE-18448). Please read the comments in that Jira issue. The feature proved practically useless in Hive: these indexes were too expensive for big data. RIP.

As of Hive 2.1.0 (HIVE-13290), Hive supports non-validated primary and foreign key constraints. Because Hive does not validate them, an upstream system has to ensure data integrity before the data is loaded into Hive. The constraints are still useful for tools that generate ER diagrams and queries, and they act as self-documentation: if a table declares such a constraint, you can easily see which columns are supposed to be the PK.
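As a rough illustration of carrying Oracle keys over as informational constraints, the sketch below builds Hive `ALTER TABLE ... ADD CONSTRAINT` statements with the `DISABLE NOVALIDATE` clause. The helper functions and the table, column, and constraint names (`customers`, `orders`, `pk_customers`, and so on) are hypothetical and only serve the example; the statements are generated as strings and would still have to be run against Hive 2.1.0 or later.

```python
def primary_key_ddl(table, constraint, columns):
    """Build a Hive statement declaring a non-validated primary key."""
    cols = ", ".join(columns)
    return (
        f"ALTER TABLE {table} ADD CONSTRAINT {constraint} "
        f"PRIMARY KEY ({cols}) DISABLE NOVALIDATE"
    )


def foreign_key_ddl(table, constraint, columns, ref_table, ref_columns):
    """Build a Hive statement declaring a non-validated foreign key."""
    cols = ", ".join(columns)
    ref_cols = ", ".join(ref_columns)
    return (
        f"ALTER TABLE {table} ADD CONSTRAINT {constraint} "
        f"FOREIGN KEY ({cols}) REFERENCES {ref_table}({ref_cols}) "
        f"DISABLE NOVALIDATE"
    )


if __name__ == "__main__":
    # Hypothetical tables standing in for the migrated Oracle schema.
    print(primary_key_ddl("customers", "pk_customers", ["customer_id"]))
    print(
        foreign_key_ddl(
            "orders", "fk_orders_customer",
            ["customer_id"], "customers", ["customer_id"],
        )
    )
```

Hive stores these declarations as metadata only, so loading duplicate or orphaned keys will still succeed.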

In an Oracle database, unique, PK, and FK constraints are backed by indexes, so they work fast and are really useful. But this is not how Hive works, nor what it was designed for.

A quite normal scenario is that you load a very big file of semi-structured data into HDFS. Building an index on it is too expensive, and without an index the only way to check for PK violations is to scan all the data. Normally you cannot enforce constraints in big data. An upstream process can take care of data integrity and consistency, but that does not guarantee you will never end up with a PK violation in Hive, for example in a big table loaded from several different sources.
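To make the "scan all the data" point concrete, here is a minimal sketch of a duplicate-key check written as a `GROUP BY ... HAVING` aggregation. It runs against an in-memory SQLite database purely as a stand-in so the example is self-contained; the `customers` table and its rows are made up. The same query shape should work in HiveQL, where it would read every row of the table.

```python
import sqlite3

# Full-scan duplicate check: every row is grouped by the intended key.
DUPLICATE_KEY_QUERY = """
SELECT customer_id, COUNT(*) AS cnt
FROM customers
GROUP BY customer_id
HAVING COUNT(*) > 1
ORDER BY customer_id
"""


def build_demo_connection():
    """Create a small in-memory table (SQLite stands in for Hive here)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "a"), (2, "b"), (2, "b-duplicate"), (3, "c")],
    )
    return conn


def find_duplicate_keys(conn):
    """Return (key, count) pairs for keys that appear more than once."""
    return conn.execute(DUPLICATE_KEY_QUERY).fetchall()


if __name__ == "__main__":
    print(find_duplicate_keys(build_demo_connection()))  # [(2, 2)]
```

Running such a check after each load is one way for the pipeline itself to detect violations that Hive will not reject.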

Some file storage formats, such as ORC, have internal lightweight "indexes" that speed up filtering and enable predicate push down (PPD), but no PK or FK constraints are implemented using them. That cannot be done, because a Hive table normally consists of many such files, and the files can even have different schemas. Hive was created for petabytes: you can process petabytes in a single run, the data can be semi-structured, and files can have different schemas. Hadoop does not support random writes, which adds further complication and cost if you want to rebuild indexes.
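The sketch below is a simplified model of that idea, not ORC's actual implementation: each `Stripe` (a hypothetical class) keeps min/max statistics for one column, and a filter skips stripes whose range cannot contain the target value. Because the statistics describe only their own stripe, they help skip data but cannot tell you whether a key is unique across all files of a table.

```python
from dataclasses import dataclass, field


@dataclass
class Stripe:
    """A chunk of one column plus its min/max statistics."""
    values: list
    min_value: int = field(init=False)
    max_value: int = field(init=False)

    def __post_init__(self):
        # Statistics are computed once, when the stripe is written.
        self.min_value = min(self.values)
        self.max_value = max(self.values)


def scan_equal(stripes, target):
    """Return (matching values, number of stripes actually read)."""
    matches = []
    stripes_read = 0
    for stripe in stripes:
        # Predicate pushdown: skip stripes whose range excludes the target.
        if target < stripe.min_value or target > stripe.max_value:
            continue
        stripes_read += 1
        matches.extend(v for v in stripe.values if v == target)
    return matches, stripes_read


if __name__ == "__main__":
    # Two hypothetical files belonging to the same table.
    file_a = [Stripe([1, 2, 3]), Stripe([10, 11, 12])]
    file_b = [Stripe([3, 4, 5])]
    print(scan_equal(file_a + file_b, 3))  # ([3, 3], 2)
```

In this toy data, key 3 appears in both files. The statistics let the scan skip one stripe, yet nothing in them prevented the duplicate from being written.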
