Presto migrating from Hive

2021-07-09

Hive向Presto迁移需要注意的tips

区分PrestoDB/PrestoSQL

就在 2020 年 12 月 27 日，prestosql 与 facebook 正式分裂,并改名为trino。分裂之前和之后的官网分别是：https://prestosql.io/ 和 https://trino.io。

Insert overwrite不可用

presto 不支持 insert overwrite。

目前关于表结构的创建和删除，比如create、drop等操作依然用hive实现，为了保持与现有表结构的一致性。

而select、insert等操作使用presto来完成。

对于hive中insert overwrite table而言，可以使用drop table、create table、insert into table的方式替代。

对于hive中insert overwrite directory而言，可以使用drop table、create external table（location指定到需要的路径）、insert into table来代替。

external table

CREATE TABLE [ IF NOT EXISTS ]
table_name (
  { column_name data_type [ NOT NULL ]
      [ COMMENT comment ]
      [ WITH ( property_name = expression [, ...] ) ]
  | LIKE existing_table_name
      [ { INCLUDING | EXCLUDING } PROPERTIES ]
  }
  [, ...]
)
[ COMMENT table_comment ]
[ WITH ( property_name = expression [, ...] ) ]

可以看到presto没有外部表这个概念，在hive和presto的兼容性上存在一定问题，尽可能用hive来执行ddl语句。

同时如果表格是用hive创建的外部表，presto是无法写入数据的，会报错Cannot write to non-managed Hive table。

关于partition的细节

利用hive建立table以及分区后，执行presto的inset语句，会在hdfs对应目录下新建相关的分区路径，但是需要手动将路径绑定到hive表的对应分区上，否则hive识别不了分区中的数据。

错误例子如下

CREATE TABLE IF NOT EXISTS tmp.presto_tmp (
  ID  BIGINT
  , cityid  BIGINT
)
PARTITIONED BY (date_jobid STRING);

-- 执行下条的删除分区不会报错
alter table tmp.presto_tmp drop partition (date_jobid='temp_id');

insert into tmp.presto xxxxxxxxxxx where date_jobid='temp_id';

-- 此时hive执行select时识别不到对应分区，但是删除分区时又会识别到hdfs上分区的路径，并且产生无法删除的错误
alter table tmp.presto_tmp drop partition (date_jobid='temp_id');

解决方案如下

CREATE TABLE IF NOT EXISTS tmp.presto_tmp (
  ID  BIGINT
  , cityid  BIGINT
)
PARTITIONED BY (date_jobid STRING);

alter table tmp.presto_tmp drop partition (date_jobid='temp_id');

insert into tmp.presto_tmp xxxxxxxxxxx where date_jobid='temp_id';

-- 此时绑定分区到hive表上
alter table tmp.presto_tmp add IF NOT EXISTS partition(date_jobid='temp_id');

alter table tmp.presto_tmp drop partition (date_jobid='temp_id');

直接执行sql是可以的，但是如果由shell脚本执行sql语句可能会因为权限问题导致失败。

regexp_replace只能修改String类型

hive中regexp_replace可以直接修改tinyint类型，因为tinyint在hive中可以隐式转换成string，但是presto不可以，需要先用cast转换类型。

整型除法运算

hive中 select 1/10=0.1，而presto中结果为0，解决方法如下

1	select cast(1 as DOUBLE)/10;

官方介绍如下

Use CAST when dividing integers

Trino follows the standard behavior of performing integer division when dividing two integers. For example, dividing 7 by 2 will result in 3, not 3.5. To perform floating point division on two integers, cast one of them to a double:

1	SELECT CAST(5 AS DOUBLE) / 2

Array类型的细节

在hive中，用圆括号：array(‘A’,’B’,’C’)，下标从0开始

但是presto中，用方括号：array[‘A’,’B’,’C’]，下标从1开始

LATERAL VIEW explode用CROSS JOIN UNNEST替换

官方文档如下

Use UNNEST to expand arrays and maps

Trino supports UNNEST for expanding arrays and maps. Use UNNEST instead of LATERAL VIEW explode().

Hive query:

1
2
3

SELECT student, score
FROM tests
LATERAL VIEW explode(scores) t AS score;

Trino query:

1
2
3

SELECT student, score
FROM tests
CROSS JOIN UNNEST(scores) AS t (score);

LATERAL VIEW OUTER explode的替换

方法一

1	LEFT JOIN UNNEST(x) t(alias) ON true;

This is available since Presto 319

方法二

1	CROSS JOIN UNNEST(coalesce(x,array[null]))

目前采用这种形式

用coalesce替代nvl

presto没有nvl函数

Hive query:

1	nvl(my_field,0)

Trino query:

1	coalesce(my_field,0)

列转行

需要单独去下文的链接去看一下，主要是聚合函数的名称不同

hive

1	select collect_set(col) from table

presto

https://trino.io/docs/current/functions/aggregate.html

1	select array_agg(col) from table

不推荐用聚合函数来直接实现列转行，开销过大，可以尝试结合实际场景使用别的方法。

用approx_distinct代替COUNT DISTINCT

WARNING: COUNT(DISTINCT xxx) can be a very expensive operation when the cardinality is high for xxx. In most scenarios, using approx_distinct instead would be enough

Returns the approximate number of distinct input values. This function provides an approximation of count(DISTINCT x). Zero is returned if all input values are null.

This function should produce a standard error of no more than e, which is the standard deviation of the (approximately normal) error distribution over all possible sets. It does not guarantee an upper bound on the error for any specific input set. The current implementation of this function requires that e be in the range of [0.0040625, 0.26000].

具体效果需要运行后进行对比，不要轻易使用，有一定误差，目前测试下来千位以下的效果都不是很好，遇到超大数据可以尝试使用。

使用Join语句时将大表放在左边

Presto中 join 的默认算法是broadcast join，即将 join 左边的表分割到多个 worker ，然后将join 右边的表数据整个复制一份发送到每个worker进行计算。如果右边的表数据量太大，则可能会报内存溢出错误。

1 2	[GOOD] SELECT ... FROM large_table l join small_table s on l.id = s.id [BAD] SELECT ... FROM small_table s join large_table l on l.id = s.id

用regexp_like代替多个like语句

Presto查询优化器没有对多个 like 语句进行优化，使用regexp_like对性能有较大提升。

[GOOD]
SELECT   ... FROM   access WHERE   regexp_like(method, 'GET|POST|PUT|DELETE')
[BAD]
SELECT   ... FROM   access WHERE   method LIKE '%GET%' OR    method LIKE '%POST%' OR   method LIKE '%PUT%' OR   method LIKE '%DELETE%'

用GROUP BY语句时，GROUP BY的目标可用数字代替

在Presto SQL中，GROUP BY语句需要与SELECT语句中的表达式保持一致，不然会提示语法错误。

例如：

1
2
3

SELECT TD_TIME_FORMAT(time, 'yyyy-MM-dd HH', 'PDT') hour, count(*) cnt
FROM my_table
GROUP BY TD_TIME_FORMAT(time, 'yyyy-MM-dd HH', 'PDT')

上面的SQL语句的GROUP BY部分可以用GROUP BY 1，2，3 …来表示

1
2
3

SELECT TD_TIME_FORMAT(time, 'yyyy-MM-dd HH', 'PDT') hour, count(*) cnt
FROM my_table
GROUP BY 1

group by字句优化

合理安排group by字句中字段的顺序可以稍微提升查询性能，尤其是在一个特别大的表中两个字段值数量差别特别打的时候，如果group by子句设置不好的话，会导致查询内存溢出。

优化策略:如果group by子句中包含两个字段，其中column1中的distinct值的数量要远大于column2中distinct值的数量，则需要将column1放在column2的左边，也就是 group by字句中的字段按照每个字段distinct数据的多少进行降序排序。

大表放在join字句左边

在默认情况下，presto使用distributed hash join算法，在这种算法中，join左右两边的表都会根据join键的值进行分区。左表的每个分区都会被通过网络传入到右表所在分区的worker节点上。也就是说，在进行join关联的时候，会先把右表的所有分区全部分布到各个计算节点上保存，然后等待将左表中的各个分区依次通过网络传输stream 到相应的计算节点上进行计算。由于右表的所有分区需要全部分布到各个节点上进行存储，所以有一个限制：就是集群中所有内存的代销一定要大于右表的大小。

如果你在执行join查询的时候看到错误：task exceeded max memory size，那么这经常意味着join连接的右表大于集群所有内存的大小。presto不会自动进行join两边表顺序的优化，因此在执行join查询的时候，请确保大表放在join的左边，小表放在join右边。

本文作者： xiaoyuyu
本文链接： http://woaixiaoyuyu.github.io/2021/07/09/Presto%20migrating%20from%20Hive/
版权声明： 本作品采用知识共享署名-非商业性使用-相同方式共享 4.0 国际许可协议进行许可。转载请注明出处！