Data Warehouse (10): A Zipper Table Development Example

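A zipper table (拉链表) maintains both the historical states and the latest state of the data. Depending on the zipper granularity, it is effectively a snapshot table with one optimization: records that did not change are not stored again. With a zipper table it is easy to restore the customer records as of any zipper point in time.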
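To make "restoring the records as of a point in time" concrete, here is a minimal sketch in Python. The rows, the inclusive `end_date`, and the `9999-12-31` marker for the still-open record are all assumptions made for illustration; they are not taken from the original table.

```python
from datetime import date

# Hypothetical zipper-table rows: (spu_id, start_date, end_date, sale_price).
# Assumptions: end_date is inclusive, and the currently valid row is closed
# with the far-future date 9999-12-31.
ZIPPER = [
    ("A001", date(2022, 1, 1), date(2022, 1, 9), 100),
    ("A001", date(2022, 1, 10), date(9999, 12, 31), 120),
    ("B002", date(2022, 1, 5), date(9999, 12, 31), 50),
]


def snapshot(rows, as_of):
    """Return {spu_id: sale_price} for the rows whose validity range covers as_of."""
    return {spu: price for spu, start, end, price in rows if start <= as_of <= end}


print(snapshot(ZIPPER, date(2022, 1, 3)))   # {'A001': 100}
print(snapshot(ZIPPER, date(2022, 1, 15)))  # {'A001': 120, 'B002': 50}
```

In Hive the same lookup would be a plain range filter on start_date and end_date; the point is only that one row per unchanged period is enough to rebuild any daily snapshot.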

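This post uses changes in product prices as the example. In real development you should follow your actual situation rather than copy the code as is. What matters in programming is understanding the idea and the principle behind the code, not Ctrl+C and Ctrl+V. Copy-pasting may help a lot with getting work done quickly, but it does little for learning and improving.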

Before we start: the data warehouse environment used here is Hive.

First, take a look at the raw data:

[Figure: raw product price data]

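The approach is as follows. First, insert the latest product records into the history zipper table. Then use Hive window functions, ordered by end_date, to fetch the next row's sale_price and end_date for each row. Next, compare the current row's price with the next row's price; if they are the same, change the current row's end_date to the next row's end_date. Finally, deduplicate, and we get the data we want.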

Having said all that, I think it is better to just post the SQL; code is the best language.

Talk is cheap. Show me the code.

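-- The raw product table is named goods_table here
select spu_id,
       min(start_date) as start_date,  -- earliest start among the rows merged into one range
       end_date,
       sale_price
from
  (select spu_id,
          start_date,
          -- same price as the next row: extend this row to the next row's end_date
          if(sale_price = lead_sale_price, lead_end_date, end_date) as end_date,
          sale_price
   from
     (select spu_id,
             start_date,
             end_date,
             sale_price,
             -- next row's price and end_date for the same product (null on the last row)
             lead(sale_price, 1, null) over (partition by spu_id order by end_date) as lead_sale_price,
             lead(end_date, 1, null) over (partition by spu_id order by end_date) as lead_end_date
      from goods_table) t1) t2
-- deduplicate: rows that now share spu_id, end_date and sale_price collapse into one
group by spu_id,
         end_date,
         sale_price;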
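To see what the three steps of the query do to actual rows, the sketch below replays them in plain Python on a tiny, invented goods_table. The function name `merge_zipper` and all sample values are hypothetical; only the step order (lead, conditional end_date, group by with min) follows the SQL.

```python
from datetime import date

# Hypothetical goods_table rows after the latest record has been appended:
# (spu_id, start_date, end_date, sale_price). All values are invented.
goods_table = [
    ("A001", date(2022, 1, 1), date(2022, 1, 9), 100),
    ("A001", date(2022, 1, 10), date(2022, 1, 19), 100),  # same price as the row before
    ("A001", date(2022, 1, 20), date(2022, 1, 31), 120),
]


def merge_zipper(rows):
    # Step 1: emulate lead() over (partition by spu_id order by end_date).
    by_spu = {}
    for row in sorted(rows, key=lambda r: (r[0], r[2])):
        by_spu.setdefault(row[0], []).append(row)

    extended = []
    for spu_rows in by_spu.values():
        for i, (spu, start, end, price) in enumerate(spu_rows):
            nxt = spu_rows[i + 1] if i + 1 < len(spu_rows) else None
            # Step 2: if(sale_price = lead_sale_price, lead_end_date, end_date)
            if nxt is not None and nxt[3] == price:
                end = nxt[2]
            extended.append((spu, start, end, price))

    # Step 3: group by spu_id, end_date, sale_price and keep min(start_date).
    merged = {}
    for spu, start, end, price in extended:
        key = (spu, end, price)
        merged[key] = min(start, merged.get(key, start))
    return sorted((spu, start, end, price) for (spu, end, price), start in merged.items())


for row in merge_zipper(goods_table):
    print(row)
# ('A001', datetime.date(2022, 1, 1), datetime.date(2022, 1, 19), 100)
# ('A001', datetime.date(2022, 1, 20), datetime.date(2022, 1, 31), 120)
```

Note that a single pass only merges a row with its immediate successor. That is enough when the history is already compacted and each run appends one new record per product, but a run of three or more consecutive rows with the same price would leave overlapping ranges and would need the pass to be repeated.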

Reference article: 数据仓库(10)数仓拉链表开发实例

Original: https://www.cnblogs.com/the-pig-of-zf/p/16230544.html
Author: 张飞的猪
Title: 数据仓库(10)数仓拉链表开发实例
