Big Telco, Bigger DW Demands: Moving Towards SQL-on-Hadoop



slide-1
SLIDE 1

Big Telco, Bigger DW Demands: Moving Towards SQL-on-Hadoop

slide-2
SLIDE 2

Keuntae Park

  • IT Manager at SK Telecom, South Korea’s largest wireless communications provider

  • Worked on commercial products (until 2012)

– T-FS: distributed file system
– Windows-compatible layer on TimOS
– T-MR: on-demand MapReduce service like E-MR

  • Open source activity (since 2013)

– Committer on the Apache Tajo project

slide-3
SLIDE 3

Overview

  • Background

– Telco requirements

  • Before Tajo

– Commercial product
– Open source (Hadoop) outsourcing

  • After Tajo

– Issues & solutions
– Performance

  • Win-win between community and company
  • Future work
slide-4
SLIDE 4

Telco data characteristics

  • Huge amount of data

– 40 TB/day (compressed)
– 15 PB (estimated, end of 2014)

  • Reports & OLAP ad-hoc queries

– Filtering
– Summary
– BI tools

slide-5
SLIDE 5

Requirements - different size, different speed

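  • Filtering & aggregation

– Target: data accumulated for 5 minutes
– Frequency: every 5 minutes
– Amount of data: terabytes
– Response time: within a minute

  • Summary

– Target: daily sum of filtered data
– Frequency: daily or monthly
– Amount of data: hundreds of terabytes
– Response time: within an hour

  • Data reconstruction

– Target: entire summary data
– Frequency: non-regularly (rare)
– Amount of data: petabytes
– Response time: no strict deadline

  • BI report

– Target: mart data
– Frequency: ad-hoc
– Amount of data: tens of gigabytes
– Response time: within two seconds

  • Ad-hoc query

– Target: summary data
– Frequency: ad-hoc
– Amount of data: tens of terabytes
– Response time: within an hour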

slide-7
SLIDE 7

Previous approach - DBMS

based on MPP DBMS

Too expensive, not scalable

slide-11
SLIDE 11

Previous approach - Hadoop (MapReduce, Hive) + DBMS

MPP DBMS Hadoop

Working (but…)

slide-12
SLIDE 12

Still has problems

  • Hadoop outsourcing

– Quality of the outcome is not good (actually bad)
– Communication overhead
– Hard to get our requirements reflected in the open source code

  • The data warehouse and data marts keep getting bigger
slide-13
SLIDE 13

Solution - Tajo!!

  • It can replace both DBMS and Hadoop

– High throughput for batch processing
– Low latency for ad-hoc queries
– ANSI SQL compatible

  • I can do it myself

– Very open community

  • Easy to file issues for what I really need

– Fast growing

  • Issues are resolved very quickly
slide-14
SLIDE 14

About Tajo

  • Tajo (since 2010)

– Big data warehouse system on Hadoop
– Apache top-level project (entered the ASF in March 2013)

  • Features

– SQL standard compliance
– Fully distributed SQL query processing
– HDFS as primary storage
– Relational model (to be extended to a nested model in the future)
– ETL as well as low-latency relational query processing (from about 100 ms)

  • News

– 0.2-incubating: released November 2013
– Graduation to top-level project: April 2014

slide-16
SLIDE 16

Tajo logical optimizer

  • Cost-based join ordering
  • Projection/Filter push down & Duplicated expression removal

[Figure: a query plan over Table A (ID, QTY, Date) and Table B (ID, Price, Tax), drawn twice. Both versions use the operators GroupBy, Filter, Join, and Projection with the expressions sel_>, sel_<, aggr_sum1, and aggr_sum2; in the second version aggr_sum1 is drawn apart from the other expressions.]
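To make the filter push-down idea concrete, here is a minimal sketch on a toy plan tree. It is not Tajo's planner: the classes `Scan`, `Filter`, and `Join`, the method `pushDown`, and the predicate `QTY > 10` are all hypothetical names chosen for illustration, and each filter is assumed to read a single column.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: a toy logical plan, not Tajo's planner classes. */
final class PushDownSketch {

    /** Base class for plan nodes; each node knows which columns it can provide. */
    static abstract class Node {
        abstract boolean hasColumn(String column);
    }

    static final class Scan extends Node {
        final String table;
        final List<String> columns;
        Scan(String table, String... columns) {
            this.table = table;
            this.columns = Arrays.asList(columns);
        }
        boolean hasColumn(String column) { return columns.contains(column); }
        @Override public String toString() { return "Scan(" + table + ")"; }
    }

    static final class Filter extends Node {
        final String column;     // the single column the predicate reads (assumption)
        final String predicate;  // kept as text, only for display
        final Node child;
        Filter(String column, String predicate, Node child) {
            this.column = column;
            this.predicate = predicate;
            this.child = child;
        }
        boolean hasColumn(String c) { return child.hasColumn(c); }
        @Override public String toString() { return "Filter[" + predicate + "](" + child + ")"; }
    }

    static final class Join extends Node {
        final Node left;
        final Node right;
        Join(Node left, Node right) { this.left = left; this.right = right; }
        boolean hasColumn(String c) { return left.hasColumn(c) || right.hasColumn(c); }
        @Override public String toString() { return "Join(" + left + ", " + right + ")"; }
    }

    /** Moves a filter below a join when exactly one join input provides its column. */
    static Node pushDown(Node node) {
        if (node instanceof Join) {
            Join j = (Join) node;
            return new Join(pushDown(j.left), pushDown(j.right));
        }
        if (!(node instanceof Filter)) {
            return node;
        }
        Filter f = (Filter) node;
        Node child = pushDown(f.child);
        if (child instanceof Join) {
            Join j = (Join) child;
            boolean inLeft = j.left.hasColumn(f.column);
            boolean inRight = j.right.hasColumn(f.column);
            if (inLeft && !inRight) {
                return new Join(pushDown(new Filter(f.column, f.predicate, j.left)), j.right);
            }
            if (inRight && !inLeft) {
                return new Join(j.left, pushDown(new Filter(f.column, f.predicate, j.right)));
            }
        }
        // Ambiguous or non-join child: leave the filter where it is.
        return new Filter(f.column, f.predicate, child);
    }

    public static void main(String[] args) {
        Node plan = new Filter("QTY", "QTY > 10",
                new Join(new Scan("A", "ID", "QTY", "Date"),
                         new Scan("B", "ID", "Price", "Tax")));
        System.out.println(plan);           // Filter[QTY > 10](Join(Scan(A), Scan(B)))
        System.out.println(pushDown(plan)); // Join(Filter[QTY > 10](Scan(A)), Scan(B))
    }
}
```

Filtering Table A before the join means fewer rows reach the join. A filter on a column that both inputs provide (such as ID here) is left above the join in this sketch.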

slide-17
SLIDE 17

Tajo progressive optimization

  • Dynamically adjust the number of tasks

[Figure: input data → execution block → intermediate data → execution block → …; the size of the intermediate data is not known in advance, so how many tasks (and workers) are needed?]

  • Estimate data size at planning time
  • Check the actual size and adjust the plan at execution time
  • Shuffle intermediate data over workers uniformly

[Figure labels: shuffled data, shuffled data, shuffled data]
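The "estimate at planning time, re-check at execution time" step can be sketched as a small sizing rule. This is only an illustration under assumptions: `bytesPerTask` and `maxTasks` are hypothetical knobs, not Tajo configuration names, and the slides do not state the actual formula Tajo uses.

```java
/** Illustrative only: a sizing rule for the number of tasks, not Tajo's implementation. */
final class TaskCountSketch {

    /**
     * Returns how many tasks are needed so that each task handles at most
     * bytesPerTask bytes, capped at maxTasks. Always returns at least 1.
     */
    static int numTasks(long dataBytes, long bytesPerTask, int maxTasks) {
        if (bytesPerTask <= 0 || maxTasks <= 0) {
            throw new IllegalArgumentException("bytesPerTask and maxTasks must be positive");
        }
        if (dataBytes <= 0) {
            return 1;
        }
        // Ceiling division without risking overflow on very large inputs.
        long needed = dataBytes / bytesPerTask + (dataBytes % bytesPerTask == 0 ? 0 : 1);
        return (int) Math.min(needed, (long) maxTasks);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long gb = 1024L * mb;
        // Planning time: only an estimate of the intermediate data size is available.
        int planned = numTasks(10 * gb, 128 * mb, 1000);
        // Execution time: the previous execution block reports the actual size.
        int adjusted = numTasks(64 * gb, 128 * mb, 1000);
        System.out.println("planned=" + planned + ", adjusted=" + adjusted);
    }
}
```

The same function is called twice: once with the estimate and once with the measured size, so the plan can be corrected before the next execution block starts.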

slide-19
SLIDE 19

Tajo progressive optimization

  • Dynamically adjust join order or type

[Figure labels: Hash-Join, Hash-Join, Broadcast-Join]
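One common way to choose between the two join types is a size threshold: if the smaller input is small enough to copy to every worker, broadcast it and avoid shuffling the large input. The sketch below assumes that rule; `broadcastThresholdBytes` and the class name are hypothetical, and the slides do not say which criterion Tajo applies.

```java
/** Illustrative only: picking a join strategy from observed input sizes. */
final class JoinChoiceSketch {

    enum JoinType { BROADCAST_JOIN, HASH_JOIN }

    /**
     * Chooses a broadcast join when the smaller input fits under the threshold,
     * otherwise falls back to a hash join over shuffled inputs.
     */
    static JoinType choose(long leftBytes, long rightBytes, long broadcastThresholdBytes) {
        long smaller = Math.min(leftBytes, rightBytes);
        return smaller <= broadcastThresholdBytes ? JoinType.BROADCAST_JOIN : JoinType.HASH_JOIN;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long tb = 1024L * 1024L * mb;
        // A 1 TB fact table joined with a 5 MB dimension table, threshold 16 MB.
        System.out.println(choose(tb, 5 * mb, 16 * mb));   // BROADCAST_JOIN
        // Two large inputs: neither side can be broadcast cheaply.
        System.out.println(choose(tb, tb / 2, 16 * mb));   // HASH_JOIN
    }
}
```

Because the decision depends on sizes that are only known once earlier execution blocks finish, it fits naturally at execution time rather than planning time.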

slide-20
SLIDE 20

Tajo - what has improved in the past 9 months?

  • Resource Manager
  • Scheduler & Storage Manager
  • Data types & Functions
  • SQL Interface
  • Management
slide-23
SLIDE 23

Tajo resource manager

  • Fine resource allocation

– TAJO-127 without YARN: Tajo Master; one Tajo Worker acting as a query master; the other Tajo Workers acting as workers
– TAJO-275 separating Query master: Tajo Master; Query Master; Tajo Workers acting as workers
– TAJO-317 elaborate resource allocation: Tajo Master; Query Master; Tajo Workers labeled I/O-intensive or CPU/memory
slide-26
SLIDE 26

Scheduler & Storage manager

  • Disk-aware scheduling (volume info from HDFS-3672)

[Figure: Tajo Worker threads and the Storage Manager]

– TAJO-84 considering disk load balance
– TAJO-178 asynchronous scan
– TAJO-200 RCFile
– TAJO-30 Parquet
– TAJO-134 text compression (gzip, snappy, lz4, bzip2)
– TAJO-435 intermediate file
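A minimal sketch of disk-aware scheduling, assuming the scheduler knows which disk volume holds each task's data: the next task is taken from the volume that currently has the fewest running scans. The class, its methods, and the integer volume ids are hypothetical; this is not Tajo's DefaultScheduler.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: pick the next scan task from the least-loaded disk volume. */
final class DiskAwareSketch {

    static final class ScanTask {
        final String name;
        final int volumeId; // which local disk holds the task's block (assumed known)
        ScanTask(String name, int volumeId) {
            this.name = name;
            this.volumeId = volumeId;
        }
    }

    // Number of scans currently running on each volume.
    private final Map<Integer, Integer> running = new HashMap<>();

    int load(int volumeId) {
        Integer n = running.get(volumeId);
        return n == null ? 0 : n;
    }

    /** Removes and returns the pending task whose volume is least busy, or null if none. */
    ScanTask next(List<ScanTask> pending) {
        ScanTask best = null;
        for (ScanTask t : pending) {
            if (best == null || load(t.volumeId) < load(best.volumeId)) {
                best = t;
            }
        }
        if (best != null) {
            pending.remove(best);
            running.merge(best.volumeId, 1, Integer::sum);
        }
        return best;
    }

    /** Called when a scan completes so its volume is counted as less busy. */
    void finished(ScanTask task) {
        running.merge(task.volumeId, -1, Integer::sum);
    }

    public static void main(String[] args) {
        DiskAwareSketch scheduler = new DiskAwareSketch();
        List<ScanTask> pending = new ArrayList<>(Arrays.asList(
                new ScanTask("a", 0), new ScanTask("b", 0), new ScanTask("c", 1)));
        // Tasks a and b share volume 0, so c (volume 1) is scheduled before b.
        System.out.println(scheduler.next(pending).name); // a
        System.out.println(scheduler.next(pending).name); // c
        System.out.println(scheduler.next(pending).name); // b
    }
}
```

Spreading concurrent scans across volumes keeps one disk from becoming the bottleneck while others sit idle.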

slide-30
SLIDE 30

Functions & data types

  • Supporting more functions and UDFs

– Before: Tajo Master with function1, function2, function3 registered at startup (class name is coded in source)
– TAJO-408 Improve function system: Tajo Master with functions and user defined functions, each described by an annotation:

@Description(
  functionName = "to_timestamp",
  description = "Convert UNIX epoch to time stamp",
  example = "> SELECT to_timestamp(1389071574);\n" + "2014-01-07 14:12:54",
  returnType = TajoDataTypes.Type.TIMESTAMP,
  paramTypes = {@ParamTypes(paramTypes = {TajoDataTypes.Type.INT4}),
                @ParamTypes(paramTypes = {TajoDataTypes.Type.INT8})}
)

[Figure labels: description, runtime registration, automatic registration]

– TAJO-52 standard SQL data types
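The general idea behind annotation-driven registration can be sketched with plain Java reflection. Everything below is hypothetical: `@FunctionInfo`, `FunctionRegistrySketch`, and `ToUpper` are stand-ins, not Tajo's `@Description` annotation or its function catalog, and real automatic registration would also need a classpath scan that is omitted here.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical annotation; retained at runtime so reflection can read it. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface FunctionInfo {
    String functionName();
    String description();
}

/** Illustrative only: register classes by reading their annotation instead of hard-coding names. */
final class FunctionRegistrySketch {

    @FunctionInfo(functionName = "to_upper", description = "Upper-case a string")
    static final class ToUpper { }

    /** A class without the annotation is simply skipped. */
    static final class NotAFunction { }

    private final Map<String, String> descriptions = new TreeMap<>();

    /** Returns true if the class carried the annotation and was registered. */
    boolean register(Class<?> candidate) {
        FunctionInfo info = candidate.getAnnotation(FunctionInfo.class);
        if (info == null) {
            return false;
        }
        descriptions.put(info.functionName(), info.description());
        return true;
    }

    /** Returns the registered description, or null for an unknown function. */
    String describe(String functionName) {
        return descriptions.get(functionName);
    }

    public static void main(String[] args) {
        FunctionRegistrySketch registry = new FunctionRegistrySketch();
        System.out.println(registry.register(ToUpper.class));      // true
        System.out.println(registry.register(NotAFunction.class)); // false
        System.out.println(registry.describe("to_upper"));         // Upper-case a string
    }
}
```

Keeping the name and description next to the implementation means a new function, including a user-defined one loaded at runtime, needs no change to a central list of class names.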

slide-31
SLIDE 31

JDBC Driver, HCatalog

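[Figure: the Query Master takes HiveQL through a HiveQL parser and ANSI SQL through an SQL parser; both produce a Tajo Algebra expression. JDBC and HCatalog are shown as interfaces.]

– TAJO-16, TAJO-433 Hive metastore
– TAJO-176 JDBC Driver
– TAJO-101 HiveQL converter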

slide-32
SLIDE 32

Management

TAJO-239 Improving Web UI

slide-33
SLIDE 33

Management

TAJO-564 Execution block progress

slide-34
SLIDE 34

Management

TAJO-589 Task progress

slide-35
SLIDE 35

Management

TAJO-468 Task detail info

slide-36
SLIDE 36

Management

TAJO-474 Task admin utility

slide-37
SLIDE 37

And lots of performance enhancements

– TAJO-725 Broadcast JOIN should supports multiple tables
– TAJO-717 Improve file splitting for large number of splits
– TAJO-601 Improve distinct aggregation query processing
– TAJO-584 Improve distributed merge sort
– TAJO-36 Improve ExternalSortExec with N-merge sort and final pass omission
– TAJO-345 MergeScanner should support projectable storages
– …
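For readers unfamiliar with N-way merging, which several of the sort-related issues above rely on, here is a generic sketch that merges any number of sorted runs with a priority queue. It is a textbook k-way merge over in-memory lists, assumed for illustration; it is not Tajo's ExternalSortExec, which works on spilled files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative only: merge N sorted runs in a single pass using a min-heap. */
final class MergeSketch {

    /** A read position inside one sorted run, ordered by its current element. */
    private static final class Cursor implements Comparable<Cursor> {
        final List<Integer> run;
        int pos;
        Cursor(List<Integer> run) { this.run = run; }
        int head() { return run.get(pos); }
        @Override public int compareTo(Cursor other) {
            return Integer.compare(head(), other.head());
        }
    }

    static List<Integer> merge(List<List<Integer>> sortedRuns) {
        PriorityQueue<Cursor> heap = new PriorityQueue<>();
        for (List<Integer> run : sortedRuns) {
            if (!run.isEmpty()) {
                heap.add(new Cursor(run));
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            // Take the run with the smallest current element.
            Cursor c = heap.poll();
            out.add(c.head());
            c.pos++;
            // Re-insert the cursor only if its run still has elements.
            if (c.pos < c.run.size()) {
                heap.add(c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> runs = Arrays.asList(
                Arrays.asList(1, 4, 9),
                Arrays.asList(2, 3, 10),
                Arrays.asList(5));
        System.out.println(merge(runs)); // [1, 2, 3, 4, 5, 9, 10]
    }
}
```

Merging N runs at once, rather than two at a time, reduces the number of passes over the data, which matters when the runs live on disk.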

slide-38
SLIDE 38

Performance

  • TPC-H
slide-39
SLIDE 39

Performance

  • OLAP reporting - relatively small data
slide-41
SLIDE 41

win-win between company and community

  • Community is booming

[Figure values: 13, 30]

slide-43
SLIDE 43

win-win between company and community

  • Test in a real working cluster

– Mainly focusing on scalability tests & integration with existing IT systems
– Also finding bugs and function requirements

– TAJO-691 HashJoin or HashAggregation is too slow if there is many unique keys
– TAJO-675 maximum frame size of frameDecoder should be increased
– TAJO-673 Assign proper number of tasks when inserting into partitioned table
– TAJO-650 Repartitioner::scheduleHashShuffledFetches should adjust the number of tasks
– TAJO-647 Work unbalance on disk scheduling of DefaultScheduler
– TAJO-292 Too many intermediate partition files
– TAJO-283 Add table partitioning
– TAJO-592 HCatalogStore should supports RCFile and default hive field delimiter.
– …

slide-45
SLIDE 45

win-win between company and community

  • Efficient development and operation
  • Human networking
  • Higher brand value, which helps recruiting
slide-46
SLIDE 46

Future work

  • Nested data model (Parquet model)
  • More SQL compatibility

– Window functions, IN, EXISTS

  • Multi-tenancy
  • Push shuffle (no materialization)

– Use push and pull shuffle selectively

  • Push shuffle: performance
  • Pull shuffle: resilience, schedulability
slide-47
SLIDE 47

Q & A

  • Getting Started

– http://tajo.apache.org/tajo-0.2.0-doc.html#GettingStarted

  • Check out the development branch

– http://tajo.apache.org/downloads.html

  • Jira - Issue Tracker

– https://issues.apache.org/jira/browse/TAJO

  • Join the mailing list

– dev@tajo.apache.org