Business Intelligence and Analytics applied to Public Housing




SLIDE 1

Business Intelligence and Analytics applied to Public Housing

Doctoral Consortium @ ADBIS 2019

September 8th, 2019 in Bled, Slovenia

  • E. Scholly (1, 2), C. Favre (1), E. Ferey (2), S. Loudcher (1)

(1) University of Lyon, Lyon 2, ERIC EA 3083
(2) BIAL-X

SLIDE 2

Introduction

SLIDE 11

Context

A business issue

  • Public housing: dwellings, occupants, overdue payments, property assets (patrimony), ...

Three main themes

  • Business Intelligence (BI): ETL, data warehouses, OLAP, ...
  • Data Science (DS): knowledge extraction, Machine Learning, ...
  • Big Data: Volume, Variety, Velocity, ...

→ How do these fit together?

SLIDE 12

What data?

Several data sources

  1. Internal data
      • Landlord’s data
      • Dwellings, occupants, overdue payments, ...
      • Mostly relational data
      • BI analyses, simple DS analyses
  2. External data
      • Open data (+ social networks)
      • Environment
      • (possibly) Big Data
      • Advanced DS analyses

SLIDE 17

Table of contents

  1. Introduction
  2. Data storage and management
  3. Attractiveness
  4. First results and future outcomes

SLIDE 18

Data storage and management

SLIDE 23

Business Intelligence and Analytics

Business Intelligence (BI)
Methods and tools for collecting, storing, organizing and analyzing data to support decision-making

Business Analytics (BA)
The use of Data Science methods on a company’s data

What about BI?

  • BI&A
  • BI & BA
  • BI → BA

[Chen et al., 2012; Larson and Chang, 2016; Mortenson et al., 2015; Baars and Ereth, 2016; Gröger, 2018]

SLIDE 30

Data Intelligence

Run BI and BA analyses...

  • Separately
  • Together
  • (possibly) on Big Data

Data Intelligence
Perform analyses, simple or advanced, on all types of data

→ How?
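
To make "running BI and BA analyses together" concrete, here is a minimal sketch and not the authors' implementation. It applies a BI-style roll-up and a very simple analytics step to the same records. The field names, the sample values and the outlier rule (a basic threshold standing in for a real Data Science method) are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical dwelling records; every field name and value is invented.
dwellings = [
    {"neighborhood": "North", "rent": 420.0, "vacant_days": 12},
    {"neighborhood": "North", "rent": 480.0, "vacant_days": 30},
    {"neighborhood": "South", "rent": 390.0, "vacant_days": 75},
    {"neighborhood": "South", "rent": 410.0, "vacant_days": 90},
]


def average_rent_by_neighborhood(rows):
    """BI-style roll-up: aggregate one measure along one dimension."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["neighborhood"]].append(row["rent"])
    return {name: mean(values) for name, values in groups.items()}


def flag_long_vacancies(rows, factor=1.5):
    """BA-style step (placeholder rule): flag vacancies far above the mean."""
    threshold = factor * mean(row["vacant_days"] for row in rows)
    return [row["vacant_days"] > threshold for row in rows]


rent_summary = average_rent_by_neighborhood(dwellings)
vacancy_flags = flag_long_vacancies(dwellings)
```

Both functions read the same list of records, which is the point of the sketch: one data set can serve reporting and analytics without being copied into separate systems.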

SLIDE 31

Data Intelligence in practice


SLIDE 37

Data Lakes

Data Lake [Dixon, 2010]
A data lake is a large repository of heterogeneous raw data, supplied by external data sources and from which various analyses can be performed.

Two main characteristics

  • Schema-on-read
  • Data variety

→ Need for a metadata system
→ A broad research field

[Miloslavskaya and Tolstoy, 2016]
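
As an illustration of schema-on-read, the sketch below stores raw payloads untouched and applies a schema only when an object is read. It is a toy under stated assumptions: the in-memory `lake` dictionary, the object names, the metadata keys and the `read_with_schema` helper are all hypothetical and do not come from the presentation.

```python
import csv
import io
import json

# A toy "lake": raw payloads kept as-is, each with a small metadata entry.
# Object names, formats and fields are invented for illustration.
lake = {
    "dwellings.csv": {
        "metadata": {"format": "csv", "source": "internal"},
        "raw": "id,surface\n1,54\n2,71\n",
    },
    "transit.json": {
        "metadata": {"format": "json", "source": "open data"},
        "raw": '[{"stop": "A", "lines": 3}]',
    },
}


def read_with_schema(name, schema):
    """Parse a raw object and cast its fields only at read time."""
    entry = lake[name]
    fmt = entry["metadata"]["format"]  # the metadata tells us how to parse
    if fmt == "csv":
        rows = list(csv.DictReader(io.StringIO(entry["raw"])))
    elif fmt == "json":
        rows = json.loads(entry["raw"])
    else:
        raise ValueError(f"unsupported format: {fmt}")
    return [
        {field: cast(row[field]) for field, cast in schema.items()}
        for row in rows
    ]


dwelling_rows = read_with_schema("dwellings.csv", {"id": int, "surface": float})
transit_rows = read_with_schema("transit.json", {"stop": str, "lines": int})
```

Nothing is validated at write time, so heterogeneous sources can be loaded freely. The price is that every reader depends on the metadata entry to know how to interpret an object, which is why a metadata system is needed.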

SLIDE 38

Attractiveness

SLIDE 51

Defining attractiveness

Attractiveness of what?

  1. Dwelling (internal data)
  2. Residence (internal and external data)
  3. Neighborhood (external data)

→ Strategic Patrimony Plan

Advanced indicators

  • Machine Learning algorithms
  • Back-feeding the lake
  • Enriching BI analyses
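
The loop of computing an indicator, feeding it back into the lake and reusing it in BI analyses can be sketched as follows. This is only an illustration: the weighted sum is a placeholder for whatever Machine Learning model would actually be trained, and the field names, weights and values are invented.

```python
from statistics import mean

# Hypothetical records; field names, values and weights are invented.
dwellings = [
    {"id": 1, "residence": "R1", "condition": 0.8, "transit_access": 0.5},
    {"id": 2, "residence": "R1", "condition": 0.6, "transit_access": 0.5},
    {"id": 3, "residence": "R2", "condition": 0.4, "transit_access": 1.0},
]


def score(dwelling, w_condition=0.5, w_transit=0.5):
    """Placeholder indicator: a weighted sum standing in for a trained model."""
    return (w_condition * dwelling["condition"]
            + w_transit * dwelling["transit_access"])


def back_feed(rows):
    """Return copies of the records enriched with the computed indicator."""
    return [{**row, "attractiveness": score(row)} for row in rows]


def mean_by_residence(rows):
    """BI-style aggregation over the enriched records."""
    groups = {}
    for row in rows:
        groups.setdefault(row["residence"], []).append(row["attractiveness"])
    return {name: mean(values) for name, values in groups.items()}


enriched = back_feed(dwellings)
summary = mean_by_residence(enriched)
```

The enriched records are new copies, so the raw data stays unchanged while the indicator becomes one more attribute available to reporting.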

SLIDE 52

First results and future outcomes

SLIDE 53

First contribution

Work done with P. N. Sawadogo [Sawadogo et al., 2019, Scholly et al., 2019]

  • Our definition of a Data Lake
  • Key features for metadata systems
  • Metadata typology in three categories
  • MEtadata model for DAta Lakes (MEDAL)

Presented at 4 PM in this room!

SLIDE 55

What’s next?

Work in progress

  • Implementation(s) of MEDAL
  • Retrieving all the data
  • Development of a complete data lake
  • Tests and comparisons

SLIDE 60

Thank you for your attention!

Questions?

SLIDE 61

References i

Baars, H. and Ereth, J. (2016). From data warehouses to analytical atoms - the internet of things as a centrifugal force in business intelligence and analytics. In 24th European Conference on Information Systems (ECIS), Istanbul, Turkey, page ResearchPaper3.

Chen, H., Chiang, R. H., and Storey, V. C. (2012). Business intelligence and analytics: from big data to big impact. MIS Quarterly, pages 1165–1188.

SLIDE 62

References ii

Dixon, J. (2010). Pentaho, Hadoop, and Data Lakes. https://jamesdixon.wordpress.com/2010/10/14/pentaho-hadoop-and-data-lakes/.

Gröger, C. (2018). Building an Industry 4.0 analytics platform. Datenbank-Spektrum, 18(1):5–14.

Larson, D. and Chang, V. (2016). A review and future direction of agile, business intelligence, analytics and data science. International Journal of Information Management, 36(5):700–710.

SLIDE 63

References iii

Miloslavskaya, N. and Tolstoy, A. (2016). Big Data, Fast Data and Data Lake Concepts. In 7th Annual International Conference on Biologically Inspired Cognitive Architectures (BICA 2016), NY, USA, volume 88 of Procedia Computer Science, pages 1–6.

Mortenson, M. J., Doherty, N. F., and Robinson, S. (2015). Operational research from Taylorism to terabytes: A research agenda for the analytics age. European Journal of Operational Research, 241(3):583–595.

Sawadogo, P., Scholly, É., Favre, C., Ferey, É., Loudcher, S., and Darmont, J. (2019). Metadata systems for data lakes: Models and features.

SLIDE 64

References iv

Scholly, E., Sawadogo, P., Favre, C., Ferey, E., Loudcher, S., and Darmont, J. (2019). Systèmes de métadonnées d’un lac de données: modélisation et fonctionnalités.