Using Naiad to Analyze Twitter Data in Batch and Real-time




SLIDE 1

Using Naiad to Analyze Twitter Data in Batch and Real-time

George Wort, University of Cambridge, 2017

SLIDE 2

Naiad

  • Timely Dataflow System.
  • Batch Processing.
  • Stream Processing.
  • Graph Processing.
  • Supports iterative and incremental data analysis.

  • Low latency.
  • High throughput.
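
To make "incremental" concrete, here is a minimal sketch in plain Python. It does not use Naiad's actual API; the class `IncrementalWordCount` and the function `batch_word_count` are hypothetical names introduced only for illustration. The batch version recomputes every count from scratch, while the incremental version applies only the change caused by each added or retracted tweet.

```python
from collections import Counter


def batch_word_count(tweets):
    """Recompute word counts from scratch over the whole collection."""
    counts = Counter()
    for text in tweets:
        counts.update(text.lower().split())
    return counts


class IncrementalWordCount:
    """Maintain word counts by applying only the changes (deltas)."""

    def __init__(self):
        self.counts = Counter()

    def apply(self, text, delta=1):
        # delta=+1 adds a tweet, delta=-1 retracts a previously added one.
        for word in text.lower().split():
            self.counts[word] += delta
            if self.counts[word] == 0:
                del self.counts[word]  # drop words that no longer occur
        return self.counts


tweets = ["Naiad is fast", "naiad is timely"]
inc = IncrementalWordCount()
for t in tweets:
    inc.apply(t)

print(inc.counts == batch_word_count(tweets))  # True
print(inc.counts["naiad"])                     # 2
```

Both approaches give the same counts; the incremental one does work proportional to the size of the change rather than the size of the whole data set, which is the property that matters for low-latency stream processing.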
SLIDE 3

Naiad

  • Complex system offering a lot of options.
  • Too complex for most applications?
  • Overheads and ease of use?
  • Additions:
      • Differential Dataflow.
      • GraphLINQ.
SLIDE 4

Twitter Data Processing

  • Implement real-time and batch processing of tweet stream.
  • Geographically categorise word frequencies.
  • Allow selection of different levels of granularity.
  • Query geographical data.
  • Extend to allow similarity comparison between areas or cluster areas in batch.

  • Extend to view frequency of spelling mistakes in English.
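
The goals above can be sketched in plain Python (not Naiad code). This is one possible approach under stated assumptions: tweets arrive as `(lat, lon, text)` tuples, areas are square latitude/longitude grid cells, granularity is the cell size in degrees, and areas are compared with cosine similarity. The slides do not specify any of these choices, and all function names here are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cell_for(lat, lon, cell_degrees):
    """Map a coordinate to a grid cell; a larger cell_degrees is coarser."""
    return (math.floor(lat / cell_degrees), math.floor(lon / cell_degrees))


def word_frequencies_by_cell(tweets, cell_degrees):
    """tweets: iterable of (lat, lon, text). Returns {cell: Counter of words}."""
    freqs = defaultdict(Counter)
    for lat, lon, text in tweets:
        freqs[cell_for(lat, lon, cell_degrees)].update(text.lower().split())
    return freqs


def cosine_similarity(a, b):
    """Cosine similarity between two word-count Counters (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Made-up sample data: two tweets near Cambridge, one near Norwich.
tweets = [
    (52.20, 0.12, "rain in cambridge"),
    (52.21, 0.13, "cambridge rain again"),
    (52.63, 1.30, "sunny in norwich"),
]

fine = word_frequencies_by_cell(tweets, 1.0)     # 1-degree cells
coarse = word_frequencies_by_cell(tweets, 10.0)  # 10-degree cells

print(len(fine), len(coarse))                    # 2 1
# Query one area: word counts for the cell containing a given point.
print(fine[cell_for(52.20, 0.12, 1.0)]["rain"])  # 2
# Compare two areas at the fine granularity.
print(cosine_similarity(fine[(52, 0)], fine[(52, 1)]))
```

Changing `cell_degrees` is the only thing needed to switch granularity, and the per-cell counters could also serve as input to a clustering step in batch.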
SLIDE 5

Assessment

  • Implement on a single machine and distributed environment.
  • Using:
      • The base Naiad system.
      • Differential Dataflow.
      • GraphLINQ.
  • Assessing:
      • Ease of use.
      • Flexibility.
      • Latency.
      • Throughput.
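
As a rough illustration of how the latency and throughput criteria might be measured, here is a small timing harness in plain Python. It is only a sketch: `measure` and `count_words` are hypothetical names, the workload is a stand-in rather than a Naiad computation, and latency is assumed to mean the wall-clock time to process one batch.

```python
import time


def measure(process_batch, batches):
    """Run process_batch on each batch.

    Returns (per-batch latencies in seconds, overall throughput in records/s).
    """
    latencies = []
    records = 0
    start = time.perf_counter()
    for batch in batches:
        t0 = time.perf_counter()
        process_batch(batch)
        latencies.append(time.perf_counter() - t0)
        records += len(batch)
    elapsed = time.perf_counter() - start
    throughput = records / elapsed if elapsed > 0 else float("inf")
    return latencies, throughput


def count_words(batch):
    # Stand-in workload: count the words in every tweet of the batch.
    return sum(len(text.split()) for text in batch)


batches = [["naiad is timely"] * 1000 for _ in range(5)]
latencies, throughput = measure(count_words, batches)

median_latency = sorted(latencies)[len(latencies) // 2]
print(f"median batch latency: {median_latency:.6f} s")
print(f"throughput: {throughput:.0f} records/s")
```

Running the same harness against each implementation (base Naiad, Differential Dataflow, GraphLINQ) with identical batches would give comparable figures, although a distributed deployment would also need to account for network and coordination time.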
SLIDE 6

Questions?