Design Patterns for Large Scale Data Movement
Data Movement Patterns
o The right solution depends on the problem you’re solving
  ‐ Real-time or intermittent?
  ‐ Update rates?
  ‐ Any weird networks?
  ‐ Fan-in or fan-out?
  ‐ Acceptable latency?
  ‐ Payload size?
  ‐ Humans or machines?
  ‐ Guarantee required?
Image: http://www.dreamstime.com/stock-images-wispy-blue-spirals-pattern-image1983204
Latency Required
o Some not sensitive at all
  ‐ Batch updates
o Seconds often good enough
  ‐ Database sync
  ‐ User interfaces
o Others measure in milli- or microseconds
  ‐ Algo trading
  ‐ Industrial controls
[Figure: Required Latency scale, from Not Critical to Low as Possible]
Network Distance
o Co-location for max speed
  ‐ Minimize speed-of-light delay
o LAN for many apps
  ‐ 10GigE networks
o Long distance WAN
  ‐ Expensive, limited pipes
  ‐ Creates mismatches with other networks
[Figure: Network Distance scale, from Co-location to Global WAN]
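To make the speed-of-light point concrete, here is a minimal sketch of the lower bound that distance puts on one-way latency. The 2/3 fiber factor is an assumed typical value and the path lengths are hypothetical; neither comes from the slides.

```python
# Rough lower bound on one-way propagation delay over optical fiber.
# Assumption: light in fiber travels at about 2/3 of its speed in vacuum.
SPEED_OF_LIGHT_KM_S = 299_792.458
FIBER_FACTOR = 2 / 3  # assumed typical value; real cables vary


def one_way_delay_ms(distance_km: float) -> float:
    """Return the minimum one-way delay in milliseconds for a fiber path."""
    return distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR) * 1000


# Hypothetical path lengths, for illustration only.
for label, km in [("co-located rack", 0.01), ("metro LAN", 10), ("global WAN", 10_000)]:
    print(f"{label}: {one_way_delay_ms(km):.4f} ms")
```

Switching, queuing, and protocol overhead come on top of this bound, so measured WAN latency is normally higher.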
Number of Messages
o Few
  ‐ Batch updates
  ‐ Simple applications
o Moderate
  ‐ Risk management
  ‐ Order routing
o Insane
  ‐ Market data
  ‐ Click stream analysis
[Figure: Number of Messages scale, from "Whatever" to "OMG"]
Degree of Distribution
o Point-to-point
o Fan-out (many subs)
o Fan-in (many pubs)
o Mesh
  ‐ Syncing data between many endpoints
[Figure: Degree of Distribution scale, from 1:1 to Millions of Endpoints]
Message Size
o Small
  ‐ Status updates, activity logging events
o Medium
  ‐ Orders, product BOMs
o Large
  ‐ Batch updates, media files, product catalogs
o Very different stresses on system based on message size and frequency.
[Figure: Size of Messages scale, from Small to Huge]
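The interaction between message size and frequency can be sketched as a simple bandwidth estimate. The function name and the example workloads below are hypothetical, and protocol overhead is ignored.

```python
def required_mbps(message_bytes: int, messages_per_second: float) -> float:
    """Payload bandwidth in megabits per second, ignoring protocol overhead."""
    return message_bytes * 8 * messages_per_second / 1_000_000


# Hypothetical workloads, for illustration only.
small_and_frequent = required_mbps(200, 100_000)    # 200-byte status updates
large_and_rare = required_mbps(50_000_000, 1 / 60)  # one 50 MB batch per minute

print(f"small, frequent: {small_and_frequent:.1f} Mbit/s")
print(f"large, rare:     {large_and_rare:.1f} Mbit/s")
```

A stream of tiny messages can need far more sustained bandwidth than an occasional large transfer. It stresses per-message processing, while the large transfer stresses buffering and burst capacity.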
Importance of Delivery Guarantee
o “Best effort” fine for some scenarios
o Others require “once and only once”
o Sequence matters for some
o Some demand failsafe even in DR scenarios
[Figure: Delivery Guarantee Importance scale, from Not to Very]
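One common way to approximate "once and only once" with in-order processing on top of a transport that may duplicate or reorder messages is to number messages at the sender and filter at the receiver. The class below is a minimal sketch of that idea; the name `OrderedOnceReceiver` and the per-sender sequence numbers starting at 1 are assumptions, not something the slides prescribe.

```python
class OrderedOnceReceiver:
    """Deliver each numbered message exactly once, in sequence order."""

    def __init__(self) -> None:
        self.next_seq = 1    # next sequence number we are allowed to deliver
        self.pending = {}    # messages that arrived ahead of a gap
        self.delivered = []  # payloads handed to the application, in order

    def receive(self, seq: int, payload: str) -> None:
        # Drop duplicates: already delivered, or already waiting in the buffer.
        if seq < self.next_seq or seq in self.pending:
            return
        self.pending[seq] = payload
        # Release every message that is now contiguous with what was delivered.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


receiver = OrderedOnceReceiver()
for seq, payload in [(2, "b"), (1, "a"), (1, "a"), (3, "c"), (2, "b")]:
    receiver.receive(seq, payload)
print(receiver.delivered)
```

This sketch keeps state in memory only. Surviving a failover or DR event would also require persisting `next_seq` and the pending buffer.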
Other Considerations
o Message
  ‐ Format
  ‐ Protocol
  ‐ Structured/Unstructured
o Network
  ‐ Availability
  ‐ RTT
  ‐ Bandwidth cost
o Robustness
  ‐ Archival
  ‐ Caching
  ‐ Acceptable MTBF
  ‐ HA switchover times
  ‐ DR requirements
Combination of Factors Yields Design Patterns
o Some attributes tend to correlate
  ‐ # of messages and degree of distribution
o Others usually contradict
  ‐ Network distance and latency
  ‐ Guarantee and latency
o Tradeoffs and creative solutions
[Figure: radar chart with six axes: Required Latency, Network Distance, Number of Messages, Degree of Distribution, Size of Messages, Delivery Guarantee Importance]
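The six factors can be treated as a profile per use case, with the contradicting pairs flagged automatically. The sketch below assumes a 1 to 5 score per factor; the scale, the threshold, and the example scores are illustrative assumptions, not values from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """A use case scored 1 (low) to 5 (high) on each factor; the scale is assumed."""
    latency: int           # 5 = latency must be as low as possible
    network_distance: int  # 5 = global WAN, 1 = co-located
    message_count: int
    distribution: int
    message_size: int
    guarantee: int         # 5 = delivery guarantee is essential


def tensions(p: Profile, threshold: int = 4) -> list:
    """List the contradicting factor pairs that are both demanding."""
    found = []
    if p.latency >= threshold and p.network_distance >= threshold:
        found.append("network distance vs latency")
    if p.latency >= threshold and p.guarantee >= threshold:
        found.append("guarantee vs latency")
    return found


# Hypothetical scores loosely modelled on an order-flow style workload.
order_flow = Profile(latency=4, network_distance=1, message_count=4,
                     distribution=2, message_size=3, guarantee=5)
print(tensions(order_flow))
```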
Identifying Patterns in Real-World Use Cases
Use cases unique, but patterns emerge
Examples in this section:
o Trade Order Flow
o Manufacturing Data Sync
o Oil and Gas Monitoring
o Real Time Sports Betting
Order Flow
o Latency matters, but not every microsecond
o Usually localized
o Continuous, high-rate message flow
o Mid-sized messages (1-2 KB)
o Messages absolutely must be guaranteed
[Figure: radar chart of the six factors for this use case]
Order Flow; Architecture
[Diagram components: Clients, Client Gateways, Smart Order Router, Back Office Applications, Management & Monitoring, Slow Subscribers, Message Bus, Real Time Sync, Disaster Recovery Site, Exchange Gateways, Exchanges]
Order Flow; Similar Use Cases
o Credit card processing
  ‐ Long-distance WANs
  ‐ Latency in hundreds of milliseconds
o E-commerce
  ‐ Higher volumes
  ‐ Higher guarantee required
o Logistics scheduling
  ‐ Less latency sensitive
  ‐ More likely to include WANs
[Figure: radar chart comparing these use cases, one color per use case]
[Presenter note: Need a way to correlate which use case is which color on the chart.]
Manufacturing Data Sync
o Geographically distributed
o 100% delivery guarantee required
o Data rate is use case specific; assume lots of medium (< 5K) messages
o Number of endpoints is use case specific; assume 10 manufacturing locations
[Figure: radar chart of the six factors for this use case]
[Presenter note: Build from the background image on prior slide.]
Manufacturing Data Sync; Architecture
[Diagram components: Applications & Databases, Fanout at Edge, Smart Buffering, Maximizing Bandwidth]
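The bandwidth benefit of fanning out at the edge can be sketched by counting how many copies of each message cross the WAN. The function name and subscriber counts are hypothetical; only the 10-site figure echoes the assumption on the previous slide.

```python
def wan_copies(subscribers_per_site: list, fanout_at_edge: bool) -> int:
    """Count the copies of one message that must cross the WAN."""
    if fanout_at_edge:
        # One copy per remote site with subscribers; a local broker fans it out.
        return sum(1 for n in subscribers_per_site if n > 0)
    # Central fan-out: every remote subscriber gets its own copy over the WAN.
    return sum(subscribers_per_site)


# Hypothetical deployment: 10 manufacturing sites, 20 subscribers at each.
sites = [20] * 10
print("central fan-out:", wan_copies(sites, fanout_at_edge=False))
print("edge fan-out:   ", wan_copies(sites, fanout_at_edge=True))
```

With these assumed numbers the WAN carries 10 copies instead of 200, which matters when long-distance pipes are expensive and limited.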
Manufacturing Data Sync; Similar Use Cases
o Real Time Risk Management
  ‐ Smaller messages
  ‐ Latency more important
o Retail Global Inventory
  ‐ Messages can be larger
  ‐ Distribution can be more
o Real Time Financials
  ‐ Messages larger
  ‐ Distribution less (collecting to 1 location)
[Figure: radar chart comparing these use cases]
Oil & Gas Pipeline Monitoring
o Wi-Fi, satellite, proprietary and other unreliable networks
o Degree of distribution off the charts. In this case, fan-in.
o Messages usually pretty small, unless batch
o Latency unimportant
o Level of guarantee is use case specific; assume status messages (i.e., guarantee not essential)
[Figure: radar chart of the six factors for this use case]
Oil & Gas Pipeline Monitoring; Architecture
[Diagram components: Pipeline Sensors, Wireless Collection Caches, Unreliable Networks, Message Bus, Big Data Loading, Real Time vs. Delayed Analytics, Analytics Engines, Big Data & Databases]
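A collection cache in front of an unreliable network can be sketched as a bounded store-and-forward buffer. The class name, the capacity, and the drop-oldest policy are assumptions; dropping the oldest readings is only acceptable because this use case treats status messages as not requiring a guarantee.

```python
from collections import deque


class CollectionCache:
    """Buffer sensor readings locally and forward them when the link allows."""

    def __init__(self, capacity: int) -> None:
        # With maxlen set, appending to a full deque discards the oldest entry.
        self.buffer = deque(maxlen=capacity)

    def record(self, reading) -> None:
        self.buffer.append(reading)

    def flush(self, send) -> int:
        """Send buffered readings oldest first; stop at the first failure.

        `send` is a caller-supplied callable returning True on success.
        Returns the number of readings forwarded.
        """
        sent = 0
        while self.buffer:
            if not send(self.buffer[0]):
                break  # link is down; keep the reading for the next attempt
            self.buffer.popleft()
            sent += 1
        return sent


cache = CollectionCache(capacity=3)
for reading in range(5):  # five readings arrive while the link is down
    cache.record(reading)

received = []
print(cache.flush(lambda r: False))                        # link still down
print(cache.flush(lambda r: received.append(r) or True))   # link restored
print(received)
```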
Oil & Gas Pipeline Monitoring; Similar Use Cases
o Smart Grid
  ‐ Small messages
  ‐ Massive distribution
o Transportation Monitoring
  ‐ Fewer endpoints
  ‐ Bigger messages
o Retail Point of Sale
  ‐ More predictable networks
  ‐ Guarantee more important
[Figure: radar chart comparing these use cases]
Real-Time Sports Betting
o Huge message volumes (in this case fan-out)
o Low level of guarantee for any one outbound message
o High level of guarantee for inbound messages
o Tiny messages
o Network is the internet + mobile carriers
o Latency (beyond network latency) is important
[Figure: radar chart of the six factors for this use case]
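When no single outbound message needs a guarantee, one widely used technique is conflation: a slow client receives only the latest value per key instead of every intermediate update. The slides do not name this technique; the class and the event keys below are hypothetical.

```python
class ConflatingQueue:
    """Keep only the most recent update per key for a slow subscriber."""

    def __init__(self) -> None:
        self._latest = {}

    def publish(self, key: str, value: float) -> None:
        # A newer update overwrites the older one that was never sent.
        self._latest[key] = value

    def drain(self) -> list:
        """Return the pending (key, value) pairs and clear the queue."""
        items = list(self._latest.items())
        self._latest.clear()
        return items


queue = ConflatingQueue()
queue.publish("match-1", 1.80)  # hypothetical odds updates
queue.publish("match-2", 2.10)
queue.publish("match-1", 1.90)  # supersedes the earlier match-1 update
first_batch = queue.drain()
print(first_batch)
```

Inbound bets need the opposite treatment: each one must be processed, so they should not pass through a conflating path.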
Real-Time Sports Betting; Architecture
[Diagram components: Mobile Customers, Web Customers, Data Streaming, Clickstream & Marketing Analytics, Streaming Odds Data, Big Data, Message Bus (Low Latency, Huge Connection Counts), Customer & Betting Apps, Security & Fraud Detection]
[Presenter note: Highlight the degree of fan out, connection counts, event logging, real time analysis for odds adjustment.]
Real-Time Sports Betting; Similar Use Cases
o Mobile Social Updates
  ‐ Latency less important
  ‐ Distribution far greater
o Real Time Travel Alerting
  ‐ Each message more important
  ‐ Volumes much lower
o Market Data Distribution
  ‐ Latency even more important
  ‐ Volumes often much higher
  ‐ Loss often tolerable
[Figure: radar chart comparing these use cases]
[Figure: radar chart with six axes: Required Latency, Network Distance, Number of Messages, Degree of Distribution, Size of Messages, Delivery Guarantee Importance]
Summary
Questions?