Group 9: Real-Time Stream Processing
This group focuses on continuous ingestion and near-real-time transformation and querying of event data. The relevant Azure services are Event Hubs (ingestion), Kafka on HDInsight (legacy and OSS alignment), Azure Stream Analytics (managed, SQL-like stream processing), and Data Explorer (Kusto) (low-latency analytics and materialized transformations). Common patterns pair an ingestion layer (Event Hubs or Kafka) with a processing layer (Stream Analytics, Spark, or Kusto) and downstream sinks (ADLS, Synapse, Databricks).
Latency Budgeting: Partition the overall end-to-end SLO into separate budgets for ingest buffering, transformation, and query serving. Choose the smallest toolchain that fits within the budget while remaining maintainable.
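
The budgeting step can be sketched as a small check that the per-stage budgets add up to no more than the SLO and that no stage exceeds its share. This is a minimal illustration, not a prescribed method: the `StageBudget` type, the `check_budget` helper, and all millisecond figures are hypothetical and chosen only for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class StageBudget:
    name: str
    budget_ms: float    # share of the end-to-end SLO allotted to this stage
    observed_ms: float  # measured stage latency (hypothetical figure)


def check_budget(slo_ms: float, stages: list[StageBudget]) -> dict:
    """Report whether the stage budgets fit the SLO and which stages overrun."""
    allocated = sum(s.budget_ms for s in stages)
    overruns = [s.name for s in stages if s.observed_ms > s.budget_ms]
    return {
        "allocated_ms": allocated,
        "headroom_ms": slo_ms - allocated,
        "fits_slo": allocated <= slo_ms,
        "overruns": overruns,
    }


# Illustrative numbers only: a 5-second end-to-end SLO split across three stages.
stages = [
    StageBudget("ingest_buffering", budget_ms=1000, observed_ms=800),
    StageBudget("transformation", budget_ms=2500, observed_ms=2700),
    StageBudget("query_serving", budget_ms=1000, observed_ms=400),
]
report = check_budget(5000, stages)
print(report)
```

In this made-up case the budgets fit the SLO with headroom to spare, but the transformation stage overruns its own share, which is the kind of signal that would prompt a simpler or faster processing choice.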
Services & Roles
Key Differences
| Service | Primary Role | Strengths | When to Prefer |
|---|---|---|---|
| Event Hubs | High-scale ingestion | Managed partitions, protocol gateways | Massive telemetry/firehose |
| Kafka (HDI) | OSS-compatible broker | Ecosystem plugins | Strict Kafka API parity |
| Stream Analytics | Declarative temporal queries | SQL-like windowing | Low-ops rapid streaming jobs |
| Data Explorer | Low-latency analytics | KQL, materialized views | Ad-hoc + time-series blend |
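
To make the "SQL-like windowing" strength in the table concrete, the sketch below computes a tumbling-window count in plain Python. It only illustrates the semantics that a declarative temporal query expresses in one statement; it is not how Stream Analytics is implemented. The `tumbling_counts` helper and the sample events are hypothetical.

```python
from collections import Counter


def tumbling_counts(events, window_s):
    """Count events per (window_start, key) over fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_seconds, key) pairs; each event
    belongs to exactly one window, which is what makes the window "tumbling".
    """
    counts = Counter()
    for ts, key in events:
        window_start = (ts // window_s) * window_s  # floor to the window boundary
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical device events: (timestamp in seconds, device id).
events = [(0, "a"), (3, "a"), (9, "b"), (10, "a"), (14, "b"), (21, "a")]
counts = tumbling_counts(events, 10)
for (window_start, key), n in sorted(counts.items()):
    print(f"window starting at {window_start}s, key {key}: {n}")
```

A hand-written version like this also has to deal with late or out-of-order events, checkpointing, and scale-out, which is the operational burden a managed declarative service is meant to remove.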
Selection Model
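Each criterion C_x is a slider value from 0 to 10 that sets how much that concern matters. Each service score is a weighted linear combination of its criteria, with weights that sum to 1. Terms of the form (10 - C_x) invert a criterion: for example, a lower need for custom code raises the scores of the managed services (Stream Analytics and Kusto).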
Score_EventHubs = 0.24*C_ingest + 0.20*C_throughput + 0.16*C_protocol + 0.14*C_latency + 0.14*C_consumerFanout + 0.12*(10 - C_customBrokers)
Score_Kafka = 0.26*C_customBrokers + 0.20*C_protocol + 0.18*C_throughput + 0.14*C_partitionCtrl + 0.12*C_latency + 0.10*(10 - C_opsSimp)
Score_StreamAn = 0.26*C_declTransform + 0.20*C_temporal + 0.16*C_latency + 0.14*C_operSimp + 0.14*C_sinkVar + 0.10*(10 - C_customCode)
Score_Kusto = 0.25*C_adHoc + 0.20*C_timeSeries + 0.18*C_latency + 0.14*C_material + 0.13*C_scaleQuery + 0.10*(10 - C_customCode)
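
The four score formulas can be evaluated with a short script such as the one below. It is a sketch under stated assumptions: the weights and criterion names are copied from the formulas (without the `C_` prefix), while the `WEIGHTS` table layout, the `score` and `rank` helpers, and the sample slider values are hypothetical. The source spells the operational-simplicity criterion both `C_opsSimp` and `C_operSimp`; the sketch keeps them as two separate inputs rather than guessing that they are the same slider.

```python
# (criterion, weight, inverted) per term; inverted terms contribute weight * (10 - value).
WEIGHTS = {
    "EventHubs": [("ingest", 0.24, False), ("throughput", 0.20, False),
                  ("protocol", 0.16, False), ("latency", 0.14, False),
                  ("consumerFanout", 0.14, False), ("customBrokers", 0.12, True)],
    "Kafka": [("customBrokers", 0.26, False), ("protocol", 0.20, False),
              ("throughput", 0.18, False), ("partitionCtrl", 0.14, False),
              ("latency", 0.12, False), ("opsSimp", 0.10, True)],
    "StreamAn": [("declTransform", 0.26, False), ("temporal", 0.20, False),
                 ("latency", 0.16, False), ("operSimp", 0.14, False),
                 ("sinkVar", 0.14, False), ("customCode", 0.10, True)],
    "Kusto": [("adHoc", 0.25, False), ("timeSeries", 0.20, False),
              ("latency", 0.18, False), ("material", 0.14, False),
              ("scaleQuery", 0.13, False), ("customCode", 0.10, True)],
}


def score(service, sliders):
    """Weighted linear score for one service from 0-10 slider values."""
    total = 0.0
    for name, weight, inverted in WEIGHTS[service]:
        value = sliders[name]  # KeyError if a required slider is missing
        if not 0 <= value <= 10:
            raise ValueError(f"slider {name!r} must be in [0, 10], got {value}")
        total += weight * ((10 - value) if inverted else value)
    return total


def rank(sliders):
    """Return (service, score) pairs, highest score first."""
    scored = [(svc, score(svc, sliders)) for svc in WEIGHTS]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical inputs: every slider at 10 except no need for custom brokers.
all_names = {name for terms in WEIGHTS.values() for name, _, _ in terms}
sliders = {name: 10 for name in all_names}
sliders["customBrokers"] = 0

for svc, val in rank(sliders):
    print(f"{svc}: {val:.2f}")
```

Because each weight set sums to 1, every score stays on the same 0 to 10 scale as the sliders, so the four results are directly comparable.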
Interpretation
- Event Hubs: Favor when managed ingestion scale & multi-consumer patterns dominate.
- Kafka: Choose for custom broker plugins or strict ecosystem parity.
- Stream Analytics: Rapid SQL-like temporal processing without managing clusters.
- Kusto: Choose when streaming ingestion must be combined with powerful time-series and ad-hoc queries.
When NOT to Use
- Microsecond-latency workloads such as trading (consider specialized infrastructure).
- Simple nightly batch jobs (big-data batch tools are cheaper and simpler).
- A single event source with a trivial transform (this can often be inlined in a function).