Apache Samza

Original author(s)	LinkedIn
Developer(s)	Apache Software Foundation
Stable release	1.8.0 / 17 January 2023^[1]
Repository	Samza Repository
Written in	Scala, Java
Operating system	Cross-platform
Type	Distributed stream processing
License	Apache License 2.0
Website	samza.apache.org

Apache Samza is an open-source, near-real-time, asynchronous computational framework for stream processing, developed by the Apache Software Foundation in Scala and Java. It was developed in conjunction with Apache Kafka, and both were originally created at LinkedIn.^[2]

Overview

Samza allows users to build stateful applications that process data in real time from multiple sources, including Apache Kafka.

Samza provides fault tolerance, isolation and stateful processing. Unlike batch systems such as Apache Hadoop or Apache Spark, it computes and emits output continuously, which results in sub-second^[3] response times.
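The difference between continuous, stateful computation and batch computation can be illustrated with a minimal sketch. The example below is plain Java and deliberately does not use the Samza API; the class name RunningCountSketch, the process method and the sample events are hypothetical and chosen only for illustration. It keeps a running count per key as local state and produces an updated result for every incoming event, instead of waiting for a complete input set as a batch job would.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: plain Java, not the Samza API.
public class RunningCountSketch {

    // Local state kept across events: key -> number of events seen so far.
    private final Map<String, Long> counts = new HashMap<>();

    // Handles one event as it arrives and returns the updated count at once,
    // rather than waiting for a complete batch of input.
    public long process(String key) {
        return counts.merge(key, 1L, Long::sum);
    }

    public static void main(String[] args) {
        RunningCountSketch task = new RunningCountSketch();
        // A hypothetical stream of events keyed by user name.
        String[] events = {"alice", "bob", "alice"};
        for (String key : events) {
            // Output is produced per event, so results are always current.
            System.out.println(key + " -> " + task.process(key));
        }
    }
}
```

A production framework would additionally have to persist and restore this state to survive failures; the in-memory map here stands in for that state only conceptually.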

Real-time stream processing is a crowded field, and Samza is one of its more mature products.^[4]^[5]^[6] It joined the Apache Software Foundation in 2013.^[7]

Samza is used by multiple companies.^[8] The largest installation is at LinkedIn.

References

[1] "Announcing the release of Apache Samza 1.8.0". Retrieved 2024-03-28.
[2] "How LinkedIn Uses Apache Samza". InfoQ. Retrieved 2016-09-28.
[3] "Samza: Stateful Scalable Stream Processing at LinkedIn" (PDF).
[4] "Spark Streaming vs Flink vs Storm vs Kafka Streams vs Samza: Choose Your Stream Processing Framework". www.linkedin.com. Retrieved 2019-07-23.
[5] "Comparing Apache Spark, Storm, Flink and Samza stream processing engines - Part 1". Scott Logic. Retrieved 2019-07-23.
[6] "Hadoop, Storm, Samza, Spark, and Flink: Big Data Frameworks Compared". DigitalOcean. Retrieved 2019-07-23.
[7] "Apache Samza". blogs.apache.org. Archived from the original on 2013-11-15. Retrieved 2019-07-23.
[8] "Samza - Powered By". samza.apache.org. Retrieved 2019-07-23.

External links

Apache Samza website (samza.apache.org)


