Original author(s) | |
---|---|
Developer(s) | Kubeflow Contributors[1] - AWS, Bloomberg, Google, IBM, NVIDIA, Nutanix, Red Hat, Arrikto, and others |
Initial release | April 5, 2018[2] |
Stable release | 1.8[3] / November 1, 2023 |
Repository | github |
Written in | Go, Python |
Platform | Kubernetes |
Type | Machine Learning Platform |
License | Apache License 2.0 |
Website | kubeflow |
Kubeflow is an open-source platform for machine learning and MLOps on Kubernetes, introduced by Google. Separate software components cover the stages of a typical machine learning lifecycle: model development (Kubeflow Notebooks[4]), model training (Kubeflow Pipelines,[5] Kubeflow Training Operator[6]), model serving (KServe[lower-alpha 1][7]), and automated machine learning (Katib[8]).
Each component can be deployed separately; deploying all of them is not required.[9]
History
The Kubeflow project was first announced at KubeCon + CloudNativeCon North America 2017 by Google engineers David Aronchick, Jeremy Lewi, and Vishnu Kannan[10] to address a perceived lack of flexible options for building production-ready machine learning systems.[11] The project has also stated that it began as a way for Google to open-source how it ran TensorFlow internally.[12]
The first release of Kubeflow (Kubeflow 0.1) was announced at KubeCon + CloudNativeCon Europe 2018,[13] accompanied by a claim that the project already ranked among the top 2% of GitHub projects.[14] Kubeflow 1.0 was released in March 2020, with a public blog post announcing that many Kubeflow components were graduating to a "stable status", indicating that they were ready for production use.[15]
In October 2022, Google announced that the Kubeflow project had applied to join the Cloud Native Computing Foundation.[16][17] In July 2023, the foundation voted to accept Kubeflow as an incubating stage project.[18][19]
Components
Kubeflow Notebooks for model development
Machine learning models are developed in the Kubeflow Notebooks component. The component runs web-based development environments inside a Kubernetes cluster, with native support for Jupyter Notebook, Visual Studio Code, and RStudio.[20]
Kubeflow Pipelines for model training
Once developed, models are trained in the Kubeflow Pipelines component. The component acts as a platform for building and deploying portable, scalable machine learning workflows based on Docker containers.[21] Google Cloud Platform has adopted the Kubeflow Pipelines DSL within its Vertex AI Pipelines product.[22]
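The idea of a workflow assembled from containerized steps can be illustrated with a minimal conceptual sketch in plain Python. It does not use the Kubeflow Pipelines SDK: the step names, the image names, and the `execution_order` helper are all hypothetical, and the standard-library `graphlib` module merely stands in for the scheduling a real pipeline backend would perform.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step names a container image and the steps
# it must run after. None of these names come from Kubeflow itself.
PIPELINE = {
    "preprocess": {"image": "example.com/preprocess:latest", "after": []},
    "train": {"image": "example.com/train:latest", "after": ["preprocess"]},
    "evaluate": {"image": "example.com/evaluate:latest", "after": ["train"]},
}


def execution_order(pipeline):
    """Return step names in an order that respects every dependency."""
    graph = {name: set(step["after"]) for name, step in pipeline.items()}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for name in execution_order(PIPELINE):
        print(f"run {name} in container {PIPELINE[name]['image']}")
```

Describing the workflow as data (steps plus dependencies) rather than as a fixed script is what lets the same definition be moved between clusters; the sketch only captures that ordering aspect.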
Kubeflow Training Operator for model training
For certain machine learning frameworks and libraries, the Kubeflow Training Operator component provides Kubernetes custom resources. The component runs distributed or non-distributed TensorFlow, PyTorch, Apache MXNet, XGBoost, and MPI training jobs on Kubernetes.[6]
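As a rough illustration of what such a custom resource looks like, the sketch below builds a PyTorchJob manifest as a plain Python dictionary and prints it as JSON. The field names reflect the `kubeflow.org/v1` PyTorchJob resource as commonly documented and should be checked against the installed operator version; the job name, the image, and the `pytorch_job` helper are hypothetical placeholders.

```python
import json


def pytorch_job(name, image, workers):
    """Build a PyTorchJob manifest as a plain dict (all values are placeholders)."""

    def replica(count):
        # One replica group: how many pods to run and which container they run.
        return {
            "replicas": count,
            "restartPolicy": "OnFailure",
            "template": {
                "spec": {"containers": [{"name": "pytorch", "image": image}]}
            },
        }

    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "PyTorchJob",
        "metadata": {"name": name},
        "spec": {
            "pytorchReplicaSpecs": {
                "Master": replica(1),
                "Worker": replica(workers),
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical job name and image, for illustration only.
    print(json.dumps(pytorch_job("demo-job", "example.com/train:latest", 2), indent=2))
```

The printed JSON could in principle be saved to a file and submitted to a cluster where the operator is installed; the sketch itself only constructs the document and contacts no cluster.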
KServe for model serving
The KServe component (previously named KFServing[23]) provides Kubernetes custom resources for serving machine learning models on arbitrary frameworks including TensorFlow, XGBoost, scikit-learn, PyTorch, and ONNX.[24] KServe was developed collaboratively by Google, IBM, Bloomberg, NVIDIA, and Seldon.[23] Publicly disclosed adopters of KServe include Bloomberg,[25] Gojek,[26] and others.[27]
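A serving resource can be sketched in the same way. The example below assembles an InferenceService manifest as a Python dictionary; the field layout follows the `serving.kserve.io/v1beta1` resource as commonly documented and should be verified against the KServe version in use. The service name, the storage location, and the `inference_service` helper are hypothetical.

```python
import json


def inference_service(name, model_format, storage_uri):
    """Build an InferenceService manifest as a plain dict (values are placeholders)."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": {
            "predictor": {
                "model": {
                    # The framework the stored model was saved with.
                    "modelFormat": {"name": model_format},
                    # Where the serving runtime should fetch the model from.
                    "storageUri": storage_uri,
                }
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical name and bucket path, for illustration only.
    manifest = inference_service("demo-model", "sklearn", "gs://example-bucket/model")
    print(json.dumps(manifest, indent=2))
```

Only the model format and the storage location change between frameworks in this sketch, which mirrors the claim that one resource type can serve models from several frameworks.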
Katib for automated machine learning
Kubeflow also includes Katib, a component for automated training and development of machine learning models. It is described as a Kubernetes-native project and offers hyperparameter tuning, early stopping, and neural architecture search.[28]
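Hyperparameter tuning, in its simplest form, can be sketched as a random search over one parameter. The code below is a conceptual illustration only and does not use Katib or its API: the `objective` function is a toy stand-in for a real training run, and the parameter range, trial count, and function names are assumptions chosen for the example.

```python
import random


def objective(learning_rate):
    """Toy stand-in for a training run: the score peaks at learning_rate = 0.1."""
    return 1.0 - (learning_rate - 0.1) ** 2


def random_search(trials, low, high, seed=0):
    """Sample learning rates uniformly and return the best (rate, score) pair."""
    rng = random.Random(seed)  # fixed seed keeps the search reproducible
    best = None
    for _ in range(trials):
        rate = rng.uniform(low, high)
        score = objective(rate)
        if best is None or score > best[1]:
            best = (rate, score)
    return best


if __name__ == "__main__":
    rate, score = random_search(trials=20, low=0.001, high=0.5)
    print(f"best learning rate {rate:.4f} with score {score:.4f}")
```

A tuning system runs each trial as a separate training job and may use more sophisticated search strategies; the loop above shows only the sample, evaluate, and keep-the-best structure.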
Release timeline
Notes
- KServe was previously known as KFServing.
References
- "Kubeflow Website - Working Groups".
- "Kubeflow 0.1 - Release Tag". GitHub.
- "Kubeflow 1.8 - Release Information".
- "Kubeflow Website - Kubeflow Notebooks".
- "Kubeflow Website - Kubeflow Pipelines".
- "Kubeflow GitHub - Kubeflow Training Operator". GitHub.
- "Kubeflow Website - KServe".
- "Kubeflow Website - Katib".
- "Kubeflow Website - Installing Kubeflow".
- "'Hot Dogs or Not' - At Scale with Kubernetes [I] - Vish Kannan & David Aronchick, Google". YouTube.
- "Introducing Kubeflow - A Composable, Portable, Scalable ML Stack Built for Kubernetes". 21 December 2017.
- "Kubeflow Website - History".
- "Google-led Kubeflow, machine learning for Kubernetes, begins to take shape". 4 May 2018.
- "Announcing Kubeflow 0.1". 4 May 2018.
- "Kubeflow 1.0: Cloud-Native ML for Everyone". 2 March 2020.
- Lamkin, Thea (2022-10-24). "Kubeflow has applied to become a CNCF incubating project". Kubeflow. Retrieved 2023-11-02.
- "Kubeflow applies to become a CNCF incubating project". Google Open Source Blog. 2022-10-24. Retrieved 2023-11-02.
- "Kubeflow brings MLOps to the CNCF Incubator". Cloud Native Computing Foundation. 2023-07-25. Retrieved 2023-11-02.
- "Kubeflow joins the CNCF family". Google Open Source Blog. 2023-07-25. Retrieved 2023-11-02.
- "Kubeflow Website - Kubeflow Notebooks Overview".
- "Kubeflow Website - Kubeflow Pipelines Introduction".
- "Vertex AI - Building a pipeline".
- "KServe: The next generation of KFServing". 27 September 2021.
- "KServe GitHub". GitHub.
- "The journey to build Bloomberg's ML Inference Platform Using KServe (formerly KFServing)". Bloomberg L.P. 12 October 2021.
- "Merlin: Making ML Model Deployments Magical".
- "KServe Website - Adopters of KServe".
- "Kubeflow GitHub - Katib". GitHub.
- "Kubeflow 0.2 - Release Tag". GitHub.
- "Kubeflow 0.3 - Release Tag". GitHub.
- "Kubeflow 0.4 - Release Tag". GitHub.
- "Kubeflow 0.5 - Release Tag". GitHub.
- "Kubeflow 0.6 - Release Information".
- "Kubeflow 0.7 - Release Information".
- "Kubeflow 1.0 - Release Information".
- "Kubeflow 1.1 - Release Information".
- "Kubeflow 1.2 - Release Information".
- "Kubeflow 1.3 - Release Information".
- "Kubeflow 1.4 - Release Information".
- "Kubeflow 1.5 - Release Information".
- "Kubeflow 1.6 - Release Information".
- "Kubeflow 1.7 - Release Information".