Towards Efficient and Deterministic Dataflow Systems for Machine Learning

Author(s): Jacky Kwok

Abstract
This thesis brings together two papers on optimizing a dataflow programming language for machine learning workloads using the reactor model. The first paper introduces an efficient parallel reinforcement learning framework that outperforms existing solutions, such as Ray, in simulation throughput, multi-agent inference, and training on a single node. The proposed approach achieves this by using the reactor model to reduce synchronization work and by optimizing the coordination of Python worker threads to decrease I/O overhead. This work has been accepted as a full paper at the 36th ACM Symposium on Parallelism in Algorithms and Architectures. The second paper presents a High-Performance Robotic Middleware (HPRM), which builds on the reactor model and employs optimizations including an in-memory object store, adaptive serialization, and an eager protocol with real-time sockets to ensure low-latency, deterministic communication for autonomous systems. HPRM demonstrates substantial latency reduction compared to the Robot Operating System 2 (ROS 2) and achieves higher throughput in CARLA autonomous driving applications. Together, these two papers contribute to the goal of developing high-performance and reliable systems for machine learning by leveraging the benefits of the reactor model and optimized communication mechanisms.

Citation Formats

  • APA

    Kwok, J. (2024). Towards Efficient and Deterministic Dataflow Systems for Machine Learning (Master's thesis). University of California, Berkeley.

  • MLA

    Kwok, Jacky. "Towards Efficient and Deterministic Dataflow Systems for Machine Learning." Master's thesis, University of California, Berkeley, 2024.

  • Chicago

    Kwok, Jacky. "Towards Efficient and Deterministic Dataflow Systems for Machine Learning." Master's thesis, University of California, Berkeley, 2024.

  • BibTeX

    @thesis{Kwok:24:DeterministicML,
    author = {Jacky Kwok},
    title = {Towards Efficient and Deterministic Dataflow Systems for Machine Learning},
    type = {Master's thesis},
    school = {University of California, Berkeley},
    month = {May},
    year = {2024},
    abstract = {This thesis brings together two papers on optimizing a dataflow programming language for machine learning workloads using the reactor model. The first paper introduces an efficient parallel reinforcement learning framework that outperforms existing solutions, such as Ray, in simulation throughput, multi-agent inference, and training on a single node. The proposed approach achieves this by using the reactor model to reduce synchronization work and by optimizing the coordination of Python worker threads to decrease I/O overhead. This work has been accepted as a full paper at the 36th ACM Symposium on Parallelism in Algorithms and Architectures. The second paper presents a High-Performance Robotic Middleware (HPRM), which builds on the reactor model and employs optimizations including an in-memory object store, adaptive serialization, and an eager protocol with real-time sockets to ensure low-latency, deterministic communication for autonomous systems. HPRM demonstrates substantial latency reduction compared to the Robot Operating System 2 (ROS 2) and achieves higher throughput in CARLA autonomous driving applications. Together, these two papers contribute to the goal of developing high-performance and reliable systems for machine learning by leveraging the benefits of the reactor model and optimized communication mechanisms.},
    url = {https://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-76.html}
    }