EuroSys 25 SkyServe

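From the well-known UCB Sky Computing Lab. They try to run LLM serving in the cloud, and the scenario they consider is spot inference: cloud instances are in short supply, so capacity frequently scales up and down. For this dynamic setting they build an inference engine scheduling system that provides fault tolerance and load balancing. AI is also a microservice SkyServe first treats the LLM service as a microservice. In this case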

SC 24 Brief Summary 4

Main link: https://www.haibinlaiblog.top/index.php/sc-2024-passage/ Parallel Program Analysis and Code Optimization MCFuser: High-performance and Rapid-fusion of Memory-bound Compute-intensive Operators Aut

SC Paper Reading 3

Main link: https://www.haibinlaiblog.top/index.php/sc-2024-passage/ Paper Computational Efficiency and Learning Techniques Murali Emani B311 Accelerators, Applications and Application Frameworks, Artificial Int

OS Project part I VirtIO, a brief summary

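Building a VirtIO-based device driver. What does a device driver need to do? Initialize the device; read data from the hardware and pass it into the kernel; read data from the kernel and write it to the hardware; detect and handle device errors. Intro: Virtualization. Full virtualization means that the virtualization software (VMM) follows the hardware specification and fully emulates the hardware logic. This approach is transparent to the guest operating system, that is, the guest operating system needs no modification. To a driver, a fully virtualized device is no different from a physical hardware device. Fully virtualized devices have lower performance, because strictly following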

OS: Virtual Memory

Author: Haibin Lai 12211612 OS: Virtual Memory - Haibin's blog Q1 Address Translation Explain how the CPU hardware and the operating system cooperate during address translation. Ans: T