MultiModal Machine Learning

# BLIP: Bootstrapping Language-Image Pre-training

- Overview
- Limitations: most models lack flexibility, and web data is noisy
- Solution
- How BLIP works
- Model architecture
- CapFilt
- Future outlook
- Summary
- BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models