Abstract: Object detection is crucial for ensuring safe autonomous driving. However, data-driven approaches face challenges when encountering minority or novel objects in the 3D driving scene. In this ...
Abstract: Large contrastive vision-language models (VLMs) have recently shown promise in skeleton-based action recognition. However, given the lack of skeleton frame-text training datasets for VLMs, ...