MUSES is a large-scale dataset for temporal event (action) localization. It focuses on the temporal localization of multi-shot events, which are captured with multiple shots. Such events often appear in edited videos, such as TV shows and movies.
What’s included in MUSES:
Paper | Code | Results | Date | Stars |
---|