MovieSCC

Keywords: Movie scene classification, deep learning, content-based video analysis, film narrative, clustering, MovieSCC

1. Introduction

Cinema is a complex multimodal art form. For decades, film analysis relied on manual annotation by scholars and archivists. With platforms like Netflix, Disney+, and YouTube hosting millions of hours of video, automated scene understanding has become critical for indexing, recommendation, and accessibility.

Existing video classification models (e.g., VideoBERT, TimeSformer) often treat movies as generic video streams, ignoring distinctly cinematic structures such as shot transitions, pacing, and narrative tropes. MovieSCC fills this gap by focusing specifically on scene-level classification. The scene is the atomic narrative unit of a film, typically comprising multiple shots unified by time and location.
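To make the shot/scene distinction concrete, the following is a minimal Python sketch of a scene as a run of consecutive shots that share a location and a story time. It is an illustration of the definition only, not the MovieSCC method: the `Shot` and `Scene` classes, the `location` and `story_time` labels, and the `group_shots_into_scenes` helper are all hypothetical names, and real footage does not come with such labels (scene boundaries would normally have to be inferred from audiovisual features).

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Shot:
    """One continuous camera take; all fields are hypothetical annotations."""
    start_s: float   # shot start time in seconds
    end_s: float     # shot end time in seconds
    location: str    # assumed label, e.g. "kitchen"
    story_time: str  # assumed label, e.g. "night-1"


@dataclass(frozen=True)
class Scene:
    """A run of consecutive shots sharing location and story time."""
    shots: tuple

    @property
    def start_s(self) -> float:
        return self.shots[0].start_s

    @property
    def end_s(self) -> float:
        return self.shots[-1].end_s


def group_shots_into_scenes(shots):
    """Merge consecutive shots with identical (location, story_time) labels."""
    ordered = sorted(shots, key=lambda s: s.start_s)
    # groupby only merges adjacent items, so a return to an earlier
    # location later in the film starts a new scene.
    return [
        Scene(tuple(run))
        for _, run in groupby(ordered, key=lambda s: (s.location, s.story_time))
    ]


shots = [
    Shot(0.0, 4.2, "kitchen", "night-1"),
    Shot(4.2, 9.0, "kitchen", "night-1"),
    Shot(9.0, 15.5, "street", "night-1"),
    Shot(15.5, 20.0, "kitchen", "day-2"),
]
scenes = group_shots_into_scenes(shots)
print([(sc.start_s, sc.end_s, len(sc.shots)) for sc in scenes])
# → [(0.0, 9.0, 2), (9.0, 15.5, 1), (15.5, 20.0, 1)]
```

In this toy example the first two kitchen shots merge into one scene, while the later kitchen shot forms its own scene because the story time has changed. A scene-level classifier would then assign a label to each `Scene` rather than to individual shots or frames.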

[3] Rao, A., et al. (2020). SceneFormer: Inductive bias for video scene segmentation. ECCV.

[5] Devlin, J., et al. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. NAACL.