The multi-volume set of LNCS books with volume numbers 15301-15333 constitutes the refereed proceedings of the 27th International Conference on Pattern Recognition, ICPR 2024, held in Kolkata, India, during December 1–5, 2024.
The 963 papers presented in these proceedings were carefully reviewed and selected from a total of 2106 submissions. They deal with topics such as Pattern Recognition; Artificial Intelligence; Machine Learning; Computer Vision; Robot Vision; Machine Vision; Image Processing; Speech Processing; Signal Processing; Video Processing; Biometrics; Human-Computer Interaction (HCI); Document Analysis; Document Recognition; Biomedical Imaging; Bioinformatics.
ODTr: Transformer Integrating OCR Auxiliary Map and Image Depth Information for Document Image Unwarping.- Oracle Character Recognition Based on Attention Enhancement and Multi-level Feature Fusion.- DocHFormer: Document Image Dewarping via Harmonized Modeling of Hierarchical Priors.- Document Image Shadow Removal via Frequency Information-oriented Network.- Improving Online Handwriting Recognition with Transfer Learning Using Out-of-Domain and Different-Dimensional Sources.- ROISER: Towards Real World Semantic Entity Recognition from Visually-rich Documents.- Perception-Enhanced Generative Transformer for Key Information Extraction from Documents.- MuLAD: Multimodal Aggression Detection from Social Media Memes Exploiting Visual and Textual Features.- E4: A Voting-Based Paradigm for Enhancing Retrieval Augmented Generation.- Improving Chinese Emotion Classification based on Bilingual Feature Fusion.- SNOBERT: A Benchmark for clinical notes entity linking in the SNOMED CT clinical terminology.- Enhancing Automated Short Answer Grading with Prompt-Driven Augmentation and Prompt Adaptive Oversampling.- SANS: Spatial-aware Neural Solver for Plane Geometry Problem.- Cross Lingual Synopsis Generation in English, Dutch, Vietnamese, Indonesian, Russian, Portuguese, Korean, Hindi and French.- A Multi-Modal Framework to Counter Hate Speeches.- TBIA-DBNet: A Two-Branch Image-Adaptive DBNet for Scene Text Detection in Real-World Foggy Scenes.- Breaking Boundaries: Enhancing Script Identification Using a Learnable MULLER Resizer.- Arbitrary-Shaped Scene Text Recognition with Deformable Ensemble Attention.- Primary Key Free Watermarking for Numerical Tabular Datasets in Machine Learning.- Offline Handwritten Signature Verification Using a Stream-Based Approach.- OCR4HSV: A Multi-Task Learning Approach for Handwritten Signature Verification.- Learning explicit radical representations for Zero-shot Chinese character recognition.- Deep Learning for Arabic Word Classification: Leveraging Transfer Learning and Grad-CAM for Morphological Analysis.- A cost minimization approach to fix the vocabulary size in a tokenizer for an End-to-End ASR system.- ASD-Diffusion: Anomalous Sound Detection with Diffusion Models.- FCHiFi-GAN: Aggrandizing Fast Convergence with Batchwise Normalization.- Adaptive Enhanced Reversible Flow Model for Remote sensing Image Super Resolution.- Saliency-based Neural Representation for Videos.- HNRC: Lightweight Image Compression with Hybrid Neural Representation.