The multi-volume set of LNCS books with volume numbers 15301-15333 constitutes the refereed proceedings of the 27th International Conference on Pattern Recognition, ICPR 2024, held in Kolkata, India, during December 1–5, 2024. The 963 papers presented in these proceedings were carefully reviewed and selected from a total of 2106 submissions. They deal with topics such as Pattern Recognition; Artificial Intelligence; Machine Learning; Computer Vision; Robot Vision; Machine Vision; Image Processing; Speech Processing; Signal Processing; Video Processing; Biometrics; Human-Computer Interaction (HCI); Document Analysis; Document Recognition; Biomedical Imaging; Bioinformatics.
Edited by: Apostolos Antonacopoulos, Subhasis Chaudhuri, Rama Chellappa, Cheng-Lin Liu, Saumik Bhattacharya
Imprint: Springer International Publishing AG
Country of Publication: Switzerland
Volume: 15306
Dimensions: Height: 235mm, Width: 155mm
ISBN: 9783031781711
ISBN 10: 3031781716
Series: Lecture Notes in Computer Science
Pages: 480
Publication Date: 03 December 2024
Audience: Professional and scholarly, College/higher education, Undergraduate, Further / Higher Education
Format: Paperback
Publisher's Status: Active
TaylorShift: Shifting the Complexity of Self-Attention from Squared to Linear (and Back) using Taylor-Softmax
Balancing Accuracy and Efficiency in Budget-Aware Early-Exiting Neural Networks
An Evolutionary Search-Based Operator Fusion Method with Binary Representation for Deep Learning Inference Acceleration
SemFaceEdit: Semantic Face Editing on Generative Radiance Manifolds
(D^2)Styler: Advancing Arbitrary Style Transfer with Discrete Diffusion Methods
Mask-ControlNet: Higher-Quality Image Generation with An Additional Mask Prompt
Freestyle 3D-Aware Portrait Synthesis Based on Compositional Generative Priors
FUGAN: A GAN Based Facial Reconstructor For Accurate Unveiling Of Hidden Faces
Text2Street: Controllable Text-to-image Generation for Street Views
Make An Image Move: Few-shot based Video Generation Guided by CLIP
A Framework For Image Synthesis Using Supervised Contrastive Learning
TMCSPEECH: A Chinese TV and Movie Speech Dataset with Character Descriptions and a Character-Based Voice Generation Model
Deterministic Synthesis of Defect Images using Null Optimization
Adaptive Refiner based Few-Shot Font Generation
Controllable 3D object Generation with Single Image Prompt
Beyond Labels: Aligning Large Language Models with Human-like Reasoning
HindiLLM: Large Language Model for Hindi
StableTalk: Advancing Audio-to-Talking Face Generation with Stable Diffusion And Vision Transformer
Can LLMs perform structured graph reasoning tasks?
Improved Zero-Shot Image Editing via Null-Toon and Directed Delta Denoising Score
Texture Spectral Decorrelation Criteria
A Low Rank Gaussian Mixture Latent Model for Face Generation
Domain Adaptation for Machinery Fault Diagnosis Based on Critic Classifier GAN
Data Augmentation Pipeline for Enhanced UAV Surveillance
Generative Adversarial Networks for Imputing Sparse Learning Performance
SWave: Improving Vocoder Efficiency by Straightening the Waveform Generation Path
Outdoor Scene Relighting with Diffusion Models
Matching aggregate posteriors in the variational autoencoder
Efficient Nonlinear DAG Learning under Projection Framework
GCompletor: A Graph-based Deep Learning Method for Traffic State Imputation on Urban Road Networks