Full Program
Asia Pacific Signal and Information Processing Association Annual Summit and Conference 2022
The full conference program runs over three days, November 8-10, 2022, at the Empress Convention Center.
Session | Room | Chair | |
TuAM1-1 (SS13:Advanced Topics on Sound Event and Scene Analysis) | Chiang Mai 1 | Nobutaka Ono, Keisuke Imoto, Tatsuya Komatsu | |
Date | Time | Title | Authors |
8 November 2022 | 10.35-10.55 | On Sorting and Padding Multiple Targets for Sound Event Localization and Detection With Permutation Invariant and Location-Based Training | Robin Scheibler; Tatsuya Komatsu; Yusuke Fujita; Michael Hentschel |
10.55-11.15 | How Information on Acoustic Scenes and Sound Events Mutually Benefits Event Detection and Scene Classification Tasks | Ami Igarashi; Keisuke Imoto; Yuka Komatsu; Shunsuke Tsubaki; Shuto Hario; Tatsuya Komatsu | |
11.15-11.35 | Compressed Sensing of Sparse Spectrum Using Distributed Sound-To-Light Conversion Device Blinkies | Satoshi Motoyama; Natsuki Ueno; Yuma Kinoshita; Nobutaka Ono | |
11.35-11.55 | CochlScene: Acquisition of Acoustic Scene Data Using Crowdsourcing | Il-Young Jeong; Jeongsoo Park | |
11.55-12.15 | Vision Transformer Based Audio Classification Using Patch-Level Feature Fusion | Juan Luo; Jielong Yang; Eng Siong Chng; Xionghu Zhong | |
12.15-12.35 | Self-Consistency Training With Hierarchical Temporal Aggregation for Sound Event Detection | Yunlong Li; Xiujuan Zhu; Mingyu Wang; Ying Hu | |
Session | Room | Chair | |
TuAM1-2 (Speech, Language, and Audio 1) | Chiang Mai 2 | Tomoki Toda | |
Date | Time | Title | Authors |
8 November 2022 | 10.35-10.55 | Music Similarity Calculation of Individual Instrumental Sounds Using Metric Learning | Yuka Hashizume; Li Li; Tomoki Toda |
10.55-11.15 | Investigation of Noise-Reverberation-Robustness of Modulation Spectral Features for Speech-Emotion Recognition | Taiyang Guo; Sixia Li; Masashi Unoki; Shogo Okada | |
11.15-11.35 | Combine Waveform and Spectral Methods for Single-Channel Speech Enhancement | Miao Li; Hui Zhang; Xueliang Zhang | |
11.35-11.55 | Perceptual Loss Function for Speech Enhancement Based on Generative Adversarial Learning | Xin Bai; Xueliang Zhang; Hui Zhang; Haifeng Huang | |
11.55-12.15 | Joint Speech Activity and Overlap Detection With Multi-Exit Architecture | Ziqing Du; Kai Liu; Xucheng Wan; Huan Zhou | |
Session | Room | Chair | |
TuAM1-3 (Human Biometrics and Security Systems) | Chiang Mai 3 | Jessada Karnjana | |
Date | Time | Title | Authors |
8 November 2022 | 10.35-10.55 | On Wrist Vein Recognition for Human Biometrics | Felix Marattukalam; David Cole; Pranav Gulati; Waleed H. Abdulla |
10.55-11.15 | Continuous Authentication on Unconstrained Activities Using Window and Cycle Based Segmentation | Lina Septiana; Narishige Abe; Tomoaki Matsunami; Hidetsugu Uchida; Kazuki Osamura; Shigefumi Yamada | |
11.15-11.35 | Smoothed Teager Energy Cepstral Feature for Replay Attack Detection on Voice Assistants | Madhu R Kamble; Anand Therattil; Hemant A. Patil; M. Ali Basha Shaik; Vikram Vij | |
11.35-11.55 | Disentangled Speaker Representation Learning via Mutual Information Minimization | Sung Hwan Mun; Min Hyun Han; Minchan Kim; Dongjune Lee; Nam Soo Kim | |
11.55-12.15 | Contribution of Timbre and Shimmer Features to Deepfake Speech Detection | Anuwat Chaiwongyen; Norranat Songsriboonsit; Suradej Duangpummet; Jessada Karnjana; Waree Kongprawechnon; Masashi Unoki | |
12.15-12.35 | Combined 2D and 3D Convolution Residual Attention Network for Hand Gesture Recognition | Chang-Ting Tsai; Jian-Jiun Ding | |
Session | Room | Chair | |
TuAM1-4 (Signal Image and Information Processing Theory and Methods) | Board Room 2 | Daranee Hormdee | |
Date | Time | Title | Authors |
8 November 2022 | 10.35-10.55 | Investigate Bidirectional Functional Brain Networks Using Directed Information | Qiang Li |
10.55-11.15 | Effective ASR Error Correction Leveraging Phonetic, Semantic Information and N-Best Hypotheses | Hsin-Wei Wang; Bi-Cheng Yan; Yi-Cheng Wang; Berlin Chen | |
11.15-11.35 | A Lossless Audio Codec Based on Hierarchical Residual Prediction | Taiyo Mineo; Hayaru Shouno | |
11.35-11.55 | Investigating Low-Distortion Speech Enhancement With Discrete Cosine Transform Features for Robust Speech Recognition | Yu-Sheng Tsao; Jeih-weih Hung; Kuan-Hsun Ho; Berlin Chen | |
11.55-12.15 | Consistent MDT-Tucker: A Hankel Structure Constrained Tucker Decomposition in Delay Embedded Space | Ryuki Yamamoto; Hidekata Hontani; Akira Imakura; Tatsuya Yokota | |
12.15-12.35 | Sound Reproduction With a Circular Loudspeaker Array Using Differential Beamforming Method | Yankai Zhang; Jiayi Mao; Yefeng Cai; Chao Ye | |
Session | Room | Chair | |
TuAM1-5 (SS01: Reconfigurable Computing and Performance Evaluation) | Board Room 3 | Ukrit Mankong | |
Date | Time | Title | Authors |
8 November 2022 | 10.35-10.55 | Design and System Implementation of a Configurable Optical Interconnection Network | Bowen Yang; Junyong Deng; Jiaying Luo; Yu Feng |
10.55-11.15 | 2S-AGCN Human Behavior Recognition Based on New Partition Strategy | Jin Wu; Lei Wang; Gege Chong; Haoran Feng | |
11.15-11.35 | Design of Optimal FIR Digital Filter by Swarm Optimization Technique | Jin Wu; Yaqiong Gao; Ling Yang; Zhengdong Su | |
11.35-11.55 | Design and Implementation of Reconfigurable Array Structure for Convolutional Neural Network Supporting Data Reuse | Rui Shan; Ziqing Huo; Xiaoshuo Li; Huan Chang; Rui Qin | |
11.55-12.15 | DBR: A Depth-Branch-Resorting Algorithm for Locality Exploration in Graph Processing | Lin Jiang; Ru Feng; Junjie Wang; Junyong Deng | |
12.15-12.35 | Performance Evaluation of Popularity-Aware Dynamic Clustering Scheme for Distributed Caching in ICN | Mikiya Yoshida; Yusuke Ito; Yurino Sato; Hiroyuki Koga | |
Session | Room | Chair | |
TuAM1-6 (SS03: Security Techniques of Speaker Recognition) | Chiang Mai 4 | Xiao-Lei Zhang | |
Date | Time | Title | Authors |
8 November 2022 | 10.35-10.55 | Masking Speech Feature to Detect Adversarial Examples for Speaker Verification | Xing Chen; Jiadi Yao; Xiao-Lei Zhang |
10.55-11.15 | F0 Modification via PV-TSM Algorithm for Speaker Anonymization Across Gender | Candy Olivia Mawalim; Shogo Okada; Masashi Unoki | |
11.15-11.35 | Pay Attention to Hard Trials | Lantian Li; Di Wang; Dong Wang | |
11.35-11.55 | A Multi-Task Framework of Speaker Recognition With TTS Data Augmentation | Xingjia Xie; Yiming Zhi; Beibei Ouyang; Qingyang Hong; Lin Li | |
11.55-12.15 | Source Tracing: Detecting Voice Spoofing | Tinglong Zhu; Xingming Wang; Xiaoyi Qin; Ming Li | |
12.15-12.35 | Replay Attack Detection Based on Voice and Non-Voice Sections for Speaker Verification | Ananda Garin Mills; Patthranit Kaewcharuay; Pannathorn Sathirasattayanon; Suradej Duangpummet; Kasorn Galajit; Jessada Karnjana; Pakinee Aimmanee | |
Session | Room | Chair | |
TuAM1-7 (Speech, Language, and Audio 2) | Chiang Mai 5 | Natthanan Promsuk | |
Date | Time | Title | Authors |
8 November 2022 | 10.35-10.55 | Learning Emotion Information for Expressive Speech Synthesis Using Multi-Resolution Modulation-Filtered Cochleagram | Kaili Zhang; Masashi Unoki |
10.55-11.15 | VocEmb4SVS: Improving Singing Voice Separation With Vocal Embeddings | Chenyi Li; Yi Li; Xuhao Du; Yaolong Ju; Shichao Hu; Zhiyong Wu | |
11.15-11.35 | Dialect-Aware Semi-Supervised Learning for End-To-End Multi-Dialect Speech Recognition | Sayaka Shiota; Ryo Imaizumi; Ryo Masumura; Hitoshi Kiya | |
11.35-11.55 | Design and Construction of Japanese Multimodal Utterance Corpus With Improved Emotion Balance and Naturalness | Daisuke Horii; Akinori Ito; Takashi Nose | |
11.55-12.15 | Non-Parallel Voice Conversion Based on Free-Energy Minimization of Speaker-Conditional Restricted Boltzmann Machine | Takuya Kishida; Toru Nakashika | |
12.15-12.35 | The TNT Team System Descriptions of Cantonese, Mongolian and Kazakh for IARPA OpenASR21 Challenge | Kai Tang; Jing Zhao; Jinghao Yan; Jian Kang; Haoyu Wang; Jinpeng Li; Shuzhou Chai; Guan-Bo Wang; Shen Huang; Guoguo Chen; Pengfei Hu; Wei-Qiang Zhang | |
Session | Room | Chair | |
TuAM1-8 (SS10: Real-world sensing technologies of human function) | Board Room 4 | Yumie Ono/Toshihisa Tanaka | |
Date | Time | Title | Authors |
8 November 2022 | 10.35-10.55 | Evaluation of Cognitive Test Results Using Concentration Estimation From Facial Videos | Terumi Umematsu; Masanori Tsujikawa; Hideyuki Sawada |
10.55-11.15 | Clustering of Advertising Images Using Electroencephalogram | Ingon Chanpornpakdi; Motoi Noda; Toshihisa Tanaka; Yuval Harpaz; Amir B. Geva | |
11.15-11.35 | Evaluation of Influence of Positions and Numbers of EEG Electrodes on Quantification of Independent Component Matrix | Ingon Chanpornpakdi; Ryohei Mizuochi; Maro G Machizawa | |
11.35-11.55 | Wearable Microfluidic Biosensor for Real-Time Sweat Content Monitoring | Hiroyuki Kudo; Yuto Goto | |
11.55-12.15 | Ear-EEG Based Eye State Classification Using Convolutional Neural Network | Chang-Hee Han; Han-Jeong Hwang | |
12.15-12.35 | Development of Virtual-Reality-Based Exergame for Lower-Extremity Rehabilitation of Stroke Patients | Mamiko Sasakawa; Daigo Ito; Ryo Ogura; Takanori Tominaga; Yumie Ono | |
Session | Room | Chair | |
TuPM1-1 (Speech, Language, and Audio 1) | Chiang Mai 1 | Rohan Kumar Das | |
Date | Time | Title | Authors |
8 November 2022 | 15.20-15.40 | Is Your Baby Fine at Home? Baby Cry Sound Detection in Domestic Environments | Tanmay Khandelwal; Rohan Kumar Das; Eng-Siong Chng |
15.40-16.00 | Acoustic Echo and Noise Canceller Using Shared-Error Normalized Least Mean Square Algorithm | Kenta Iwai; Takanobu Nishiura | |
16.00-16.20 | Subband-Based Spectrogram Fusion for Speech Enhancement by Combining Mapping and Masking Approaches | Hao Shi; Longbiao Wang; Sheng Li; Jianwu Dang; Tatsuya Kawahara | |
16.20-16.40 | Neural Virtual Microphone Estimator: Application to Multi-Talker Reverberant Mixtures | Hanako Segawa; Tsubasa Ochiai; Marc Delcroix; Tomohiro Nakatani; Rintaro Ikeshita; Shoko Araki; Takeshi Yamada; Shoji Makino | |
16.40-17.00 | SE-Mixer: Towards an Efficient Attention-Free Neural Network for Speech Enhancement | Kai Wang; Bengbeng He; Wei-Ping Zhu | |
17.00-17.20 | How Should We Evaluate Synthesized Environmental Sounds | Yuki Okamoto; Keisuke Imoto; Shinnosuke Takamichi; Takahiro Fukumori; Yoichi Yamashita | |
17.20-17.40 | FeatureCut: An Adaptive Data Augmentation for Automated Audio Captioning | Zhongjie Ye; Yuqing Wang; Helin Wang; Dongchao Yang; Yuexian Zou | |
Session | Room | Chair | |
TuPM1-2 (Signal Processing Systems: Design and Implementation) | Chiang Mai 2 | Kasemsit Teeyapan | |
Date | Time | Title | Authors |
8 November 2022 | 15.20-15.40 | Robust Steerable Differential Beamformer for Concentric Circular Array With Directional Microphones | Weilong Huang; Jinwei Feng |
15.40-16.00 | A Deep Proximal-Unfolding Method for Monaural Speech Dereverberation | Meihuang Wang; Minmin Yuan; Andong Li; Chengshi Zheng; Xiaodong Li | |
16.00-16.20 | Speech Enhancement Using Self-Supervised Pre-Trained Model and Vector Quantization | Xiao-Ying Zhao; Qiu-Shi Zhu; Jie Zhang | |
16.20-16.40 | HouseX: A Fine-Grained House Music Dataset and Its Potential in the Music Industry | Xinyu Li | |
16.40-17.00 | Interpretable Control for Emotional Text-To-Speech System Toward Development of Sympathetic Educational-Support Robots | Jingyi Feng; Tomohiro Yoshikawa; Tomoki Toda | |
17.00-17.20 | Direction-Aware Target Speaker Extraction With a Dual-Channel System Based on Conditional Variational Autoencoders Under Underdetermined Conditions | Rui Wang; Li Li; Tomoki Toda | |
17.20-17.40 | LCN: Label Correction Based on Network Prediction for Cross-Modal Retrieval With Noisy Labels | Daiki Okamura; Ryosuke Harakawa; Masahiro Iwahashi | |
Session | Room | Chair | |
TuPM1-3 (Signal Image and Information Processing Theory and Methods) | Chiang Mai 3 | Tatsuya Yokota | |
Date | Time | Title | Authors |
8 November 2022 | 15.20-15.40 | Using Self-Learning Representations for Objective Assessment of Patient Voice in Dysphonia | Shaoxiang Dang; Tetsuya Matsumoto; Yoshinori Takeuchi; Hiroaki Kudo; Takashi Tsuboi; Yasuhiro Tanaka; Masahisa Katsuno |
15.40-16.00 | Fast Signal Completion Algorithm With Cyclic Convolutional Smoothing | Hiromu Takayama; Tatsuya Yokota | |
16.00-16.20 | Single-Channel Speech Enhancement Student Under Multi-Channel Speech Enhancement Teacher | Yuzhu Zhang; Hui Zhang; Xueliang Zhang | |
16.20-16.40 | Distance-Based Dynamic Weight: A Novel Framework for Multi-Source Information Fusion | Cuiping Cheng; Xiaoning Zhang; Taihao Li | |
16.40-17.00 | Improvement of the Direction-Of-Arrival Estimation Method Using a Single Channel Microphone by Correcting a Spectral Slope of Speech | Masaki Ikeuchi; Hiroki Tanji; Takahiro Murakami | |
17.00-17.20 | Studying Human-Based Speaker Diarization and Comparing to State-Of-The-Art Systems | Simon W. McKnight; Aidan O. T. Hogg; Vincent W. Neo; Patrick A. Naylor | |
17.20-17.40 | Optimization of CU Partition Based on Texture Degree in H.266/VVC | Jingyuan Tang; Songlin Sun | |
Session | Room | Chair | |
TuPM1-4 (SS02: Deep Learning Systems and Applications for Cloud, Fog, and Edge) | Board Room 2 | Jia-Ching Wang | |
Date | Time | Title | Authors |
8 November 2022 | 15.20-15.40 | Selection of Supplementary Acoustic Data for Meta-Learning in Under-Resourced Speech Recognition | I-Ting Hsieh; Chung-Hsien Wu; Zhe-Hong Zhao |
15.40-16.00 | Using Prosodic Phrase-Based VQVAE on Audio ALBERT for Speech Emotion Recognition | Jia-Hao Hsu; Chung-Hsien Wu; Tsung-Hsien Yang | |
16.00-16.20 | ESPnet-ONNX: Bridging a Gap Between Research and Production | Masao Someki; Yosuke Higuchi; Tomoki Hayashi; Shinji Watanabe | |
16.20-16.40 | Multi-Loss Function in Robust Convolutional Autoencoder for Reconstruction Low-Quality Fingerprint Image | Farchan Hakim Raswa; Franki Halberd; Agus Harjoko; Wahyono; Chung-Ting Lee; Yung-Hui Li; Jia-Ching Wang | |
Session | Room | Chair | |
TuPM1-5 (Research Review) | Board Room 3 | Jesin James | |
Date | Time | Title | Authors |
8 November 2022 | 15.20-15.40 | EmotionGUI: Visualisation and Annotation of Emotions in a 2D Space for Multi-Modal Signals | Jesin James; Felix Marattukalam; Owen Eng; Aron Jeremiah |
15.40-16.00 | Enhancing the Performance of Automatic Speech Recognition With Optical Microphone Technology Through Data Augmentation Approach: A Pilot Study | Ruei-Ci Shen; Ji-Yan Han; Ying-Hui Lai | |
16.00-16.20 | Process Monitoring Based on Nearest Correlation and Variational Graph Auto-Encoder and Its Application to Tennessee Eastman Process | Yoshiaki Uchida; Koichi Fujiwara | |
16.20-16.40 | Decoding of Individual Emotions Induced During Interaction With Voice-User Interface Using Electroencephalography | Jun-Seok Lee; Ga-Young Choi; Ji-Yoon Lee; Jong-Gyu Shin; Sang-Ho Kim; Han-Jeong Hwang | |
16.40-17.00 | Leverage Limited Features of Partial Fingerprint Recognition Using Improved Siamese Network With Self-Spatial Attention | Farchan Hakim Raswa; Franki Halberd; Agus Harjoko; Chung-Ting Lee; Yung-Hui Li; Pao-Chi Chang; Jia-Ching Wang | |
17.00-17.20 | Design and Signal Analysis of a Compact Antenna for UWB MIMO Systems | Long Jin; Yangmiao Lin; Iickho Song; Ruohan Zhang | |
17.20-17.40 | A Filtered-x Active Noise Control Algorithm Robust to Impulsive Noise Using Novel Subband Adaptive Filter Algorithm | Chan Park; Minho Lee; PooGyeon Park | |
Session | Room | Chair | |
TuPM1-6 (Speech, Language, and Audio 2) | Chiang Mai 4 | Christian H Ritz | |
Date | Time | Title | Authors |
8 November 2022 | 15.20-15.40 | Neural Conversational Speech Synthesis With Flexible Control of Emotion Dimensions | Hiroki Mori; Hironao Nishino |
15.40-16.00 | Temporal Feedback Convolutional Recurrent Neural Networks for Speech Command Recognition | Taejun Kim; Juhan Nam | |
16.20-16.40 | Impact of Compression on the Performance of the Room Impulse Response Interpolation Approach to Spatial Audio Synthesis | Hualin Ren; Christian Ritz; Jiahong Zhao; Daeyoung Jang | |
16.40-17.00 | Machine Anomalous Sound Detection Based on Self-Supervised Classification | Shuxian Wang; Jun Du; Yajian Wang | |
17.00-17.20 | A Study on Low-Latency Recognition-Synthesis-Based Any-To-One Voice Conversion | Yi-Yang Ding; Li-Juan Liu; Yu Hu; Zhen-Hua Ling | |
17.20-17.40 | Speech Enhancement With Perceptually-Motivated Optimization and Dual Transformations | Xucheng Wan; Kai Liu; Ziqing Du; Huan Zhou | |
Session | Room | Chair | |
TuPM1-7 (SS12: Advanced signal detection and inspection technology) | Chiang Mai 5 | Settha Tangkawanit | |
Date | Time | Title | Authors |
8 November 2022 | 15.20-15.40 | Automatic Sound Detection and Notification System Using MFCC | Jaruwat Patmanee; Prapatson Kotipang; Pawarisorn Sinpeang; Surachet Kanprachar; Settha Tangkawanit |
15.40-16.00 | Sound Identification Using MFCC With Machine Learning | Pattarapong Kammee; Chairat Pinthong; Surachet Kanprachar; Settha Tangkawanit | |
16.20-16.40 | Direct-Lattice Adaptive Notch Filter for Frequency Estimation and Tracking | Prayuth Inban; Rachu Punchalard; Chawalit Benjangkaprasert | |
16.40-17.00 | Distance Estimation Between Camera and Vehicles From an Image Using YOLO and Machine Learning | Rattapoom Waranusast; Panomkhawn Riyamongkol; Pattanawadee Pattanathaburt | |
17.00-17.20 | OCR Application for Cancer Care | Settha Tangkawanit; Jiraporn Pooksook; Jirarat Ieamsaard; Panupong Sornkhom | |
17.20-17.40 | The Development of Mobile Application for Assisting COVID-19 Antigen Test Kit Results Reading | Rattapoom Waranusast; Pattanawadee Pattanathaburt | |
17.40-18.00 | Matched Filter Detector for Textile Fiber Classification of Signals With Near-Infrared Spectrum | Suchart Yammen; Wachira Limsripraphan | |
Session | Room | Chair | |
WedAM1-1 (SS11: Transfer Learning for Real World) | Chiang Mai 1 | Xiaoxu Li/Dome Potikanond | |
Date | Time | Title | Authors |
9 November 2022 | 9.00-9.20 | Semantics-Guided Knowledge Integration for Domain Adaptation Few-Shot Relation Extraction | Zeyuan Wang; Yifan Du; Guangwei Zhang; Ruifan Li; Yongping Xiong; Chuang Zhang |
9.20-9.40 | PVGCRA: Prediction Variance Guided Cross Region Domain Adaptation | Ran Xu; Yixiang Huang; Chuang Zhang | |
9.40-10.00 | Multi-Branch Network for Few-Shot Learning | Kai Ren; Zijie Guo; Zhimin Zhang; Rui Zhu; Xiaoxu Li | |
10.00-10.20 | Few-Shot Classification With Feature Reconstruction Bias | Zhen Li; Lang Wang; Shuo Ding; Xiaochen Yang; Xiaoxu Li | |
10.20-10.40 | Dual Prototypical Network for Robust Few-Shot Image Classification | Qi Song; Zebin Peng; Luchen Ji; Xiaochen Yang; Xiaoxu Li | |
10.40-11.00 | Graph Evolving and Embedding in Transformer | Jen-Tzung Chien; Chia-Wei Tsao | |
Session | Room | Chair | |
WedAM1-2 (Speech, Language, and Audio 1) | Chiang Mai 2 | Xiaofen Xing | |
Date | Time | Title | Authors |
9 November 2022 | 9.00-9.20 | Punctuation Restoration for Singaporean Spoken Languages: English, Malay, and Mandarin | Abhinav Rao; Ho Thi-Nga; Chng Eng Siong |
9.20-9.40 | C-CycleTransGAN: A Non-Parallel Controllable Cross-Gender Voice Conversion Model With CycleGAN and Transformer | Changzeng Fu; Chaoran Liu; Carlos Toshinori Ishi; Hiroshi Ishiguro | |
9.40-10.00 | The Realization and Perception of Narrow Focus in English Sentences by Cantonese EFL Learners | Chong Cao; Aijun Li | |
10.00-10.20 | Cross-Lingual Dysarthria Severity Classification for English, Korean, and Tamil | Eun Jung Yeo; Kwanghee Choi; Sunhee Kim; Minhwa Chung | |
10.20-10.40 | 3M: An Effective Multi-View, Multi-Granularity, and Multi-Aspect Modeling Approach to English Pronunciation Assessment | Fu-An Chao; Tien-Hong Lo; Tzu-I Wu; Yao-Ting Sung; Berlin Chen | |
10.40-11.00 | I Feel Stressed Out: A Mandarin Speech Stress Dataset With New Paradigm | Shuaiqi Chen; Xiaofen Xing; Guodong Liang; Xiangmin Xu | |
Session | Room | Chair | |
WedAM1-3 (Deep Learning: Algorithm, Implementations, and Applications) | Chiang Mai 3 | Hiroyoshi Ito | |
Date | Time | Title | Authors |
9 November 2022 | 9.00-9.20 | End-To-End Reinforcement Learning of Robotic Manipulation With Robust Keypoints Representation | Tianying Wang; En Yen Puang; Marcus Lee; Wei Jing; Yan Wu |
9.20-9.40 | BEAM - an Algorithm for Detecting Phishing Link | Sea Ran Cleon Liew; Ngai Fong Law | |
9.40-10.00 | I2CR: Improving Noise Robustness on Keyword Spotting Using Inter-Intra Contrastive Regularization | Dianwen Ng; Jia Qi Yip; Tanmay Surana; Zhao Yang; Chong Zhang; Yukun Ma; Chongjia Ni; Eng Siong Chng; Bin Ma | |
10.00-10.20 | Human-In-The-Loop Chord Progression Generator With Generative Adversarial Network | Yoshiteru Matsumoto; Hiroyoshi Ito; Hiroko Terasawa; Yuya Yamamoto; Yuzuru Hiraga; Masaki Matsubara | |
10.20-10.40 | A Resource-Limited FPGA-Based MobileNetV3 Accelerator | Yutana Jewajinda; Thanapol Thongkum | |
10.40-11.00 | CG-Net: A Compound Gaussian Prior Based Unrolled Imaging Network | Carter A Lyons; Raghu G. Raj; Margaret Cheney | |
Session | Room | Chair | |
WedAM1-4 (Signal Image and Information Processing Theory and Methods) | Board Room 2 | Mingyi He | |
Date | Time | Title | Authors |
9 November 2022 | 9.00-9.20 | A Policy-Based Approach to the SpecAugment Method for Low Resource E2E ASR | Rui Li; Guodong Ma; Dexin Zhao; Ranran Zeng; Xiaoyu Li; Hao Huang |
9.20-9.40 | Manifold Rewiring for Unlabeled Imaging | Valentin Debarnot; Vinith Kishore; Cheng Shi; Ivan Dokmanic | |
9.40-10.00 | CRDet: An Object-Context-Aware Detection Network for Oriented Object in Aerial Images | Lele Liang; Linghan Li; Qi Liu; Yuchao Dai; Mingyi He | |
10.00-10.20 | Effects of Incorporating a Deep-Unfolding Framework Into a Deep Neural Network: Implications for Image Restoration | Tatsuki Itasaka; Masahiro Okuda | |
10.20-10.40 | Cross-Modal Knowledge Distillation With Dropout-Based Confidence | Won Ik Cho; Jeunghun Kim; Nam Soo Kim | |
10.40-11.00 | A Multi-Objective Perceptual Aware Loss Function for End-To-End Target Speaker Separation | Zhan Jin; Bang Zeng; Fan Zhang | |
Session | Room | Chair | |
WedAM1-5 (Research Review) | Board Room 3 | Ying-Hui Lai | |
Date | Time | Title | Authors |
9 November 2022 | 9.00-9.20 | EEG-Based Anomaly Detection Model by One-Class Support Vector Machine for Dream Enactment Behavior in REM Sleep Behavior Disorder | Shumpei Date; Koichi Fujiwara; Yukiyoshi Sumi; Hiroshi Kadotani; Makoto Imai; Keiko Ogawa |
9.20-9.40 | Development of Heat Stroke Detection Model Based on Heart Rate Variability Using LSTM-AutoEncoder | Shota Saeda; Koshi Ota; Koichi Fujiwara; Takatomi Kubo; Toshitaka Yamakawa; Aozora Yamamoto; Yuki Maruno; Manabu Kano | |
9.40-10.00 | Driving Fitness Evaluation Model for Patients With Schizophrenia Based on Driving Data of Healthy Participants and Random Forest | Shuji Tsunoda; Koichi Fujiwara; Seiko Miyata; Akiko Yamaguchi; Shogo Kitagawa; Yuki Konishi; Reiji Yoshimura; Isao Taguchi; Yutaka Sawa; Kunihiro Iwamoto; Norio Ozaki | |
10.00-10.20 | Method for Estimating Test Contrast Peak Time in Computed Tomography Angiography | Toshihide Otsuki; Kazuto Sakamoto; Homare Saisho; Hiroyoshi Yokoi; Toshitaka Yamakawa | |
10.20-10.40 | Development of an Epileptic Seizure Prediction Algorithm Based on R-R Intervals With Temporal Convolutional Networks | Rikumo Ode; Koichi Fujiwara; Miho Miyajima; Toshitaka Yamakawa; Manabu Kano; Taketoshi Maehara | |
Session | Room | Chair | |
WedAM1-6 (SS17: Emerging Diseases and Smart Image Processing) | Chiang Mai 4 | Krisana Chinnasarn | |
Date | Time | Title | Authors |
9 November 2022 | 9.00-9.20 | Pre-Processing SARS-CoV-2 Sequence Data for Application of Machine Learning Techniques for Visualization and Clustering of Virus Characteristics | Juhyeon Kim; Insung Ahn |
9.20-9.40 | Educational Multi-Purpose Kit for Coding and Robotic Design | Atikhun Thongpool; Daranee Hormdee; Raksit Chutipakdeevong; Wasan Tansakul | |
9.40-10.00 | Forecasting Dengue Fever in France and Thailand Using XGBoost | Thanin Methiyothin; Insung Ahn | |
10.00-10.20 | Fine-Tuning BERT for Question and Answering Using PubMed Abstract Dataset | Saeyeon Cheon; Insung Ahn | |
10.20-10.40 | Coarse X-Ray Lumbar Vertebrae Pose Localization Using Triangulation Correspondence | Watcharaphong Yookwan; Jiranun Sangrueng; Krisana Chinnasarn | |
10.40-11.00 | 4G Signal RSSI Recommendation System for ISP Quality of Service Improvement | Tanatpon Duangta; Watcharaphong Yookwan; Krisana Chinnasarn; Anuparp Boonsongsrikul | |
Session | Room | Chair | |
WedAM1-7 (Speech, Language, and Audio 2) | Chiang Mai 5 | Wei-Ping Zhu | |
Date | Time | Title | Authors |
9 November 2022 | 9.00-9.20 | SE-DPTUNet: Dual-Path Transformer Based U-Net for Speech Enhancement | Bengbeng He; Kai Wang; Wei-Ping Zhu |
9.20-9.40 | Encoder Re-Training With Mixture Signals on FastMVAE Method | Shuhei Yamaji; Taishi Nakashima; Nobutaka Ono; Li Li; Hirokazu Kameoka | |
9.40-10.00 | Unsupervised Disentanglement of Timbral, Pitch, and Variation Features From Musical Instrument Sounds With Random Perturbation | Keitaro Tanaka; Yoshiaki Bando; Kazuyoshi Yoshii; Shigeo Morishima | |
10.00-10.20 | Estimation of Transfer Coefficients and Signals of Sound-To-Light Conversion Device Blinky Under Saturation | Kosuke Nishida; Natsuki Ueno; Yuma Kinoshita; Nobutaka Ono | |
10.20-10.40 | Design and Evaluation of Instrument Sound Identification Difficulty for the Deaf and Hard-Of-Hearing | Shiho Akaki; Rumi Hiraga; Keiichi Yasu; Keiji Tabuchi; Hiroko Terasawa | |
10.40-11.00 | Correcting, Rescoring and Matching: An N-Best List Selection Framework for Speech Recognition | Chin-Hung Kuo; Kuan-Yu Chen | |
Session | Room | Chair | |
WedAM1-8 (SS04: Advanced Signal Processing and Machine Learning for Audio and Speech Applications) | Board Room 4 | Shoji Makino | |
Date | Time | Title | Authors |
9 November 2022 | 9.00-9.20 | Hyperbolic Timbre Embedding for Musical Instrument Sound Synthesis Based on Variational Autoencoders | Futa Nakashima; Tomohiko Nakamura; Norihiro Takamune; Satoru Fukayama; Hiroshi Saruwatari |
9.20-9.40 | Multi-Task Adversarial Training Algorithm for Multi-Speaker Neural Text-To-Speech | Yusuke Nakai; Yuki Saito; Kenta Udagawa; Hiroshi Saruwatari | |
9.40-10.00 | Inverse-Free Online Independent Vector Analysis With Flexible Iterative Source Steering | Taishi Nakashima; Nobutaka Ono | |
10.00-10.20 | Accelerating Online Algorithm Using Geometrically Constrained Independent Vector Analysis With Iterative Source Steering | Kana Goto; Tetsuya Ueda; Li Li; Takeshi Yamada; Shoji Makino | |
10.20-10.40 | A Dilated Inception Convolutional Neural Network for Gridless DOA Estimation Under Low SNR Scenarios | Zhi-Wei Tan; Yuan Liu; Andy W. H. Khong | |
10.40-11.00 | Efficient Low-Latency Convolution With Uniform Filter Partition and Its Evaluation on Real-Time Blind Source Separation | Yui Kuriki; Taishi Nakashima; Kouei Yamaoka; Natsuki Ueno; Yukoh Wakabayashi; Nobutaka Ono; Ryo Sato | |
Session | Room | Chair | |
WedPM1-1 (SS05: Advanced Image and Video Processing using Deep Learning) | Chiang Mai 1 | Chul Lee | |
Date | Time | Title | Authors |
9 November 2022 | 14.00-14.20 | Object Segmentation Using Parametric Representation | Hochang Rhee; Hyung Il Koo; Nam Ik Cho |
14.20-14.40 | Deep Color Constancy Using Multi-Band NIR | Jeong-Won Ha; Dong-keun Han; Min-Je Park; Jong-Ok Kim | |
14.40-15.00 | Smooth Panoramic Walkthrough for Adjacent Panoramic Viewpoints With Dense Spherical Matching Points | Kyungjune Lee; Mingyu Jang; Sanghoon Lee; Kim Taewan | |
15.00-15.20 | Region Adaptive Self-Attention for an Accurate Facial Emotion Recognition | Seongmin Lee; Jeonghaeng Lee; Minsik Kim; Sanghoon Lee | |
15.20-15.40 | Quality Enhancement of Screen Content Video Using Dual-Input CNN | Ziyin Huang; Yue Cao; Sik-Ho Tsang; Yui-Lam Chan; Kin-Man Lam | |
15.40-16.00 | Underwater Image Enhancement Using Realistic Dataset With Turbidity and Color Distortion | Eunpil Park; Eunsung Jo; Jae-Young Sim | |
Session | Room | Chair | |
WedPM1-2 (Speech, Language, and Audio 1) | Chiang Mai 2 | Ashish Panda | |
Date | Time | Title | Authors |
9 November 2022 | 14.00-14.20 | Neural Vocoder Feature Estimation for Dry Singing Voice Separation | Jaekwon Im; Soonbeom Choi; Sangeon Yong; Juhan Nam |
14.20-14.40 | Adapting GCC-PHAT to Co-Prime Circular Microphone Arrays for Speech Direction of Arrival Estimation Using Neural Networks | Jiahong Zhao; Christian Ritz | |
14.40-15.00 | A Novel Approach to Structured Pruning of Neural Network for Designing Compact Audio-Visual Wake Word Spotting System | Haotian Wang; Jun Du; Hengshun Zhou; Heng Lu; Yuhang Cao | |
15.00-15.20 | Hierarchic Temporal Convolutional Network With Attention Fusion for Target Speaker Extraction | Zihao Chen; Wenbo Qiu; Haitao Xu; Ying Hu | |
15.20-15.40 | Acoustic Model Adaption Using x-Vectors for Improved Automatic Speech Recognition | Meet Soni; Aditya Raikar; Ashish Panda; Sunil Kumar Kopparapu | |
15.40-16.00 | Acoustic Pornography Recognition Using Convolutional Neural Networks and Bag of Refinements | Lifeng Zhou; Kaifeng Wei; Yuke Li; Yiya Hao; Weiqiang Yang; Haoqi Zhu | |
Session | Room | Chair | |
WedPM1-3 (Deep Learning: Algorithm, Implementations, and Applications) | Chiang Mai 3 | Jen-Tzung Chien | |
Date | Time | Title | Authors |
9 November 2022 | 14.00-14.20 | An Optimal Vehicle Counting Framework for Non-Canonical CCTV Placements | Ng Chin Hooi; Edwin Tan Chee Pin; Chiew Yeong Shiong; Lim Mei Kuan |
14.20-14.40 | Response Sentence Modification Using a Sentence Vector for a Flexible Response Generation of Retrieval-Based Dialogue Systems | Ryota Yahagi; Akinori Ito; Takashi Nose; Yuya Chiba | |
14.40-15.00 | End-To-End Stereo Audio Coding Using Deep Neural Networks | Wootaek Lim; Inseon Jang; Seungkwon Beack; Jongmo Sung; Taejin Lee | |
15.00-15.20 | Neural Beamformer With Automatic Detection of Notable Sounds for Acoustic Scene Classification | Sota Ichikawa; Takeshi Yamada; Shoji Makino | |
15.20-15.40 | DNN-Based Frequency-Domain Permutation Solver for Multichannel Audio Source Separation | Fumiya Hasuike; Daichi Kitamura; Rui Watanabe | |
15.40-16.00 | Detection Method From 4K Images Using SSD300 Without Retraining | Kei Irie; Kiyoshi Nishikawa | |
Session | Room | Chair | |
WedPM1-4 (Signal Image and Information Processing Theory and Methods) | Board Room 2 | Zhang Ke | |
Date | Time | Title | Authors |
9 November 2022 | 14.00-14.20 | PAformer: Visually Indistinguishable Bolt Defect Recognition Based on Bolt Position and Attributes | Wenshuo Lou; Ke Zhang; Yangjie Xiao; Xiwang Guo; Jiacun Wang |
14.20-14.40 | Adapted Spectrogram Transformer for Unsupervised Cross-Domain Acoustic Anomaly Detection | Gilles Van De Vyver; Zhaoyi Liu; Koustabh Dolui; Danny Hughes; Sam Michiels | |
14.40-15.00 | A Two-Stage Cascading Method Based on Finetuning in Semi-Supervised Domain Adaptation Semantic Segmentation | Huiying Chang; Kaixin Chen; Ming Wu | |
15.00-15.20 | Landmark Management in the Application of Radar SLAM | Shuai Sun; Beth Jelfs; Kamran Ghorbani; Glenn I. Matthews; Chris Gilliam | |
15.20-15.40 | Parameterization of Dominant Spectral Peak Trajectory for Whisper Speech Recognition | Chang Feng; Xiaolong Wu; Mingxing Xu; Thomas Fang Zheng | |
15.40-16.00 | Specific Emitter Identification at Different Time Based on Multi-Domain Migration | Jiaxu Liu; Jianqing Li; Jiao Wang; Hao Huang | |
Session | Room | Chair | |
WedPM1-5 (Research Review) | Board Room 3 | Koichi Fujiwara | |
Date | Time | Title | Authors |
9 November 2022 | 14.00-14.20 | Long-Term Prognostic Prediction of West Syndrome Based on Scalp EEG Using Convolution Neural Network Autoencoder | Tatsuki Saito; Koichi Fujiwara; Jun Natsume; Ryosuke Suzui |
14.20-14.40 | Modification of RRI Data by NBEATS Model | Hongtao Chen; Koichi Fujiwara; Manabu Kano | |
14.40-15.00 | Transformer With Noise Divider | Mun-Hyung Lee; Seon-Woo Lee; Jung-Mu Choi; Jang-Woo Kwon | |
15.00-15.20 | Schizophrenia Classification Based on the Natural Language Processing Technology-A Pilot Study | Ying Hsuan Chen; Pei-Yun Lin; Tsung-Tse Ho; Yuh-Jer Chang; Ying-Hui Lai | |
15.20-15.40 | Signed Graph Balancing Based on Spectral Clustering | Haruki Yokota; Junya Hara; Yuichi Tanaka | |
15.40-16.00 | Graph Signal Sampling for Multiple Generator Functions | Junya Hara; Yuichi Tanaka | |
Session | Room | Chair | |
WedPM1-6 (Signal Processing for Audio and Speech Applications) | Chiang Mai 4 | Tomoyosi Akiba | |
Date | Time | Title | Authors |
9 November 2022 | 14.00-14.20 | Semi-Supervised ASR Based on Iterative Joint Training With Discrete Speech Synthesis | Keiya Takagi; Tomoyosi Akiba; Hajime Tsukada |
14.20-14.40 | Analysis of Amplitude and Frequency Perturbation in the Voice for Fake Audio Detection | Kai Li; Yao Wang; Minh Le Nguyen; Masato Akagi; Masashi Unoki | |
14.40-15.00 | Deep Hashing for Speaker Identification and Retrieval Based on Auditory Sparse Representation | Dung Kim Tran; Masato Akagi; Masashi Unoki | |
15.00-15.20 | Divide and Conquer: A Low-Complexity Neural Network for Monophonic Speech Enhancement | Bingxiao Fang; Liang Liu | |
15.20-15.40 | Domain Adaptation and Language Conditioning to Improve Phonetic Posteriorgram Based Cross-Lingual Voice Conversion | Pin-Chieh Hsu; Nobuaki Minematsu; Daisuke Saito | |
15.40-16.00 | Von Mises Mixture Model-Based DNN for Sign Indetermination Problem in Phase Reconstruction | Nguyen Binh Thien; Yukoh Wakabayashi; Geng Yuting; Kenta Iwai; Takanobu Nishiura | |
Session | Room | Chair | |
WedPM1-7 (Speech, Language, and Audio 2) | Chiang Mai 5 | Daranee Hormdee | |
Date | Time | Title | Authors |
9 November 2022 | 14.00-14.20 | Speaker Representation Learning via Contrastive Loss With Maximal Speaker Separability | Zhe Li; Man Wai Mak |
14.20-14.40 | Design of Discriminators in GAN-Based Unsupervised Learning of Neural Post-Processors for Suppressing Localized Spectral Distortion | Riku Ogino; Kohei Saijo; Tetsuji Ogawa | |
14.40-15.00 | Simultaneous Frequency Estimation for Three or More Sinusoids Based on Sinusoidal Constraint Differential Equation | Kenta Yamada; Yoshiki Masuyama; Yukoh Wakabayashi; Nobutaka Ono | |
15.00-15.20 | Do You Know How Humans Sound? Exploring a Qualification Test Design for Crowdsourced Evaluation of Voice Synthesis Quality | Moe Yaegashi; Susumu Saito; Teppei Nakano; Tetsuji Ogawa | |
15.20-15.40 | Exploring the Gender Difference on Mandarin Tone Realization in Lombard Speech | Weizhong Zhang; Jian Gong; Kai Sheng; Yuhong Sun; William Bellamy; Xiaoli Ji | |
Session | Room | Chair | |
WedPM1-8 (Data Analytics and Machine Learning) | Board Room 4 | Chern Hong Lim | |
Date | Time | Title | Authors |
9 November 2022 | 14.00-14.20 | Improving Co-SVD for Cold-Start Recommendations Using Sparsity Reduction | Low Jia Ming; Chern Hong Lim; Ian K. T. Tan |
14.20-14.40 | Epoch-Wise Double Descent Triggered by Learning a Single Sample | Aoshi Kawaguchi; Hiroshi Kera; Toshihiko Yamasaki | |
14.40-15.00 | Current Source Localization Using Deep Prior With Depth Weighting | Hajime Yano; Rio Yamana; Ryoichi Takashima; Tetsuya Takiguchi; Seiji Nakagawa | |
15.00-15.20 | A Proposal for Emotion-Expressive Editor: EmoEditor by Font Changing | Yuki Shimamura; Michiharu Niimi | |
15.20-15.40 | Traceback Memory Reduction for Three-Sequence Alignment Algorithm With Affine Gap Models | Rui-Ting Chien; Mao-Jan Lin; Yang-Ming Yeh; Yi-Chang Lu | |
15.40-16.00 | Acceleration of Subspace Learning Machine via Particle Swarm Optimization and Parallel Processing | Hongyu Fu; Yijing Yang; Yuhuai Liu; Joseph Lin; Ethan Harrison; Vinod K. Mishra; C.-C. Jay Kuo | |
Session | Room | Chair | |
WedPM2-1 (SS05: Advanced Image and Video Processing using Deep Learning) | Chiang Mai 1 | Chul Lee | |
Date | Time | Title | Authors |
9 November 2022 | 16.20-16.40 | Enhanced Bidirectional Motion Estimation Using Feature Refinement for HDR Imaging | An Gia Vien; Truong Thanh Nhat Mai; Seonghyun Park; Gahyeon Kim; Chul Lee |
16.40-17.00 | Fast Asymmetric Bilateral Motion Estimation for Video Frame Interpolation | Jintae Kim; Junheum Park; Chang-Su Kim | |
17.00-17.20 | Future Object Localization in Autonomous Driving Using Ego-Centric Images and Motions | Seoyoung Jo; Jung-Kyung Lee; Je-won Kang | |
17.20-17.40 | Restoration of High-Frequency Components in Under Display Camera Images | Youngjin Oh; Gu Yong Park; Nam Ik Cho | |
17.40-18.00 | Non-Intrusive Speech Intelligibility Estimation Using Deep Learning With Speech Enhancement and Convolutional Layers | Kazushi Nakazawa; Kazuhiro Kondo | |
18.00-18.20 | Unified Angle Adjustment Network for Image Composition Enhancement | Jinwon Ko; Nyeong-Ho Shin; Seonho Lee; Chang-Su Kim | |
Session | Room | Chair | |
WedPM2-2 (Speech, Language, and Audio 1) | Chiang Mai 2 | Kasemsit Teeyapan | |
Date | Time | Title | Authors |
9 November 2022 | 16.20-16.40 | Automated Audio Captioning With Epochal Difficult Captions for Curriculum Learning | Andrew Koh; Soham Tiwari; Chng Eng Siong |
16.40-17.00 | Application of Deep Learning-Based Single-Channel Speech Enhancement for Frequency-Modulation Transmitted Speech | Ying Ma; Xueliang Zhang | |
17.00-17.20 | An Empirical Study of Training Mixture Generation Strategies on Speech Separation: Dynamic Mixing and Augmentation | Shukjae Choi; Younglo Lee; Jihwan Park; Hyung Yong Kim; Byeong-Yeol Kim; Zhong-Qiu Wang; Shinji Watanabe | |
17.20-17.40 | Speech Intelligibility Prediction for Hearing Aids Using an Auditory Model and Acoustic Parameters | Benita Angela Titalim; Candy Olivia Mawalim; Shogo Okada; Masashi Unoki | |
17.40-18.00 | Predicting Speech Fluency in Children Using Automatic Acoustic Features | Lionel Fontan; Shinyoung Kim; Verdiana De Fino; Sylvain Detey | |
18.00-18.20 | TC-SKNet With GridMask for Low-Complexity Classification of Acoustic Scene | Luyuan Xie; Yan Zhong; Lin Yang; Zhaoyu Yan; Zhonghai Wu; Junjie Wang | |
Session | Room | Chair | |
WedPM2-3 (Deep Learning: Algorithm, Implementations, and Applications) | Chiang Mai 3 | Masaomi Kimura | |
Date | Time | Title | Authors |
9 November 2022 | 16.20-16.40 | Design and Control of a Muscle-Skeleton Robot Elbow Based on Reinforcement Learning | Jianyin Fan; Haoran Xu; Yuwei Du; Jing Jin; Qiang Wang |
16.40-17.00 | Non-Autoregressive Speech Recognition With Error Correction Module | Yukun Qian; Xuyi Zhuang; Zehua Zhang; Lianyu Zhou; Xu Lin; Mingjiang Wan | |
17.00-17.20 | A Method for Adversarial Example Generation by Perturbing Selected Pixels | KAMEGAWA Tomoki; KIMURA Masaomi | |
17.20-17.40 | A Title Generation Method With Transformer for Journal Articles | MATSUMOTO Riku; KIMURA Masaomi | |
17.40-18.00 | Catastrophic Forgetting Avoidance Method for a Classification Model by Model Synthesis and Introduction of Background Data | HIRAYAMA Akari; KIMURA Masaomi | |
18.00-18.20 | Consistency Regularization for GAN-Based Neural Vocoders | Kotaro Onishi; Toru Nakashika | |
18.20-18.40 | Parallel Training of TN and ITN Models Through CycleGAN for Improved Sequence to Sequence Learning Performance | Md. Mizanur Rahaman Nayan; Mohammad Ariful Haque | |
Session | Room | Chair | |
WedPM2-4 (SS14: Emerging Signal Processing Technology for Medical Applications / Biomedical Signal Processing and Systems) | Board Room 2 | Yuttapong Jiraraksopakun | |
Date | Time | Title | Authors |
9 November 2022 | 16.20-16.40 | Laparoscope Manipulating Robot (LMR) Navigation Using Deep Learning-Based Surgical Instruments Detection | Nyi Nyi Myo; Apiwat Boonkong; Daranee Hormdee; Suphachoke Sonsilphong; Amornthep Sonsilphong; Kovit Khampitak |
16.40-17.00 | Human-Machine Interface Device Using Piezoelectric Sensors Based on Facial Muscle Movements for Wheelchair Control | Charoenporn Bouyam; Theerat Saichoo; Nannaphat Siribunyaphat; Yunyong Punsawad | |
17.00-17.20 | Obstructive Sleep Apnea Classification Using Snore Sounds Based on Deep Learning | Apichada Sillaparaya; Apichai Bhatranand; Chudanat Sudthongkhong; Kosin Chamnongthai; Yuttapong Jiraraksopakun | |
17.20-17.40 | Heart Rate Estimation of Car Driver Using Radar Sensors and Blind Source Separation | Keito Murata; Daichi Kitamura; Ryo Saito; Daichi Ueki | |
17.40-18.00 | Total Variation Algorithms for PAT Image Reconstruction | Mary Anjaley Josy John; Imad Barhumi | |
18.00-18.20 | Visual Function and Emotional Regulation in Achromatic Color and Chromatic Color Using Low Resolution Brain Electromagnetic Tomography Analysis (LORETA) | Watchara Sroykham; Yodchanan Wongsawat | |
18.20-18.40 | Effect of Electrooculography on Electroencephalography Classifying Accuracy in Deep Learning and Reducing Number of Channels in Motor-Imagery Brain-Computer Interface | Musashi Ino; Yoshihiro Kono; Nobuaki Kobayashi | |
Session | Room | Chair | |
WedPM2-5 (SS16: Emerging Techniques in Multimedia Data Analytics and Codings) | Board Room 3 | Patiwet Wuttisarnwattana / Kampol Woradit | |
Date | Time | Title | Authors |
9 November 2022 | 16.20-16.40 | Optimal Deep Multi-Route Self-Attention for Single Image Super-Resolution | Nisawan Ngambenjavichaikul; Sovann Chen; Supavadee Aramvith |
16.40-17.00 | Object Detection in Aerial Images With Attention-Based Regression Loss | Chandler Timm C. Doloriel; Rhandley D. Cajote | |
17.00-17.20 | Performance Analysis of JPEG XR With Deep Learning-Based Image Super-Resolution | Taingliv Min; Supavadee Aramvith | |
17.20-17.40 | MCSNet: Multi-Channel Sharing Network for Single Image Super-Resolution | Wazir Muhammad; Supavadee Aramvith; Watchara Ruangsang | |
17.40-18.00 | DCAN: Deep Consecutive Attention Network for Video Super Resolution | Talha Saleem; Sovann Chen; Supavadee Aramvith | |
18.00-18.20 | Wiener Filter-Based Color Attribute Quality Enhancement for Geometry-Based Point Cloud Compression | Jinrui Xing; Hui Yuan; Chen Chen; Wei Gao | |
18.20-18.40 | Mixed Context Techniques in the Adaptive Arithmetic Coding Process for DC Term and Lossless Image Encoding | Evan Shih; Jian-Jiun Ding | |
Session | Room | Chair | |
WedPM2-6 (Signal Processing for Audio and Speech Applications) | Chiang Mai 4 | Sunao Hara / Sutasinee Thovuttikul | |
Date | Time | Title | Authors |
9 November 2022 | 16.20-16.40 | Prediction Method of Soundscape Impressions Using Environmental Sounds and Aerial Photographs | Yusuke Ono; Sunao Hara; Masanobu Abe |
16.40-17.00 | Robust Speech Dereverberation Based on Adaptive Weighted Prediction Error Algorithm With Eigenvector Extraction | Yitong Chen; Wen Zhang | |
17.00-17.20 | Multi-Task Learning for Speech Emotion and Emotion Intensity Recognition | Pengcheng Yue; Leyuan Qu; Shukai Zheng; Taihao Li | |
17.20-17.40 | Karaoke Generation From Songs: Recent Trends and Opportunities | Preet Patel; Ansh Ray; Khushboo Thakkar; Kahan Sheth; Sapan H Mankad | |
17.40-18.00 | Multi-Branch Learning for Noisy and Reverberant Monaural Speech Separation | Chao Ma; Dongmei Li | |
18.00-18.20 | Significance of Quadrature and In-Phase Components for Synthetic Spoofed Speech Detection | Priyanka Gupta; Piyushkumar K. Chodingala; Hemant A. Patil | |
Session | Room | Chair | |
WedPM2-7 (SS20: High Performance Intelligent Technologies for Image and Video Applications) | Chiang Mai 5 | Jing-Ming Guo | |
Date | Time | Title | Authors |
9 November 2022 | 16.20-16.40 | Mammography Quality Evaluation and Model Interpretation Based on CNN-Based Inframammary Fold Classification | Yi-Chong Zeng; Yu-Cheng Wu; Chen-Yen Yeh; Shu-Chi Li; Tzu-Han Chou; Yi-Wen Huang; Giu-Cheng Hsu; Hsian-He Hsu |
16.40-17.00 | Hybrid Image Compression Framework Based on Single Image Training | Tien-Ying Kuo; Yu-Jen Wei; Kuan-Yu Su | |
17.00-17.20 | Highly Robust Action Retrieval Using View-Invariant Pose Feature and Simple Yet Effective Query Expansion Method | Noboru Yoshida; Jianquan Liu | |
17.20-17.40 | A Unified Compression and Watermarking Scheme for MT-BTC Images | Jing-Ming Guo; Sankarasrinivasan Seshathiri | |
17.40-18.00 | Fusion With Hierarchical Graphs for Multimodal Emotion Recognition | Shuyun Tang; Zhaojie Luo; Guoshun Nan; Jun Baba; Yuichiro Yoshikawa; Hiroshi Ishiguro | |
18.00-18.20 | Multi-Stage Superpixel-Based Segmentation Algorithm Using Fully Convolutional Networks and Discriminative Features | Pei-Chi Huang; Jian-Jiun Ding | |
18.20-18.40 | Deep Learning Acceleration Design Based on Low-Rank Approximation | Yi-Hsiang Chang; Gwo Giun (Chris) Lee; Shiu-Yu Chen | |
Session | Room | Chair | |
WedPM2-8 (Data Analytics and Machine Learning) | Board Room 4 | Wanus Srimaharaj | |
Date | Time | Title | Authors |
9 November 2022 | 16.20-16.40 | Internet of Behavior and Brain Response Identification for Cognitive Performance Analysis | Wanus Srimaharaj; Roungsan Chaisricharoen |
16.40-17.00 | Refinement of Utterance Fluency Feature Extraction and Automated Scoring of L2 Oral Fluency With Dialogic Features | Ryuki Matsuura; Shungo Suzuki; Mao Saeki; Tetsuji Ogawa; Yoichi Matsuyama | |
17.00-17.20 | A Vision Transformer-Based Approach to Bearing Fault Classification via Vibration Signals | Abid Hasan Zim; Aeyan Ashraf; Aquib Iqbal; Asad Malik; Minoru Kuribayashi | |
17.20-17.40 | Analysis Method for Motion Factors Related to Joint Contact Forces at the Knee During Walking Using Grad-CAM | Satoshi Suwa; Koh Inoue; Ryo Matsuoka | |
17.40-18.00 | A Dataset and a Lightweight Object Detection Network for Thermal Image-Based Home Surveillance | Zhengqiang Shao; Longbin Yan; Jie Chen; Jingdong Chen | |
18.00-18.20 | SCQ: Self-Supervised Cross-Modal Quantization for Unsupervised Large-Scale Retrieval | Fuga Nakamura; Ryosuke Harakawa; Masahiro Iwahashi | |
Session | Room | Chair | |
ThAM1-1 (Image Video Multimedia) | Chiang Mai 1 | Masaaki Ikehara | |
Date | Time | Title | Authors |
10 November 2022 | 9.00-9.20 | Single Image Raindrop Removal Using a Non-Local Operator and Feature Maps in the Frequency Domain | Shinya Ezumi; Masaaki Ikehara |
9.20-9.40 | Dual-Teacher Distillation for Low-Light Image Enhancement | Jeong-Hyeok Park; Tae-Hyeon Kim; Jong-Ok Kim | |
9.40-10.00 | Automatic Data Augmentation Method With Improved Interpretability for Image Classification in Computer Vision Applications | Dair Ungarbayev; Osman Demirel; Muhammad Tahir Akhtar | |
10.00-10.20 | Learning to Sharpen Partially Blurred Image via Iterative Blurred Region Mining and Recovery | Jung Yeh; Wen-Li Wei; Duan-Yu Chen; Jen-Chun Lin | |
10.20-10.40 | Shape-Bias Evaluation of Pretrained Models Using Image Decomposition | Akinori Iwata; Masahiro Okuda | |
10.40-11.00 | Proposal of Associative Watermarking Method | Ryoto Kanegae; Masaki Kawamura | |
Session | Room | Chair | |
ThAM1-2 (Speech, Language, and Audio 1) | Chiang Mai 2 | Ying Hu / Toshio Irino | |
Date | Time | Title | Authors |
10 November 2022 | 9.00-9.20 | DMF-Net: A Decoupling-Style Multi-Band Fusion Model for Full-Band Speech Enhancement | Guochen Yu; Yuansheng Guan; Weixin Meng; Chengshi Zheng; Hui Wang; Yutian Wang |
9.20-9.40 | Speak Like a Dog: Human to Non-Human Creature Voice Conversion | Kohei Suzuki; Shoki Sakamoto; Tadahiro Taniguchi; Hirokazu Kameoka | |
9.40-10.00 | Pre-Trained Multimodal End-To-End Network for Spoken Language Assessment Incorporating Prompts | Binghuai Lin; Liyuan Wang | |
10.00-10.20 | Gated Fusion of Handcrafted and Deep Features for Robust Automatic Pronunciation Assessment | Binghuai Lin; Liyuan Wang | |
10.20-10.40 | Effective Data Screening Technique for Crowdsourced Speech Intelligibility Experiments: Evaluation With IRM-Based Speech Enhancement | Ayako Yamamoto; Toshio Irino; Shoko Araki; Kenichi Arai; Atsunori Ogawa; Keisuke Kinoshita; Tomohiro Nakatani | |
Session | Room | Chair | |
ThAM1-3 (Deep Learning: Algorithm, Implementations, and Applications) | Chiang Mai 3 | Kasemsit Teeyapan | |
Date | Time | Title | Authors |
10 November 2022 | 9.00-9.20 | Leveraging Pre-Trained Acoustic Feature Extractor for Affective Vocal Bursts Tasks | Bagus Tris Atmaja; Akira Sasou |
9.20-9.40 | Flow-Based Variational Sequence Autoencoder | Jen-Tzung Chien; Tien-Ching Luo | |
9.40-10.00 | Speech Intelligibility Prediction Through Direct Estimation of Word Accuracy Using Conformer | Naoyuki Kamo; Kenichi Arai; Atsunori Ogawa; Shoko Araki; Tomohiro Nakatani; Keisuke Kinoshita; Marc Delcroix; Tsubasa Ochiai; Toshio Irino | |
10.00-10.20 | DNN-Rule Hybrid Dyna-Q for Sample-Efficient Task-Oriented Dialog Policy Learning | Mingxin Zhang; Takahiro Shinozaki | |
10.20-10.40 | MoCoVC: Non-Parallel Voice Conversion With Momentum Contrastive Representation Learning | Kotaro Onishi; Toru Nakashika | |
10.40-11.00 | Controllable Voice Conversion Based on Quantization of Voice Factor Scores | Takumi Isako; Kotaro Onishi; Takuya Kishida; Toru Nakashika | |
Session | Room | Chair | |
ThAM1-4 (Biomedical Signal Processing and Systems) | Board Room 2 | Daranee Hormdee | |
Date | Time | Title | Authors |
10 November 2022 | 9.00-9.20 | Deep Adaptive Denoising Auto-Encoder Networks for ECG Noise Cancelation via Time-Frequency Domain | Amir Mohammadisarab; Poorya Aghaomidi; Jalil Mazloum; Mohammad Ali Akbarzadeh; Mahdi Orooji; Nader Mokari; Halim Yanikomeroglu |
9.20-9.40 | User-Item Recommendation Approaches to Detect Genomic Variant Interactions | Emma Andrade; Nicholas Tom; Mario Banuelos | |
9.40-10.00 | Teager Energy Cepstral Coefficients for Classification of Dysarthric Speech Severity-Level | Aastha Kachhi; Anand Therattil; Ankur T. Patil; Hardik B. Sailor; Hemant A. Patil | |
10.00-10.20 | Decoding Emotional Valence from EEG in Immersive Virtual Reality | Guanxiong Pei; Bingjie Li; Taihao Li; Ruohao Xu; Jianmin Dong; Jia Jin | |
10.20-10.40 | Design of A Wearable System for Hypoxic Training Management Using Blood Oxygenation and Heart Rate | Takuma Kitagawa; Toshitaka Yamakawa | |
10.40-11.00 | MedBERT: A Pre-Trained Language Model for Biomedical Named Entity Recognition | Charangan Vasantharajan; Kyaw Zin Tun; Ho Thi-Nga; Sparsh Jain; Tong Rong; Chng Eng Siong | |
Session | Room | Chair | |
ThAM1-5 (SS21: Recent Advances and Applications in Encrypted Domain) | Board Room 3 | Simying Ong | |
Date | Time | Title | Authors |
10 November 2022 | 9.00-9.20 | Encrypted JPEG Image Retrieval via Huffman-Code Based Self-Attention Networks | Zhixun Lu; Qihua Feng; Peiya Li |
9.20-9.40 | Reversible Data Hiding in Encrypted Text Using Paillier Cryptosystem | Asad Malik; Aeyan Ashraf; Hanzhou Wu; Minoru Kuribayashi | |
9.40-10.00 | Scrambling-Embedding in Partially-Encrypted Images | Koi Yee Ng; Simying Ong | |
10.00-10.20 | Image Classification Using Vision Transformer for EtC Images | Genki HAMANO; Shoko IMAIZUMI; Hitoshi KIYA | |
10.20-10.40 | Image Watermarking Based on Saliency Detection and Multiple Transformations | Ahmed Khan; KokSheik Wong; Vishnu Monn Baskaran | |
Session | Room | Chair | |
ThAM1-6 (SS19: Towards real-world human-centric acoustic signal processing) | Chiang Mai 4 | Sermsak Uatrongjit | |
Date | Time | Title | Authors |
10 November 2022 | 9.00-9.20 | A Fast Converge Spectral Modulation Sensitive Active Noise Control System | Kah-Meng Cheong; Yih Liang Shen; Tai-Shih Chi |
9.20-9.40 | Multimodal Forgery Detection Using Ensemble Learning | Ammarah Hashmi; Sahibzada Adil Shahzad; Wasim Ahmad; Chia Wen Lin; Yu Tsao; Hsin-Min Wang | |
9.40-10.00 | Speech Enhancement-Assisted Voice Conversion in Noisy Environments | Yun-Ju Chan; Chiang-Jen Peng; Syu-Siang Wang; Hsin-Min Wang; Yu Tsao; Tai-Shih Chi | |
10.00-10.20 | Effect of Noise on the Perceptual Contribution of Cochlea-Scaled Entropy and Speech Level in Mandarin Sentence Understanding | Weikang Wu; Shangdi Liao; Fei Chen | |
10.20-10.40 | EEG-Based Auditory Attention Detection With Estimated Speech Sources Separated From an Ideal-Binary-Masking Process | Lei Wang; Fei Chen | |
10.40-11.00 | Automatic Step Detection of Tandem Gait Test in Patients With Vestibular Hypofunction Using Wearable Sensors | Yi-Ju Huang; Chien-Pin Liu; Kuan-Chung Ting; Chia-Yeh Hsieh; Kai-Chun Liu; Chia-Tai Chan | |
Session | Room | Chair | |
ThAM1-7 (SS22: Recent Advances in Biometrics and Security) | Chiang Mai 5 | Koichi Ito | |
Date | Time | Title | Authors |
10 November 2022 | 9.00-9.20 | Continuous Authentication for Smartphones Using Face Images and Touch-Screen Operation | Shuto Kinoshita; Yuka Watanabe; Yasushi Yamazaki |
9.20-9.40 | Spoofing Attack Detection in Face Recognition System Using Vision Transformer With Patch-Wise Data Augmentation | Kota Watanabe; Koichi Ito; Takafumi Aoki | |
9.40-10.00 | A Simple and Accurate CNN for Iris Recognition | Shokei Kawakami; Hiroya Kawai; Koichi Ito; Takafumi Aoki; Yoshiko Yasumura; Masakazu Fujio; Yosuke Kaga; Kenta Takahashi | |
10.00-10.20 | Eyeglass Frame Segmentation for Face Image Processing | Kanta Miura; Takamichi Miyamoto; Kazuyuki Sakurai; Koichi Ito; Takafumi Aoki | |
10.20-10.40 | A Fair Model is Not Fair in a Biased Environment | Yuya Sato; Soshi Maeda; Muku Akasaka; Masakatsu Nishigaki; Tetsushi Ohki | |
Session | Room | Chair | |
ThAM1-8 (Other related speech processing) | Board Room 4 | Sansanee Auephanwiriyakul | |
Date | Time | Title | Authors |
10 November 2022 | 9.00-9.20 | Intelligibility Prediction of Enhanced Speech Using Recognition Accuracy of End-To-End ASR System | Kenichi Arai; Atsunori Ogawa; Shoko Araki; Keisuke Kinoshita; Tomohiro Nakatani; Naoyuki Kamo; Toshio Irino |
9.20-9.40 | Hi, KIA: A Speech Emotion Recognition Dataset for Wake-Up Words | Taesu Kim; SeungHeon Doh; Gyunpyo Lee; Hyeongseok Jeon; Juhan Nam; Hyeon-Jeong Suk | |
9.40-10.00 | Improving Speech Emotion Recognition via Fine-Tuning ASR With Speaker Information | Bao Thang Ta; Tung Lam Nguyen; Dinh Son Dang; Nhat Minh Le; Van Hai Do | |
10.00-10.20 | 3CMLF: Three-Stage Curriculum-Based Mutual Learning Framework for Audio-Text Retrieval | Yi-Wen Chao; Dongchao Yang; Rongzhi Gu; Yuexian Zou | |
Session | Room | Chair | |
ThPM1-1 (Image Video Multimedia) | Chiang Mai 1 | Masaki Kawamura | |
Date | Time | Title | Authors |
10 November 2022 | 12.30-12.50 | Neural Network Based Watermarking Trained With Quantized Activation Function | Shingo Yamauchi; Masaki Kawamura |
12.50-13.10 | A Multiframe Super-Resolution Pipeline for Sub-Image-Typed Light Field Data | Chien-Han Hsu; Yi-Hsien Lin; Yen-Po Lin; Yi-Chang Lu | |
13.10-13.30 | Restoring Edge and Color Using Weighted Near-Infrared Image and Color Transmission Maps for Robust Haze Removal | Onhi Kato; Akira Kubota | |
13.30-13.50 | Dense View Interpolation of 4D Light Fields for Real-Time Augmented Reality Applications | Hidemichi Yoshino; Kazuya Kodama; Takayuki Hamamoto | |
13.50-14.10 | Bolt Looseness Identification Using Faster R-CNN and Grid Mask Augmentation | Natchapon Panmatharit; Yuttapong Jiraraksopakun; Anek Siripanichgorn; Punnarai Siricharoen | |
14.10-14.30 | Large-Scale Blind Face Super-Resolution via Edge Guided Frequency Aware Generative Facial Prior Networks | Xi Cheng; Wan-Chi Siu; Jian Yang | |
Session | Room | Chair | |
ThPM1-2 (Speech, Language, and Audio 1) | Chiang Mai 2 | Takanobu Nishiura | |
Date | Time | Title | Authors |
10 November 2022 | 12.30-12.50 | Language-Based Audio Retrieval With Converging Tied Layers and Contrastive Loss | Andrew Koh; Chng Eng Siong |
12.50-13.10 | D²Net: A Denoising and Dereverberation Network Based on Two-Branch Encoder and Dual-Path Transformer | Liusong Wang; Wenbing Wei; Yadong Chen; Ying Hu | |
13.10-13.30 | Direct Speech-Reply Generation From Text-Dialogue Context | Kenichi Fujita; Yusuke Ijima; Hiroaki Sugiyama | |
13.30-13.50 | Sequence-Wise Optimization for Quasi-Harmonic Speech Waveform Modeling | Shaowen Chen; Tomoki Toda | |
13.50-14.10 | Lattice-Based Data Augmentation for Code-Switching Speech Recognition | Roland Hartanto; Kuniaki Uto; Koichi Shinoda | |
14.10-14.30 | Phase-Aware Audio Super-Resolution for Music Signals Using Wasserstein Generative Adversarial Network | Yanqiao Yan; Binh Thien Nguyen; Yuting Geng; Kenta Iwai; Takanobu Nishiura | |
Session | Room | Chair | |
ThPM1-3 (Deep Learning: Algorithm, Implementations, and Applications) | Chiang Mai 3 | Jen-Chun Lin | |
Date | Time | Title | Authors |
10 November 2022 | 12.30-12.50 | Speech Emotion Recognition Based on the Reconstruction of Acoustic and Text Features in Latent Space | Jennifer Santoso; Rintaro Sekiguchi; Takeshi Yamada; Kenkichi Ishizuka; Taiichi Hashimoto; Shoji Makino |
12.50-13.10 | A Light CNN With Split Batch Normalization for Spoofed Speech Detection Using Data Augmentation | Haojian Lin; Yang Ai; Zhenhua Ling | |
13.10-13.30 | On the Optimal Classifier for Affective Vocal Bursts and Stuttering Predictions Based on Pre-Trained Acoustic Embedding | Bagus Tris Atmaja; Zanjabila; Akira Sasou | |
13.30-13.50 | Nonlinear Residual Echo Suppression Based on Gated Dual Signal Transformation LSTM Network | Kai Xie; Ziye Yang; Jie Chen | |
13.50-14.10 | Adaptive End-To-End Text-To-Speech Synthesis Based on Error Correction Feedback From Humans | Kazuki Fujii; Yuki Saito; Hiroshi Saruwatari | |
14.10-14.30 | Adversarial Speaker-Consistency Learning Using Untranscribed Speech Data for Zero-Shot Multi-Speaker Text-To-Speech | Byoung Jin Choi; Myeonghun Jeong; Minchan Kim; Sung Hwan Mun; Nam Soo Kim | |
Session | Room | Chair | |
ThPM1-4 (SS07: Latest Wireless Technologies for Sensing and Communications) | Board Room 2 | Osamu Takyu | |
Date | Time | Title | Authors |
10 November 2022 | 12.30-12.50 | Performance Evaluation of FISTA With Constant Inertial Parameter | Kaito Kameda; Ryo Hayakawa; Kazunori Hayashi; Youji Iiguni |
12.50-13.10 | An Approximated ADMM Based Algorithm for \(\ell_1-\ell_2\) Optimization Problem | Rui Lin; Kazunori Hayashi | |
13.10-13.30 | Antenna Beamforming Selection With Low Complexity and High Exploitation of White Space in Frequency Spectrum Sharing | Kizuku Kawamura; Kohei Akimoto; Osamu Takyu | |
13.30-13.50 | Individual Memory Driven Transformer Deep Learning Model for Multi-Cell Massive MIMO Beam Prediction | Taisei Urakami; Haohui Jia; Na Chen; Minoru Okada | |
13.50-14.10 | Deep Unfolding-Aided Sum-Product Algorithm for Error Correction of CRC Coded Short Message | Qilin Zhang; Shinsuke Ibi; Takumi Takahashi; Hisato Iwai | |
14.10-14.30 | Successive Interference Cancellation for Signal Demodulation of Multiple LPWA Systems | Shinichiro Kakuda; Takeo Fujii; Shusuke Narieda | |
Session | Room | Chair | |
ThPM1-5 (SS08: Digital Convergence of 5G/B5G, AIoT and Security) | Board Room 3 | Kampol Woradit | |
Date | Time | Title | Authors |
10 November 2022 | 12.30-12.50 | Evaluation of Voice Service in LEO Communication With 3GPP PUSCH Repetition Enhancement | Shou-Hong Liu; Chun-Tai Liu; Wei-Hung Chou; JenYi Pan |
12.50-13.10 | Modeling of Malware Diffusion With Mobile Devices in Intermittently Connected Networks | Hideyoshi Miura; Shoya Abukawa; Tomotaka Kimura; Kouji Hirata | |
13.10-13.30 | Software Defined Radio Access Network Sharing by Multi-Operator Core Networks | Wen-Ping Lai; Wen-Ru Chen; Ming-Jay Lai; Hong-Lun Lai; Chia-Ying Lin; Po-Chen Tseng | |
13.30-13.50 | Machine Learning Based End-To-End Constellation Training for Communication Systems | Po-Chiang Lin | |
13.50-14.10 | Flow-Based DDoS Detection Using Deep Neural Network With Radial Basis Function Neural Network | Ting-Chung Leung; Lee Chung-Nan | |
14.10-14.30 | Implement a Continuous Learning Model to Detect Different Types of DDoS Attacks With Hierarchical Temporal Memory | Hung Manh Nguyen; Yu-Kuen Lai | |
Session | Room | Chair | |
ThPM1-6 (SS23: Selected Papers from APSIPA Workshop in Hanoi, Vietnam) | Chiang Mai 4 | Nguyen Linh Trung | |
Date | Time | Title | Authors |
10 November 2022 | 12.30-12.50 | Dynamic Hand Gesture Recognition From Egocentric Videos Based on SlowFast Architecture | Ha-Dang Ho; Hong-Quan Nguyen; Thuy-Binh Nguyen; Sinh-Thuong Vu; Thi-Lan Le |
12.50-13.10 | Deep Learning-Based Signal Detection for Dual-Mode Index Modulation 3D-OFDM | Dang-Y Hoang; Tien-Hoa Nguyen; Vu-Duc Ngo; Trung Tan Nguyen; Nguyen Cong Luong; Thien Van Luong | |
13.10-13.30 | A Comparison of Feature Selection and Feature Extraction in Network Intrusion Detection Systems | Tuan-Cuong Vuong; Hung Tran; Mai Xuan Trang; Vu-Duc Ngo; Thien Van Luong | |
13.30-13.50 | Deep Neural Network-Based Detector for Single-Carrier Index Modulation NOMA | Toan Gian; Vu-Duc Ngo; Tien-Hoa Nguyen; Trung Tan Nguyen; Thien Van Luong | |
13.50-14.10 | Vibration Measurement Using Spatial Shifting Coherent Digital Holography | Long Hai Ngo; Quang Duc Pham | |
14.10-14.30 | Robust Online Tucker Dictionary Learning From Multidimensional Data Streams | Le Trung Thanh; Tran Trong Duy; Karim Abed-Meraim; Nguyen Linh Trung; Adel Hafiane | |
Session | Room | Chair | |
ThPM1-7 (SS06: Adversarial Attacks and Defense) | Chiang Mai 5 | Minoru Kuribayashi | |
Date | Time | Title | Authors |
10 November 2022 | 12.30-12.50 | Survey on Vision Based Fake News Detection and Its Impact Analysis | Mehul S Raval; Mohendra Roy; Minoru Kuribayashi |
12.50-13.10 | StyleGAN Encoder-Based Attack for Block Scrambled Face Images | AprilPyone MaungMaung; Hitoshi Kiya | |
13.10-13.30 | On the Adversarial Transferability of ConvMixer Models | Ryota Iijima; Miki Tanaka; Isao Echizen; Hitoshi Kiya | |
13.30-13.50 | Detection and Correction of Adversarial Examples Based on JPEG-Compression-Derived Distortion | Kenta Tsunomori; Yuma Yamasaki; Minoru Kuribayashi; Nobuo Funabiki; Isao Echizen | |
13.50-14.10 | Defense Against Adversarial Examples Using Beneficial Noise | Param Raval; Harin Khakhi; Minoru Kuribayashi; Mehul S. Raval | |
14.10-14.30 | Privacy Protection Against Automated Tracking System Using Adversarial Patch | Hiroto Takiwaki; Minoru Kuribayashi; Nobuo Funabiki; Mehul Shirishchandra Raval | |
Session | Room | Chair | |
ThPM1-8 (Industrial Forum "New era opened by AI-based image processing") | Board Room 4 | Jangwoo Kwon | |
Date | Time | Title | Authors |
10 November 2022 | 12.30-14.30 | Towards Best Possible Deep Learning Acceleration on the Edge – A Compression-Compilation Co-Design Framework | Yanzhi Wang, Northeastern University, Chairman and former CEO of CoCoPIE Inc., USA |
Empowering Future Pathology with Artificial Intelligence | Shuhao Wang, Co-founder and CTO of Thorough Future, China | ||
Session | Room | Chair | |
ThPM2-1 (Image Video Multimedia) | Chiang Mai 1 | Nam Ik Cho | |
Date | Time | Title | Authors |
10 November 2022 | 15.00-15.20 | Syllable Analysis Data Augmentation for Khmer Ancient Palm Leaf Recognition | Nimol Thuon; Jun Du; Jianshu Zhang |
15.20-15.40 | Multi-Class Vehicle Counting System for Multi-View Traffic Videos | Wichukorn Kuntintara; Kanokphan Lertniphonphan; Punnarai Siricharoen | |
15.40-16.00 | Table Structure Recognition Based on Grid Shape Graph | Eunji Lee; Junhyeong Kwon; Haeyoon Yang; Jaewoo Park; Soonyoung Lee; Hyung Il Koo; Nam Ik Cho | |
16.00-16.20 | Feature Distillation Network for Multi-Band NIR Colorization | Tae-Sung Park; Tae-Hyeon Kim; Jong-Ok Kim | |
16.20-16.40 | Blur Detection for Surveillance Camera System | Yikun Pan; Sik-Ho Tsang; Yui-Lam Chan; Daniel P.K. Lun | |
16.40-17.00 | Lip Sync Matters: A Novel Multimodal Forgery Detector | Sahibzada Adil Shahzad; Ammarah Hashmi; Sarwar Khan; Yan-Tsung Peng; Yu Tsao; Hsin-Min Wang | |
Session | Room | Chair | |
ThPM2-2 (Speech, Language, and Audio 1) | Chiang Mai 2 | Kittichai Wantanajittikul | |
Date | Time | Title | Authors |
10 November 2022 | 15.00-15.20 | Frame-Level Matching Scheme Using Posteriorgram Probability Distance of Spoken Data to Improve Search Accuracy of Spoken Term Detection | Reo Minakawa; Kazunori Kojima; Shi-wook Lee; Yoshiaki Itoh |
15.20-15.40 | Empirical Study Incorporating Linguistic Knowledge on Filled Pauses for Personalized Spontaneous Speech Synthesis | Yuta Matsunaga; Takaaki Saeki; Shinnosuke Takamichi; Hiroshi Saruwatari | |
15.40-16.00 | Using Perceptual Quality Features in the Design of the Loss Function for Speech Enhancement | Nicholas Eng; Yusuke Hioka; Catherine I Watson | |
16.00-16.20 | Correlation Loss for MOS Prediction of Synthetic Speech | Beibei Hu; Qiang Li | |
16.20-16.40 | Back-Translation-Style Data Augmentation for Mandarin Chinese Polyphone Disambiguation | Chunyu Qiang; Peng Yang; Hao Che; Jinba Xiao; Xiaorui Wang; Zhongyuan Wang | |
16.40-17.00 | Classification of Short Audio Acoustic Scenes Based on Data Augmentation Methods | Xuan Zhang; Yunfei Shao; Junjie Xu; Yong Ma; Wei-Qiang Zhang | |
Session | Room | Chair | |
ThPM2-3 (Deep Learning: Algorithm, Implementations, and Applications) | Chiang Mai 3 | Kasemsit Teeyapan | |
Date | Time | Title | Authors |
10 November 2022 | 15.00-15.20 | Improving Unsupervised Anomalous Sound Detection Performance of Autoencoder and Its Variant With Pretrained Deep Belief Network | Yufeng Deng; Jia Liu; Wei-Qiang Zhang |
15.20-15.40 | ASGAN-VC: One-Shot Voice Conversion With Additional Style Embedding and Generative Adversarial Networks | Wei-Cheng Li; Tzer-Jen Wei | |
15.40-16.00 | Fusing Multiple Bandwidth Spectrograms for Improving Speech Enhancement | Hao Shi; Yuchun Shu; Longbiao Wang; Jianwu Dang; Tatsuya Kawahara | |
16.00-16.20 | End-To-End Two-Dimensional Sound Source Localization With Ad-Hoc Microphone Arrays | Yijun Gong; Shupei Liu; Xiao-Lei Zhang | |
16.20-16.40 | Exploring Speaker Age Estimation on Different Self-Supervised Learning Models | Tuan Duc Truong; Tran The Anh; Eng-Siong Chng | |
16.40-17.00 | Mandarin Singing Voice Synthesis With Denoising Diffusion Probabilistic Wasserstein GAN | Yin-Ping Cho; Yu Tsao; Hsin-Min Wang; Yi-Wen Liu | |
Session | Room | Chair | |
ThPM2-4 (SS18: Metaverse: Future of Internet) | Board Room 2 | Navadon Khunlertgit | |
Date | Time | Title | Authors |
10 November 2022 | 15.00-15.20 | Physiological Study on the Effect of Game Events in Response to Player's Laughter | Mikito Fukuda; Yoshiko Arimoto |
15.20-15.40 | Development of a Virtual Telecommunication System Research Laboratory | Siwanart Jearavongtakul; Imran Saeed Mirza; Lunchakorn Wuttisittikulkij; Pruk Sasithong; Suebphong Noisri; Pisit Vanichchanunt | |
15.40-16.00 | Camera-Based Log System for Human Physical Distance Tracking in Classroom | Somrudee Deepaisarn; Angkoon Angkoonsawaengsuk; Charn Arunkit; Chayud Srisumarnk; Krongkan Nimmanwatthana; Nanmanas Linphrachaya; Nattapol Chiewnawintawat; Rinrada Tanthanathewin; Sivakorn Seinglek; Suphachok Buaruk; Virach Sornlertlamvanich | |
16.00-16.20 | Detecting Replay Attacks Using Single-Channel Audio: The Temporal Autocorrelation of Speech | Shih-Kuang Lee; Yu Tsao; Hsin-Min Wang | |
Session | Room | Chair | |
ThPM2-5 (Wireless Communication and Networking) | Board Room 3 | Poompat Saengudomlert | |
Date | Time | Title | Authors |
10 November 2022 | 15.00-15.20 | Automatic Detection of Dimmable Pulse Position Modulation for Visible Light Communication | Poompat Saengudomlert; Karel Sterckx |
15.20-15.40 | Estimation of Angular Power Spectrum Using Multikernel Adaptive Filtering | Eiji Ninomiya; Masahiro Yukawa; Renato L. G. Cavalcante; Lorenzo Miretti | |
15.40-16.00 | Novel Smart Sectoring and Beam Designs in mmWave Broadcast Channels | Yan-Yin He; Shang-Ho (Lawrence) Tsai; Jen-Ming Wu | |
16.00-16.20 | New Methods for Fast Detection for Embedded Cognitive Radio | Grégoire de Broglie; Louis Morge-Rollet; Denis Le Jeune; Frédéric Le Roy; Christian Roland; Charles Canaff; Jean-Philippe Diguet | |
Session | Room | Chair | |
ThPM2-6 (SS23: Selected Papers from APSIPA Workshop in Hanoi, Vietnam) | Chiang Mai 4 | Nguyen Linh Trung | |
Date | Time | Title | Authors |
10 November 2022 | 15.00-15.20 | Needle Localization and Segmentation for Radiofrequency Ablation of Liver Tumors Under CT Image Guidance | Le Quoc Anh; Luu Manh Ha; Theo van Walsum; Adriaan Moelker; Dao Viet Hang; Pham Cam Phuong; Vu Duy Thanh |
15.20-15.40 | End-To-End Visual-Guided Audio Source Separation With Enhanced Losses | Duc-Huy Pham; Quang-Anh Do; Thanh Thi-Hien Duong; Thi-Lan Le; Phi Le Nguyen | |
15.40-16.00 | Automated Classification of Lung Injury From X-Ray Images Using Deep Learning Network | Huy Le; Thanh-Ha Do | |
16.00-16.20 | AI-Based Video Analysis for Traffic Monitoring | Bui Son Tung; Phung The Ngoc; Do Duy Thanh; Nguyen Hong Thinh | |
16.20-16.40 | Adaptive Filtering-Based Heavy-Noise Removal in Born Iterative Method | Tran Quang-Huy; Luong Thi Theu; Nguyen Canh Minh; Duc-Nghia Tran; Duc-Tan Tran | |
16.40-17.00 | A Novel Deep Learning-Based Approach for Sleep Apnea Detection Using Single-Lead ECG Signals | Anh-Tu Nguyen; Thao Nguyen; Huy-Khiem Le; Huy-Hieu Pham; Cuong Do | |
Session | Room | Chair | |
ThPM2-7 (SS15: Advanced Sensing Technologies using Wireless Signal) | Chiang Mai 5 | Kampol Woradit | |
Date | Time | Title | Authors |
10 November 2022 | 15.00-15.20 | Multi-Resolution GPR Clutter Suppression Method Based on Low-Rank and Sparse Decomposition | Yanjie Cao; Xiaopeng Yang; Tian Lan |
15.20-15.40 | Indoor Human Motion Recognition Method Based on Kernel-Distance Doppler Velocity Estimation and Lightweight Network | Weicheng Gao; Xiaopeng Yang; Xiaodong Qu; Jiancheng Liao; Zixiang Yin; Ding Zhang | |
15.40-16.00 | Mainlobe Interference Suppression Method Based on Blocking Matrix Preprocessing With Low Sidelobe Constraint | Meng Haoyu; Qu Xiaodong; Zhang Xingyu; Li Wolin; Zhang Zhengyan; Yang Xiaopeng | |
16.00-16.20 | Continuous Tracking of Indoor Human Targets Based on Millimeter Wave Radar | Meiqiu Jiang; Shisheng Guo; Haolan Luo; Guolong Cui | |
16.20-16.40 | Reconfigurable Intelligent Surfaces Aided WiFi Imaging | Ying He; Dongheng Zhang; Yan Chen | |
16.40-17.00 | Continuous User Authentication Using WiFi | Pengcheng Huang; Dongheng Zhang; Ruixu Geng; Yan Chen | |
CONFERENCE FORMAT
The conference is planned to be held in person. However, if travel restrictions prevent some authors from attending, they may upload a video of their oral presentation. The presenter must still attend the session online for Q&A. Note that, unlike a hybrid conference, there will be no live streaming of the conference presentations. For more information, please contact: apsipa2022@gmail.com