13th Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)

DECEMBER 14 – 17, 2021, TOKYO, JAPAN
Venue: KFC Hall & Rooms
Kokusai Fashion Centre Bldg., Yokoami 1-6-1, Sumida City, Tokyo

Signal & Information Processing — Science for Signals, Data, and Intelligence

Important Dates

[April 1, 2021] Submission of Proposals for Special Sessions
[May 1, 2021] Submission of Proposals for Forum, Panel & Tutorial Sessions
[July 15, 2021] Submission of Regular Papers
[July 15, 2021] Submission of Special Session Papers
[July 16 to September 10, 2021] Submission of Research Abstract
[August 31, 2021] Notification of Papers Acceptance
[October 1, 2021] Submission of Camera‑Ready Papers
[October 1, 2021] Author (Early-Bird) Registration Deadline
[December 14 – 17, 2021] Tutorials, Summit and Conference Dates

Tutorial

+ T1: Speech Perception and Enhancement in Cochlear Implants

Fei Chen
(Southern University of Science and Technology, China)
Yu Tsao
(Academia Sinica, Taiwan)
Abstract

The cochlear implant (CI) is currently the only medical treatment available to partially restore hearing to patients with severe-to-profound hearing loss, and in recent years the CI has evolved into one of the most profound advances in modern medicine.

In this tutorial, we will first review the fundamentals of CI speech perception. Specifically, a vocoder model will be introduced to model CI speech perception, which helps in understanding the perceptual impact of important acoustic factors. In addition, speech perception with combined hearing aid (HA) and CI use will be reviewed, and CI speech perception and coding strategies for tonal languages will be presented. Objective methods for evaluating CI speech perception performance in noise will be introduced, including intrusive and non-intrusive measures.

The second part of this tutorial will present the existing system architectures and fundamental theories of deep learning-based speech enhancement (SE) approaches, as well as reinforcement learning-based and generative adversarial network (GAN)-based SE methods. We will introduce SE studies that aim to improve CI speech understanding in noise. Because of their hearing loss, CI users normally have a reduced hearing range, which requires the CI speech processor to compress the dynamic range of the temporal envelope. This tutorial will also report on work that applies adaptive envelope dynamic compression to improve CI speech perception in noise and in reverberation.
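
As a rough illustration of the envelope compression mentioned above, the following Python sketch maps envelope samples onto a narrower output range with a fixed logarithmic curve. It is a minimal sketch under stated assumptions, not the presenters' method: the function name `compress_envelope`, its parameters, and the default values are all hypothetical, and the curve here is static rather than adaptive.

```python
import math


def compress_envelope(envelope, in_min=1e-3, in_max=1.0,
                      out_min=0.0, out_max=1.0, steepness=400.0):
    """Map envelope samples into [out_min, out_max] with a logarithmic curve.

    All parameter names and defaults are illustrative assumptions.
    """
    compressed = []
    for sample in envelope:
        # Clip the input to the assumed acoustic range.
        clipped = min(max(sample, in_min), in_max)
        # Normalize to [0, 1] before applying the compressive curve.
        norm = (clipped - in_min) / (in_max - in_min)
        # Logarithmic compression: low levels are boosted relative to high ones.
        curve = math.log(1.0 + steepness * norm) / math.log(1.0 + steepness)
        compressed.append(out_min + (out_max - out_min) * curve)
    return compressed
```

An adaptive scheme would vary a parameter such as `steepness` over time, for example according to an estimate of the noise level; that choice is left open here.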

Fei Chen
Fei Chen received the B.Sc. and M.Phil. degrees from the Department of Electronic Science and Engineering, Nanjing University, in 1998 and 2001, respectively, and the Ph.D. degree from the Department of Electronic Engineering, The Chinese University of Hong Kong, in 2005. He continued his research as a postdoctoral researcher and senior research fellow at the University of Texas at Dallas (supervised by Prof. Philipos Loizou) and The University of Hong Kong. He is now a full professor at the Department of Electrical and Electronic Engineering, Southern University of Science and Technology (SUSTech), Shenzhen, China. Dr. Chen leads the speech and physiological signal processing (SPSP) research group at SUSTech, with a research focus on speech perception, speech intelligibility modeling, speech enhancement, and assistive hearing technology. He has published over 100 journal papers and over 80 conference papers in IEEE journals/conferences, Interspeech, the Journal of the Acoustical Society of America, etc. He received the Martin Black Prize for best paper published in the journal Physiological Measurement in 2020, the best presentation award at the 9th Asia Pacific Conference of Speech, Language and Hearing, and a 2011 National Organization for Hearing Research Foundation Research Award in the United States. Dr. Chen is now serving as an associate editor of Biomedical Signal Processing and Control and Frontiers in Human Neuroscience.
Yu Tsao
Yu Tsao received the B.S. and M.S. degrees in electrical engineering from National Taiwan University, Taipei, Taiwan, in 1999 and 2001, respectively, and the Ph.D. degree in electrical and computer engineering from the Georgia Institute of Technology, Atlanta, GA, USA, in 2008. From 2009 to 2011, he was a Researcher with the National Institute of Information and Communications Technology, Tokyo, Japan, where he engaged in research and product development in automatic speech recognition for multilingual speech-to-speech translation. He is currently a Research Fellow (Professor) and Deputy Director with the Research Center for Information Technology Innovation, Academia Sinica, Taipei. His research interests include speech and speaker recognition, acoustic and language modeling, audio coding, and bio-signal processing. He is currently an Associate Editor for the IEEE/ACM Transactions on Audio, Speech, and Language Processing and IEEE Signal Processing Letters, and a Distinguished Lecturer of APSIPA. He received the Academia Sinica Career Development Award in 2017, the National Innovation Award in 2018, 2019, and 2020, the Future Tech Breakthrough Award in 2019, and the Outstanding Elite Award of the Chung Hwa Rotary Educational Foundation for 2019–2020.

+ T2: In Silico Experiment Design for Hypothesis Testing and Generation

Pavel Loskot
(ZJU-UIUC Institute, China)
Abstract

Experimental methods remain at the forefront of tackling challenging research problems. The rapid advancement and availability of computing resources have encouraged a wide adoption of computer simulations to study mathematical models of various systems across many scientific and engineering disciplines. In biology and medicine, computer simulations are referred to as in silico experiments. They can greatly reduce the need to carry out time-consuming and costly in vivo and in vitro experiments involving living subjects. However, compared with laboratory experiments, it is less well understood how in silico experiments should be designed to enable scientific discoveries and to confirm existing knowledge in ways that can be validated, explained, and reproduced. This tutorial will review the traditional experiment design strategies and how they can be adapted for hypothesis testing and hypothesis generation by Monte Carlo simulations. This requires decisions on how long and how many times to run simulations, how to configure every simulation run, how to integrate simulation outputs, what data need to be collected and how they should be processed, how to choose appropriate statistical tests, and how to assess the quality of the results obtained. In silico experiment design involves different statistical methods such as parameter inference, causal reasoning, hypothesis testing, sensitivity analysis, and event detection in order to improve the information gain and efficiency of Monte Carlo simulations.
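
To make the design decisions listed above concrete, the following Python sketch runs a toy simulation under two configurations and compares the outputs with a Monte Carlo permutation test. It is a minimal sketch under stated assumptions, not material from the tutorial: the simulator `simulate_run`, the function `permutation_test`, and all parameter values are hypothetical, and the permutation test is only one of many statistical tests that could be chosen.

```python
import random
import statistics


def simulate_run(event_prob, n_steps, rng):
    """Toy simulation run: count random events over n_steps time steps."""
    return sum(1 for _ in range(n_steps) if rng.random() < event_prob)


def permutation_test(sample_a, sample_b, n_resamples=2000, seed=0):
    """Two-sided permutation test on the difference of sample means.

    Returns an estimated p-value for the null hypothesis that both
    samples come from the same distribution.
    """
    rng = random.Random(seed)  # fixed seed so the test is reproducible
    observed = abs(statistics.mean(sample_a) - statistics.mean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return (extreme + 1) / (n_resamples + 1)


def compare_configurations(prob_a, prob_b, n_runs=30, n_steps=200, seed=1):
    """Run both configurations n_runs times each and test for a difference."""
    rng = random.Random(seed)
    runs_a = [simulate_run(prob_a, n_steps, rng) for _ in range(n_runs)]
    runs_b = [simulate_run(prob_b, n_steps, rng) for _ in range(n_runs)]
    return permutation_test(runs_a, runs_b)
```

Here `n_steps` stands for how long each run lasts, `n_runs` for how many times the simulation is repeated, and `n_resamples` for the precision of the p-value estimate; a call such as `compare_configurations(0.10, 0.20)` returns the estimated p-value.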

Pavel Loskot
Pavel Loskot joined the ZJU-UIUC Institute in Haining, China, in January 2021 as an Associate Professor after spending 14 years at Swansea University in the UK. He received his PhD degree in Wireless Communications from the University of Alberta in Canada, and the MSc and BSc degrees in Radioelectronics and Biomedical Electronics, respectively, from the Czech Technical University in Prague in the Czech Republic. He is a Senior Member of the IEEE, a Fellow of the Higher Education Academy in the UK, and a Recognized Research Supervisor of the UK Council for Graduate Education. His current research interests focus on problems involving statistical signal processing and on importing methods from Telecommunication Engineering and Computer Science to other disciplines in order to improve the efficiency and information power of system modeling and analysis.

+ T3: Fake News Detection and its impact analysis

Mehul S Raval
(Ahmedabad University, India)
Mohendra Roy
(Pandit Deendayal Energy University, India)
Abstract

Recently, we have witnessed a significant increase in fake news and media content, which has caused considerable unrest globally. Fake news has also been planted to influence financial markets and geopolitical issues and to affect democratic structures. Deep learning-based generative models have recently increased the complexity of fake news by creating ultra-realistic phony media content. This synthesized fake media content is almost impossible to distinguish from genuine content, even with state-of-the-art algorithms and software. This poses a threat to forensic settings, where videos and images are treated as indisputable evidence in delivering justice. Therefore, many organizations are now focusing on deep fake detection research.

In this tutorial, we will discuss recent developments in deep fake research and demonstrate some of the state-of-the-art architectures for detecting synthesized components of media.

The tutorial plan is as follows:

  • First, we will discuss the most recent architectures for generating fake media content, such as (a) DeepFake and (b) Face to Face.
  • Next, we will discuss the state-of-the-art methods for detecting generated components in media. We will start with the famous FaceForensics++ architecture, and then we will discuss the following architectures:
    • FakeCatcher
    • MesoNet
    • Photoresponse non-uniformity (PRNU)
    • Temporal-aware pipeline to automatically detect deep fake videos
  • We will discuss the deep fake detection challenge (https://ai.facebook.com/datasets/dfdc/).
    We will also cover the ongoing competition on Kaggle: https://www.kaggle.com/c/deepfake-detection-challenge/discussion/121223
  • Finally, we will incorporate analytical models to better understand the possible intent and impact of such fake images and videos on the security, economy, and social fabric of a nation.
Mehul S Raval
Mehul S Raval is a Professor at the School of Engineering and Applied Science, Ahmedabad University. Earlier, he served at Pandit Deendayal Petroleum University (PDPU), Ahmedabad University, Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT), and Sarvajanik College of Engineering and Technology (SCET), Surat, India. He was a visiting faculty member at Sardar Vallabhbhai National Institute of Technology, Surat (2005 - 2007) and Veer Narmad South Gujarat University (2004 - 2008), Surat, India. Mehul visited the Graduate School of Natural Science and Technology, Okayama University, Japan, under the Sakura science exchange program. He was an Argosy visiting Associate Professor at Olin College of Engineering, MA, USA, during Fall 2016.
He is an alumnus of the Electronics & Telecommunication Engineering Department, College of Engineering Pune (COEP), one of the oldest engineering institutes in Asia (established in 1854), and Savitribai Phule Pune University (SPPU) (formerly University of Pune). He obtained a Bachelor of Engineering - Electronics and Telecommunication (1996), Master of Engineering - Digital Systems (2002), and Ph.D. in Electronics and Telecommunication Engineering (2008) from SPPU. He has published extensively in journals, magazines, conferences, and workshops at national and international levels. Dr. Raval is an associate editor for IEEE Access. He serves as a technical program committee member for leading national and international conferences, workshops, and symposiums. Dr. Raval reviews papers for IEEE, ACM, Springer, Elsevier, IET, and other leading publishers. He has received research funds from the Board of Research in Nuclear Science (BRNS), Department of Atomic Energy, Government of India; the Department of Science and Technology, Government of India; and IEEE.
He is supervising Doctoral, M. Tech, and B. Tech students. He also serves on the Board of Studies (BoS) for various universities. He is a senior member of IEEE, Fellow of IETE, and Fellow of The Institution of Engineers (India). Dr. Raval served the IEEE Gujarat Section as Joint Secretary during 2008 - 2015 and 2018 - 2020. He also served the IEEE Signal Processing Society (SPS) chapter - IEEE Gujarat Section as vice-chair and executive committee member in 2014. Currently, he is serving the IEEE Computational Intelligence Society Chapter - IEEE Gujarat Section as Chair (2021 - ).
Mohendra Roy
Mohendra Roy received his Ph.D. in Electronics and Information Engineering from Korea University, South Korea, in 2016. He received master's degrees in BioElectronics and Physics from Tezpur University, India, in 2008 and 2006, respectively. Before his Ph.D., Dr. Roy worked at Indian Oil Corporation Limited as an Engineering Assistant. He was a postdoctoral research fellow at the Delta-NTU Corporate Lab of Nanyang Technological University, Singapore, from 2017 to 2019.
He also serves as a reviewer for many reputed journals, such as Scientific Reports (by Nature), IEEE Sensors, Proceedings of IEEE, Optics Express, Journal of Tissue and cells, Biomedical Optics Express, Journal of mobile network and applications, MDPI Algorithms, and MDPI Sensors. Dr. Roy also chaired the session for the Computational Intelligence in Feature Analysis, Selection and Learning in Pattern Recognition tracks at the IEEE Symposium Series on Computational Intelligence 2018.
Currently, he is an Assistant Professor at the School of Technology, Pandit Deendayal Energy University (PDEU), India. He is also an executive committee member and treasurer of the IEEE Computational Intelligence Society, Gujarat Chapter, India.

+ T4: Toward the Realization of Automatic Spoken Language Acquisition Mechanism for Human-Symbiotic Robots

Takahiro Shinozaki
(Tokyo Institute of Technology, Japan)
Takuma Okamoto
(National Institute of Information and Communications Technology, Japan)
Shinsuke Mori
(Kyoto University, Japan)
Abstract

The process of spoken language acquisition has been one of the topics that have attracted the most significant interest of linguists and human scientists for decades. One prominent and widely accepted explanation, proposed by Skinner in 1957, states that children acquire language based on behaviorist reinforcement principles by associating words with meanings. Along with the advances in Artificial Intelligence (AI) and Machine Learning technologies, we now have the means to construct and investigate engineering models on computers. This tutorial aims to review related work scattered across several areas and to outline a roadmap for making robots efficiently acquire any language from scratch through spoken interactions, just as human babies do. We first survey research in cognitive science and add some considerations from engineering viewpoints. We classify existing studies of language learning systems, clarifying what they can and cannot do. We then give lectures on the unsupervised and reinforcement learning algorithms that form the basis for developing language acquisition systems. Finally, we introduce a toolkit that provides a research platform for automatic spoken language acquisition.

Takahiro Shinozaki
Takahiro Shinozaki is an associate professor at the Tokyo Institute of Technology, Tokyo, Japan. He received his B.E., M.E., and Ph.D. degrees in computer science from the Tokyo Institute of Technology, Japan, in 1999, 2001, and 2004, respectively. From 2004 to 2006, he was a Research Scholar at the University of Washington, Seattle, USA. From 2006 to 2007, he was a Research Assistant Professor at Kyoto University, Kyoto, Japan. From 2007 to 2011, he was a fellowship researcher and an assistant professor at Tokyo Institute of Technology. From 2011 to 2013, he was an assistant professor at Chiba University, Chiba, Japan. His research interests include automatic spoken language acquisition, semi-supervised and unsupervised learning of spoken languages, black-box optimizations, and their applications. He received the Yamashita SIG Research Award from the Information Processing Society of Japan (IPSJ) in 2009 and the Awaya Prize from the Acoustical Society of Japan (ASJ) in 2008.
Takuma Okamoto
Takuma Okamoto received the B.E., M.S., and Ph.D. degrees from Tohoku University, Japan, in 2004, 2006, and 2009, respectively. He was a Research Fellow (DC2) of the Japan Society for the Promotion of Science from 2007 to 2009. From 2009, he was a postdoctoral research fellow at Tohoku University, Japan. From 2012 to 2020, he was a researcher at the National Institute of Information and Communications Technology (NICT), Japan, and he is currently a senior researcher there. His main research fields are sound field synthesis based on acoustic signal processing and speech synthesis based on neural networks. He received the 32nd Awaya Prize Young Researcher Award and the 57th Sato Prize Paper Award from the Acoustical Society of Japan (ASJ) in 2012 and 2017, respectively. He is a member of the Audio Engineering Society (AES) and ASJ.
Shinsuke Mori
Shinsuke Mori received his B.S., M.S., and Ph.D. degrees from Kyoto University in 1993, 1995, and 1998, respectively. After joining the Tokyo Research Laboratory of International Business Machines (IBM) in 1998, he studied language modeling and its application to speech recognition and language processing. He is currently a professor at the Academic Center for Computing and Media Studies, Kyoto University. His research interests include natural language processing, spoken language processing, human-computer interaction, and multimedia integration. He is a member of the Japan Association for Natural Language Processing, the Information Processing Society of Japan, and the Database Society of Japan.

+ T5: Distributed Machine Learning over Networks: Basics, Applications, and Beyond of Federated Learning

Takayuki Nishio
(Tokyo Institute of Technology, Japan)
Akihito Taya
(Aoyama Gakuin University, Japan)
Abstract

Federated learning (FL) has emerged as a communication-efficient distributed machine learning approach that preserves user privacy. This tutorial first explains the basics of FL and then introduces its applications, such as mobile keyboard prediction, Internet of things (IoT) security, mobile edge computing, and vehicular communications. Finally, we present recent work aiming at communication-efficient, fully distributed, and model-free FL methods. One of the key ideas is knowledge distillation, which is a powerful model training technique.
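
As a rough illustration of the FL basics mentioned above, the following Python sketch implements plain federated averaging on a one-parameter linear model: each client trains locally on its own data, and the server only averages the resulting parameters, weighted by client dataset size. It is a minimal sketch under stated assumptions, not the presenters' method: the function names, the model y = w * x, and the learning-rate and round settings are hypothetical, and the distillation-based variants mentioned in the abstract are not covered.

```python
def local_update(weight, data, learning_rate=0.1, epochs=5):
    """Run a few gradient-descent epochs on one client's private data.

    The model is y = weight * x with a mean squared error loss.
    """
    for _ in range(epochs):
        grad = sum(2.0 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= learning_rate * grad
    return weight


def federated_average(client_datasets, rounds=20, initial_weight=0.0):
    """Federated averaging: raw data never leaves the clients.

    In each round the server sends the current weight to every client,
    collects the locally updated weights, and averages them weighted by
    the number of samples held by each client.
    """
    weight = initial_weight
    total_samples = sum(len(data) for data in client_datasets)
    for _ in range(rounds):
        local_weights = [local_update(weight, data) for data in client_datasets]
        weight = sum(
            len(data) * w for data, w in zip(client_datasets, local_weights)
        ) / total_samples
    return weight
```

For example, calling `federated_average` on two clients whose samples all follow y = 3x should return a weight close to 3, provided the learning rate is small enough for the local updates to be stable.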

Takayuki Nishio
Takayuki Nishio has been an associate professor in the School of Engineering, Tokyo Institute of Technology, Japan, since 2020. He received the B.E. degree in electrical and electronic engineering and the master’s and Ph.D. degrees in informatics from Kyoto University in 2010, 2012, and 2013, respectively. He was an assistant professor in the Graduate School of Informatics, Kyoto University, from 2013 to 2020. From 2016 to 2017, he was a visiting researcher at the Wireless Information Network Laboratory (WINLAB), Rutgers University, United States. His current research interests include machine learning-based network control, machine learning in wireless networks, and heterogeneous resource management.
Akihito Taya
Akihito Taya is an assistant professor at Aoyama Gakuin University. He received the B.E. degree in electrical and electronic engineering from Kyoto University, Kyoto, Japan, in 2011, and the master's and Ph.D. degrees in Informatics from Kyoto University in 2013 and 2019, respectively. He worked at Hitachi, Ltd. from 2013 to 2017, where he participated in the development of high performance computing clusters. His current research interests include distributed machine learning over sensor networks.

+ T6: Advanced Topics of Prior-based Image Restoration: Tensors and Neural Networks

Tatsuya Yokota
(Nagoya Institute of Technology, Japan)
Abstract

This tutorial introduces prior-based image restoration methods. The problem of image restoration is to reconstruct an image from an observed image corrupted by degradations such as noise, blurring, subsampling, and missing pixels. Since the image restoration problem based only on the observation model is often ill-posed, methods based on assumed image priors are used.
This tutorial is divided into three parts. The first explains the basics of the image restoration problem and introduces the classical methods that have been used so far. The second introduces an image restoration method based on tensor representation. Finally, we will introduce an image restoration method that uses neural networks.
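
As a rough illustration of how a prior regularizes an ill-posed restoration problem, the following Python sketch denoises a one-dimensional signal by minimizing a data-fidelity term plus a smoothness prior with gradient descent. It is a minimal sketch under stated assumptions, not one of the methods covered in the tutorial: the function names, the quadratic smoothness prior, and the values of `lam`, `step`, and `iters` are hypothetical, and only the denoising case is shown.

```python
def roughness(signal):
    """Sum of squared differences between neighboring samples."""
    return sum((b - a) ** 2 for a, b in zip(signal, signal[1:]))


def restore_smooth(observed, lam=2.0, step=0.05, iters=500):
    """Minimize sum((x - y)^2) + lam * roughness(x) by gradient descent.

    The first term keeps the estimate x close to the observation y;
    the second term is the prior, which favors smooth signals.
    The step size must stay below 2 / (2 + 8 * lam) for stability.
    """
    estimate = list(observed)
    n = len(estimate)
    for _ in range(iters):
        # Gradient of the data-fidelity term.
        grad = [2.0 * (estimate[i] - observed[i]) for i in range(n)]
        # Gradient of the smoothness prior.
        for i in range(n - 1):
            diff = estimate[i + 1] - estimate[i]
            grad[i] -= 2.0 * lam * diff
            grad[i + 1] += 2.0 * lam * diff
        estimate = [x - step * g for x, g in zip(estimate, grad)]
    return estimate
```

Increasing `lam` puts more trust in the prior and yields a smoother result, while `lam = 0` returns the observation unchanged.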

Tatsuya Yokota
Tatsuya Yokota received the Ph.D. degree in engineering from the Tokyo Institute of Technology, Tokyo, Japan, in 2014. From 2011 to 2014, he was a Junior Research Associate with the Laboratory for Advanced Brain Signal Processing (ABSP), RIKEN Brain Science Institute (BSI), Japan. From 2014 to 2016, he was a Research Scientist with the Laboratory for ABSP and a Visiting Research Scientist with the TOYOTA Collaboration Center, RIKEN BSI. He is currently an Associate Professor with the Department of Computer Science, Nagoya Institute of Technology, Japan, and a Visiting Research Scientist with the Tensor Learning Team in RIKEN AIP. He has published papers in leading conferences: ICASSP 2016, CVPR 2017, CVPR 2018, ICCV 2019, and AAAI 2020. His research interests include matrix/tensor factorizations and signal/image processing. He organized special sessions at APSIPA 2017 and APSIPA 2018, and a tutorial at APSIPA 2020. He has served as a PC member for major conferences including CVPR, ICCV, ECCV, NeurIPS, ICML, and ICASSP.