Last update: May 2017

 

Carlos Toshinori Ishi, PhD in Engineering                                                       

Speech Science and Technology Researcher

Group Leader of ATR/HIL Sound Environment Intelligence Research Group

 

Office Address

 

ATR – HIL (Hiroshi Ishiguro Laboratories)

2-2-2 Hikaridai, Seika-cho, Soraku-gun

Kyoto 619-0288, JAPAN

Phone: +81-774-95-2457              Fax:  +81-774-95-1408

E-mail: carlos (at) atr (dot) jp

 

 

Academic background

 

Doctoral course      Oct. 1998 ~ Sep. 2001 (Japan)

University of Tokyo (Graduate School of Engineering – Dept. of Information and Communication Engineering)

PhD dissertation: "Japanese Prosody Analysis and its Application for Computer-Aided Language Learning (CALL) Systems".  With the aim of constructing a CALL system that can reliably detect pronunciation errors, acoustic-prosodic correlates of linguistic features of Japanese, such as tokushuhaku (special morae), mora rhythm, accent and intonation, were investigated from both production and perception viewpoints.

 

Master's course      Jan. 1997 ~ Feb. 1998 (Brazil)

"Instituto Tecnológico de Aeronáutica" (Electronic Engineering – Dept. of Telecommunications)

Master's thesis: "Analysis of Brazilian Portuguese Phonemes for Speech Recognition".  Acoustic properties of Brazilian Portuguese phonemes were analyzed for automatic segmentation purposes.  Neural networks were also implemented to discriminate devoiced vowels, which frequently occur in phrase-final position in Brazilian Portuguese.

 

College                Jan. 1992 ~ Dec. 1996 (Brazil)

"Instituto Tecnológico de Aeronáutica"  (Electronic Engineering)

BA thesis: "DSP Implementation of an Isolated Word Speech Recognition System".  A DTW-based algorithm using mel-cepstral coefficients as features was implemented in DSP assembly language for the recognition of isolated words.
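The thesis itself was written in DSP assembly; purely as an illustration of the technique named above (not the original code), a DTW-based isolated-word matcher over mel-cepstral feature frames can be sketched in Python:

```python
import math

def dtw_distance(seq_a, seq_b):
    """Dynamic time warping distance between two feature sequences.

    Each sequence is a list of feature vectors (e.g. mel-cepstral
    coefficient frames); the local cost is the Euclidean distance.
    """
    n, m = len(seq_a), len(seq_b)
    INF = float("inf")
    # d[i][j] = minimal accumulated cost aligning seq_a[:i] with seq_b[:j]
    d = [[INF] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(seq_a[i - 1], seq_b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # insertion
                                 d[i][j - 1],      # deletion
                                 d[i - 1][j - 1])  # match
    return d[n][m]

def recognize(features, templates):
    """Return the label of the template closest to `features` under DTW."""
    return min(templates, key=lambda label: dtw_distance(features, templates[label]))
```

An isolated-word recognizer of this kind stores one reference template per word and picks the label with the smallest warped distance to the input utterance.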

 

Technical high school       Jan. 1987 ~ Dec. 1990 (Brazil)

"Colégio Industrial Liceu de Artes e Ofícios de São Paulo" (Electronic Technician)

 

 

Professional background

 

ATR/HIL (Hiroshi Ishiguro Laboratories)

Apr. 2015 ~  Group leader of ATR/HIL Sound Environment Intelligence Research Group

 

ATR/IRC (Intelligent Robotics and Communication) Labs.

Apr. 2013 ~ Mar. 2015 Group leader of ATR/IRC Sound Environment Intelligence Laboratory

Jan. 2005 ~ Mar. 2013 Researcher

 

Research on Speech Science and Technology for Verbal and Non-Verbal Communication in Human-Robot Interaction and Sound Environment Intelligence:

- Acoustic analysis of vocal fry in pressed voice (“rikimi”):  Proposal of an algorithm for automatic detection of vocal fry.

- Use of prosodic and voice quality parameters for automatic extraction of paralinguistic information (speech acts, attitudes, emotions).

- Use of prosodic and linguistic cues for automatic detection of turn-taking and dialog acts.

- Evaluation of robust speech recognition system for communication robots in real environments (Joint work with ATR/SLC Labs.)

- Acoustic and electro-glottographic analysis of pressed voice and other voice qualities.

- Analysis of head motions and linguistic and paralinguistic information carried by speech.

- Head motion control in humanoid robots (androids)

- Sound source localization and utterance interval detection using microphone arrays; sound environment intelligence.

- Audio-visual speech interval detection.

- Integration of speech recognition and paralinguistic information extraction (speech act recognition).

- Robust F0 extraction.

- Speech-driven lip motion generation for teleoperation of humanoid robots.

- Sound map generation by integration of microphone arrays and laser range finders.
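As background for the microphone-array topics above, the core principle of delay-based sound source localization can be sketched as follows. This is a minimal two-microphone illustration under far-field assumptions, with made-up function names for the sketch; it is not the multi-array system developed at the lab:

```python
import math

def estimate_tdoa(sig_a, sig_b, max_lag):
    """Estimate the delay (in samples) by which sig_b lags sig_a,
    by maximizing the cross-correlation over lags in [-max_lag, max_lag]."""
    def xcorr(lag):
        return sum(sig_a[i] * sig_b[i + lag]
                   for i in range(len(sig_a))
                   if 0 <= i + lag < len(sig_b))
    return max(range(-max_lag, max_lag + 1), key=xcorr)

def direction_of_arrival(tdoa_samples, fs, mic_distance, c=343.0):
    """Convert a time delay into an arrival angle (radians) relative to
    the microphone-pair axis, for a far-field source."""
    delay = tdoa_samples / fs
    # Clamp to the physically possible range before taking the arccosine.
    cos_theta = max(-1.0, min(1.0, c * delay / mic_distance))
    return math.acos(cos_theta)
```

A real array combines many such pairwise delay estimates (or a subspace method such as MUSIC, as in the publications below) to localize several simultaneous sources.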

 

JST/CREST at ATR/HIS Labs.

Feb. 2002 ~ Dec. 2004

- Research in Speech Science and Technology for Expressive Speech Processing

- Acoustic-prosodic analysis of expressive speech: Principal Component Analysis on global acoustic features and impressions about emotional states, attitudes, and speaking styles.

- Analysis focused on pitch movements of phrase finals → Automatic identification of phrase final tones.

- Acoustical analysis of creaky voice → Automatic detection of creaky segments.

- Acoustical analysis of breathy/whispery voices → Automatic detection of aspiration noise segments.

- Development of algorithms for automatic speech utterance detection, pitch extraction algorithms, and software tools for prosodic labeling.
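As an illustration of the pitch-extraction topic above, a basic autocorrelation F0 estimator can be sketched as follows; this is a textbook baseline for a single voiced frame, not the robust F0 extraction algorithm developed in this work:

```python
import math

def estimate_f0(frame, fs, f0_min=60.0, f0_max=400.0):
    """Estimate F0 (Hz) of a voiced frame by picking the lag that
    maximizes the normalized autocorrelation within the search range."""
    # Remove the DC component so the correlation reflects periodicity only.
    mean = sum(frame) / len(frame)
    x = [s - mean for s in frame]
    energy = sum(s * s for s in x)
    if energy == 0.0:
        return None  # silent frame: no F0 to report

    # Candidate lags corresponding to the allowed F0 range.
    lag_min = int(fs / f0_max)
    lag_max = min(int(fs / f0_min), len(x) - 1)

    def score(lag):
        return sum(x[i] * x[i + lag] for i in range(len(x) - lag)) / energy

    best_lag = max(range(lag_min, lag_max + 1), key=score)
    return fs / best_lag
```

Robust extractors add voicing decisions, sub-sample lag interpolation, and continuity constraints across frames on top of this basic peak-picking step.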

 

ITA-LASD

Jan. 1997 ~ Feb. 1998

Implementation of assembly software for digital signal processors:

- ADPCM algorithms for audio compression, using ADSP-21XX processors

- FFT-based algorithms for telephone tone detection, using Motorola ESM processors

 

Matec

Jan. 1991 ~ Jan. 1992

Repair of telephone exchange boards, power supply modules, and telephone devices.

 

 

Grants

 

- MIC (総務省) SCOPE "ICT Innovation Creation" R&D program (July 2015 ~ Mar. 2016): "Research and development of a selective hearing support system based on sound environment intelligence technologies" (Principal Investigator)

- MIC (総務省) SCOPE "ICT Innovation Creation" R&D program (July 2012 ~ Mar. 2015): "Research and development of sound environment intelligence technologies through the cooperation of multiple microphone arrays" (Principal Investigator)

- Grant-in-Aid for Scientific Research, Young Scientists (A) (Apr. 2011 ~ Mar. 2014): "Construction of an utterance intention recognition system considering dynamic features of prosody and voice quality as well as morphemes and parts of speech" (Principal Investigator)

- Grant-in-Aid for Scientific Research, Young Scientists (A) (Apr. 2008 ~ Mar. 2011): "Construction of the relational structure between head motions and facial expressions accompanying speech and linguistic/paralinguistic information" (Principal Investigator)

- Grant-in-Aid for Scientific Research, Young Scientists (A) (Apr. 2006 ~ Mar. 2008): "Construction of a speaking style detection mechanism considering prosody and voice quality, and its application to real environments" (Principal Investigator)

- Japanese Ministry of Education (Monbusho) scholarship for international students (Apr. 1998 ~ Sep. 2001)

 

 

Lectures

 

2015 ~

Part-time Lecturer (非常勤講師) at Doshisha University (同志社大学), in charge of part of "Special Lectures I" (特別講義I)

 

2013 ~ 2014

Visiting Associate Professor (客員准教授) at Kyoto University (京都大学), in charge of part of "Special Research in Intelligence Science and Technology 1" (知能情報学特殊研究1)

 

2012 ~

Visiting Associate Professor (客員准教授) at Kobe University (神戸大学), "Multi-modal Information Processing" (マルチモーダル情報処理)

 

2008 ~ 2016

Part-time Lecturer (非常勤講師) at Osaka Prefecture University (大阪府立大学), in charge of half of "Advanced Intelligent Media Processing" (知能メディア処理特論)

 

 

Language skills

 

- Native language: Brazilian Portuguese

- Second languages: Japanese, English.

 

 

Programming skills

 

- C++, Basic, Pascal

- Visual C++, Visual Basic, Java

- Matlab

- Assembly (Analog Devices ADSP-21XX, Motorola ESM, 386)

 

 

Research interests

 

Prosody and Voice Quality:

- Analysis of laryngeal voice qualities:  automatic detection of creaky voice; automatic detection of aspiration noise in breathy and whispery voices.

- Mapping between prosodic + voice quality features and linguistic and paralinguistic functions (intentions, emotions, and attitudes) in Japanese.

- Transcription of prosodic events: automatic extraction of perceptually meaningful prosodic events for automatic prosody labeling: focus on phrase final prosody and voice quality.

- Pitch perception: Correspondence between acoustically observed F0 and perceived pitch movements.

- Robust F0 extraction.

 

Speech and Gestures:

- Analysis of head motion and speech in spoken dialogue:  automatic generation of head motions from speech.

- Multi-modal dialogue processing.

- Lip motion generation/synchronization for humanoid robots (including androids) based on speech acoustics.

- Head motion generation from speech acoustics and linguistic information.

- Facial expression and motion generation in humanoid robots (including androids) based on speech acoustics.

 

Robot Audition and Sound Environment Intelligence:

- Microphone array for audio source localization and separation.

- Improvement of speech recognition and understanding in noisy environments.

- Utterance interval detection based on sound directivity.

- Utterance interval detection based on audio-visual information.

- Sound environment map generation.

 

Speech Perception and Recognition:

- Auditory representation of speech signals: acoustic parameters related to auditory perception; masking functions.

- Prosodic modeling applied to recognition of linguistic and paralinguistic information.

 

Speech Production and Synthesis:

- Mapping between physiological and acoustic features for laryngeal voice quality control.

- Prosodic control and voice quality control for speech synthesis.

 

 

List of publications

 

Journal Papers

1.        C.T. Ishi, T. Minato, H. Ishiguro. (2017) "Motion analysis in vocalized surprise expressions and motion generation in android robots," IEEE Robotics and Automation Letters, 2017. (IEEE Early Access Articles)

2.        J. Even, J. Furrer, L.Y. Morales Saiki, C.T. Ishi, N. Hagita. (2017) "Probabilistic 3D mapping of sound-emitting structures based on acoustic ray casting," IEEE Transactions on Robotics (T-RO) Vol.33, No.2, 333-345, 2017.

3.        船山智,港隆史,石井カルロス寿憲,石黒浩 (2017)"操作者の笑い声に基づく遠隔操作型アンドロイドの笑い動作生成",情報処理学会論文誌, Vol.58, No.4, 932-944, Apr. 2017.

4.        境くりま,港隆史,石井カルロス寿憲,石黒浩 (2017)"わずかな感情変化を表現可能なアンドロイド動作の生成モデルの提案", 電子情報通信学会論文誌 D, Vol.J100-D, No.3, 310-320, Mar. 2017.

5.        石井カルロス寿憲, エヴァン・イアニ, 萩田紀博 (2016). 複数のマイクロホンアレイによる音源方向情報と人位置情報に基づく音声区間検出および顔の向きの推定の評価, 日本ロボット学会誌,Vol.34 No.3, pp 39-44, April 2016.

6.        境くりま, 石井カルロス寿憲, 港隆史, 石黒浩 (2016). 音声に対応する頭部動作のオンライン生成システムと遠隔操作における効果, 電子情報通信学会和文論文誌A, Vol. J99-A, No.1, pp. 14-24, Jan. 2016.

7.        石井カルロス寿憲 (2015). 人とロボットのコミュニケーションにおける非言語情報の表出−発話に伴う自然な頭部動作生成に焦点を当てて−, 感性工学, Vol. 13, No. 4, pp. 205-210, Dec. 2015.(解説論文)

8.        石井カルロス寿憲 (2015). 音声対話中に出現するパラ言語情報と音響関連量―声質の役割に焦点を当てて―, 日本音響学会誌, Vol. 71, No. 9, pp. 476-483, Sep. 2015. (解説論文)

9.        渡辺敦志, エヴァン・イアニ, モラレス・ルイス洋一, 石井カルロス寿憲 (2015). 人間協調型移動ロボットによるコンクリート打音検査記録システム, 日本ロボット学会誌, Vol. 33, No. 7, 68-74, Sep. 2015.

10.    Ishi, C., Even, J., Hagita, N. (2014) Integration of multiple microphone arrays and use of sound reflections for 3D localization of sound sources. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, Vol.E97-A, No.9, pp.1867-1874, Sep. 2014.

11.    Ishi, C., Ishiguro, H., Hagita, N. (2013). Analysis of relationship between head motion events and speech in dialogue conversations. Speech Communication 57 (2014), 233-243.

12.    石井カルロス寿憲, 劉超然, 石黒浩, 萩田紀博 (2013). 遠隔存在感ロボットのためのフォルマントによる口唇動作生成手法, 日本ロボット学会誌, Vol. 31, No. 4, 83-90, May 2013.

13.    劉超然, 石井カルロス寿憲, 石黒浩, 萩田紀博 (2013). 人型コミュニケーションロボットのための首傾げ生成手法の提案および評価, 人工知能学会論文誌, vol. 28, no. 2, pp. 112-121, January, 2013.

14.    Liu, C., Ishi, C., Ishiguro, H., Hagita, N. (2013). Generation of nodding, head tilting and gazing for human-robot speech interaction. International Journal of Humanoid Robotics (IJHR), vol. 10, no. 1, January, 2013.

15.    P. Heracleous, M. Sato, C. T. Ishi, and N. Hagita. (2013) Analysis of the visual Lombard effect and automatic recognition experiments. Computer Speech and Language 27(1), 288-300, 2013.

16.    P. Heracleous, C.T. Ishi, T. Miyashita, H. Ishiguro and N. Hagita (2013). Using body-conducted acoustic sensors for human-robot communication in noisy environments. International Journal of Advanced Robotic Systems 10(136), pp 1-7, Feb. 2013.

17.    Becker-Asano, C., Kanda, T., Ishi, C., and Ishiguro, H. (2011). Studying laughter combined with two humanoid robots. AI & Society, Vol. 26 (3), pp. 291-300, 2011.

18.    M. Shiomi, D. Sakamoto, T. Kanda, C.T. Ishi, H. Ishiguro, N. Hagita (2011). Field trial of a networked robot at a train station. International Journal of Social Robotics 3(1), 27-40, Jan. 2011.

19.    石井カルロス寿憲 (2010). ATRのコミュニケーションロボットにおける聴覚および音声理解に関する研究課題, 日本ロボット学会誌, Vol. 28, No. 1, pp. 27-30, Jan.2010. (解説論文)

20.    Ishi, C.T., Ishiguro, H., Hagita, N. (2010). Analysis of the roles and the dynamics of breathy and whispery voice qualities in dialogue speech. EURASIP Journal on Audio, Speech, and Music Processing 2010, ID 528193, 1-12 Jan. 2010.

21.    塩見昌裕,坂本大介, 神田崇行,石井カルロス寿憲,石黒浩,萩田紀博 (2009). 半自律型コミュニケーションロボットの開発, 電子情報通信学会論文誌, 人とエージェントのインタラクション特集号, pp.773-783, 2009.

22.    Ishi, C.T., Ishiguro, H., Hagita, N. (2008). Automatic extraction of paralinguistic information using prosodic features related to F0, duration and voice quality. Speech Communication 50(6), 531-543, June 2008.

23.    Ishi, C.T., Matsuda, S., Kanda, T., Jitsuhiro, T., Ishiguro, H., Nakamura, S., Hagita, N. (2008). A robust speech recognition system for communication robots in noisy environments. IEEE Transactions on Robotics, Vol. 24, No. 3, 759-763, June 2008.

24.    Ishi, C.T., Sakakibara, K-I., Ishiguro, H., Hagita, N. (2008). A method for automatic detection of vocal fry. IEEE Transactions on Audio, Speech and Language Processing, Vol. 16, No. 1, 47-56, Jan. 2008.

25.    Ishi, C.T. (2006), The functions of phrase final tones in Japanese: Focus on turn-taking. Journal of Phonetic Society of Japan, Vol. 10 No.3, 18-28, Dec. 2006.

26.    石井カルロス寿憲,榊原健一,石黒浩,萩田紀博 (2006) Vocal Fry発声の自動検出法. 電子情報通信学会論文誌D, Vol. J89-D, No. 12, 2679-2687, Dec. 2006.

27.    石井カルロス寿憲,石黒浩,萩田紀博 (2006) 韻律および声質を表現した音響特徴と対話音声におけるパラ言語情報の知覚との関連. 情報処理学会論文誌, Vol. 47, No. 6, 1782-1793, June 2006.

28.    Ishi, C.T. (2005) Perceptually-related F0 parameters for automatic classification of phrase final tones. IEICE Trans. Inf. & Syst., Vol. E88-D, No. 3, 481-488

29.    Ishi, C.T. (2004). “Analysis of autocorrelation-based parameters in creaky voice,” Acoustical Science and Technology, Vol. 25, No. 4, 299-302.

30.    Ishi, C.T., Hirose, K. & Minematsu, N. (2003). Mora F0 representation for accent type identification in continuous speech and considerations on its relation with perceived pitch values. Speech Communication, Vol. 41, Nos. 2-3, 441-453

 

PhD dissertation

Ishi, C.T. (2001). “Japanese prosody analysis and its applications to Computer-Aided Language Learning systems,” PhD dissertation, University of Tokyo, Sep. 2001.

 

International Conference Papers (refereed)

1.        Ishi, C., Liu, C., Even, J., Hagita, N. (2016). “Hearing support system using environment sensor network,” IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2016), pp. 1275-1280, Oct., 2016.

2.        Ishi, C., Funayama, T., Minato, T., Ishiguro, H. (2016). “Motion generation in android robots during laughing speech,” IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2016), pp. 3327-3332, Oct., 2016.

3.        Ishi, C., Hatano, H., Ishiguro, H. (2016). “Audiovisual analysis of relations between laughter types and laughter motions,” Proc. of the 8th international conference on Speech Prosody (Speech Prosody 2016), pp. 806-810, May, 2016.

4.        Hatano, H., Ishi, C., Komatsubara, T., Shiomi, M., Kanda, T. (2016). “Analysis of laughter events and social status of children in classrooms,” Proc. of the 8th international conference on Speech Prosody (Speech Prosody 2016), pp. 1004-1008, May, 2016.

5.        K. Sakai, T. Minato, C.T. Ishi, and H. Ishiguro, “Speech driven trunk motion generating system based on physical constraint,” Proc. of 25th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN 2016), pp. 232-239, Aug. 2016.

6.        D.F. Glas, T. Minato, C.T. Ishi, T. Kawahara, and H. Ishiguro, “ERICA: The ERATO Intelligent Conversational Android,” Proc. of 25th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN 2016), pp. 22-29, Aug. 2016.

7.        Ishi, C., Even, J., Hagita, N. (2015). “Speech activity detection and face orientation estimation using multiple microphone arrays and human position information,” IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2015), pp. 5574-5579, Sep., 2015.

8.        J. Even, F. Ferreri, A. Watanabe, Y. Morales, C. Ishi and N. Hagita (2015). “Audio augmented point clouds for applications in robotics,” IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2015), pp. 4846-4851, Sep., 2015.

9.        A. Watanabe, J. Even, L.Y. Morales and C. Ishi (2015). “Robot-assisted acoustic inspection of infrastructures - Cooperative hammer sounding inspection -,” IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2015), pp. 5942-5947, Sep., 2015.

10.    K. Sakai, C.T. Ishi, T. Minato, H. Ishiguro (2015) “Online Speech-Driven Head Motion Generating System and Evaluation on a Tele-Operated Robot,” In the 24th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN 2015), Kobe, Hyogo, Japan, pp. 529-534, August, 2015.

11.    Liu, C., Ishi, C.T., Ishiguro, H. (2015) “Bringing the scene back to the tele-operator: auditory scene manipulation for tele-presence systems,” In Proc. of ACM/IEEE International Conference on Human Robot Interaction (HRI 2015). Portland, USA. 279-286, March, 2015.

12.    Ishi, C., Hatano, H., Hagita, N. (2014) "Analysis of laughter events in real science classes by using multiple environment sensor data," Proc. of 15th Annual Conference of the International Speech Communication Association (Interspeech 2014), pp. 1043-1047, Sep. 2014.

13.    J. Even, L. Y. Morales, N. Kallakuri, J. Furrer, C. Ishi, N. Hagita (2014) “Mapping sound emitting structures in 3D”, The 2014 IEEE International Conference on Robotics and Automation (ICRA 2014), June, 2014.

14.    Ishi, C., Hatano, H., and Kiso, M. (2014). “Acoustic-prosodic and paralinguistic analyses of “uun” and “unun”,” Proc. of the 7th international conference on Speech Prosody 2014, pp. 100-104, May, 2014.

15.    Hatano, H., Kiso, M., and Ishi, C. (2014). “Interpersonal factors affecting tones of question-type utterances in Japanese,” Proc. of the 7th international conference on Speech Prosody 2014, pp. 997-1001, May, 2014.

16.    Ishi, C., Even, J., Hagita, N. (2013). “Using multiple microphone arrays and reflections for 3D localization of sound sources,” IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2013), pp. 3937-3942, Nov., 2013.

17.    N. Kallakuri, J. Even, L. Y. Morales, C. Ishi, N. Hagita (2013). “Using Sound Reflections to Detect Moving Entities Out of the Field of View”, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2013), pp. 5201-5206, Nov., 2013.

18.    J. Even, N. Kallakuri, L. Y. Morales, C. Ishi, N. Hagita (2013). “Creation of Radiated Sound Intensity Maps Using Multi-Modal Measurements Onboard an Autonomous Mobile Platform”, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2013), pp. 3433-3438, Nov., 2013.

19.    Hatano, H., Kiso, M., and Ishi, C. (2013) “Analysis of factors involved in the choice of rising or non-rising intonation in question utterances appearing in conversational speech,” Proc. 14th Annual Conference of the International Speech Communication Association (Interspeech 2013), 2564-2568, August, 2013.

20.    Kallakuri, N., Even, J., Morales, Y., Ishi, C., Hagita, N. (2013) “Probabilistic Approach for Building Auditory Maps with a Mobile Microphone Array,” The 2013 IEEE International Conference on Robotics and Automation (ICRA 2013), pp. 2270-2275, May, 2013.

21.    Ishi, C., Liu, C., Ishiguro, H. and Hagita, N. (2012). “Evaluation of formant-based lip motion generation in tele-operated humanoid robots,” In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2012), Vilamoura, Algarve, Portugal, pp. 2377-2382, October, 2012.

22.    Ishi, C., Liu, C., Ishiguro, H. and Hagita, N. (2012). “Evaluation of a formant-based speech-driven lip motion generation,” In 13th Annual Conference of the International Speech Communication Association (Interspeech 2012), Portland, Oregon, pp. P1a.04, September, 2012.

23.    Ishi, C.T., Hatano, H., Hagita, N. (2012) “Extraction of paralinguistic information carried by mono-syllabic interjections in Japanese,” Proceedings of The 6th International Conference on Speech Prosody (Speech Prosody 2012), 681-684.

24.    Liu, C., Ishi, C., Ishiguro, H., Hagita, N. (2012) “Generation of nodding, head tilting and eye gazing for human-robot dialogue interaction,” Proceedings of 7th ACM/IEEE International Conference on Human-Robot Interaction (HRI2012), 285-292.

25.    Heracleous, P., Even, J., Ishi, C.T., Miyashita, T., Hagita, N. (2011). “Fusion of standard and alternative acoustic sensors for robust speech recognition,” Proc. ICASSP 2012.

26.    Ishi, C., Dong, L., Ishiguro, H., and Hagita, N. (2011). “The effects of microphone array processing on pitch extraction in real noisy environments,” Proc. of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2011), 550-555.

27.    Ishi, C., Liu, C., Ishiguro, H. and Hagita, N. (2011). “Speech-driven lip motion generation for tele-operated humanoid robots,” Proceedings of International Conference on Auditory-Visual Speech Processing (AVSP2011), 131-135.

28.    Ishi, C.T., Ishiguro, H., and Hagita, N. (2011). “Analysis of acoustic-prosodic features related to paralinguistic information carried by interjections in dialogue speech,” Proceedings of The 12th Annual Conference of the International Speech Communication Association (Interspeech’ 2011), 3133-3136.

29.    Ishi, C.T., Ishiguro, H., and Hagita, N. (2011). “Improved acoustic characterization of breathy and whispery voices,” Proceedings of The 12th Annual Conference of the International Speech Communication Association (Interspeech’ 2011), 2965-2968.

30.    Heracleous, P., Sato, M., Ishi, C.T., Ishiguro, H., Hagita, N. (2011). “Speech production in noisy environments and the effect on automatic speech recognition,” Proc. ICPhS 2011.

31.    Even, J., Heracleous, P., Ishi, C., Hagita, N. (2011) “Range based multi microphone array fusion for speaker activity detection in small meetings,” Proc. Interspeech2011, 2737-2740.

32.    Even, J., Heracleous, P., Ishi, C., Hagita, N. (2011) “Multi-modal front-end for speaker activity detection in small meetings,” Proc. IROS2011, 536-541.

33.    Ishi, C., Dong, L., Ishiguro, H., and Hagita, N. (2010). “Sound interval detection of multiple sources based on sound directivity,” Proc. of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2010), 1982-1987.

34.    Ishi, C., Sato, M., Lao, S., and Hagita, N. (2010). “Real-time audio-visual voice activity detection for speech recognition in noisy environments,” Proc. International Conference on Auditory-Visual Speech Processing (AVSP2010), 81-84.

35.    Heracleous, P., Sato, M., Ishi, C., and Hagita, N. (2010). “Investigating the role of the Lombard reflex in visual and audiovisual speech recognition,” Proc. International Conference on Auditory-Visual Speech Processing (AVSP2010), 69-72.

36.    Even, J., Ishi, C., Saruwatari, H., Hagita, N. (2010). “Close speaker cancellation for suppression of non-stationary background noise for hands-free speech interface” Proc. of The 11th Annual Conference of the International Speech Communication Association (Interspeech2010).

37.    Ishi, C., Ishiguro, H., and Hagita, N. (2010). “Acoustic, electroglottographic and paralinguistic analyses of “rikimi” in expressive speech,” Proceedings of Speech Prosody 2010 (SP2010), ID 100139, 1-4.

38.    Ishi, C.T., Liu, C., Ishiguro, H., and Hagita, N. (2010). “Head motion during dialogue speech and nod timing control in humanoid robots,” Proceedings of 5th ACM/IEEE International Conference on Human-Robot Interaction (HRI 2010), 293-300.

39.    Ishi, C.T., Chatot, O., Ishiguro, H., and Hagita, N. (2009). “Evaluation of a MUSIC-based real-time sound localization of multiple sound sources in real noisy environments,” Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2009), 2027-2032.

40.    Ishi, C.T., Ishiguro, H., and Hagita, N. (2008). “Analysis of inter- and intra-speaker variability of head motions during spoken dialogue,” Proceedings of the International Conference on Auditory-Visual Speech Processing 2008 (AVSP’ 2008), 37-42.

41.    Ishi, C.T., Ishiguro, H., and Hagita, N. (2008). “The meanings of interjections in spontaneous speech,” Proceedings of The 9th Annual Conference of the International Speech Communication Association (Interspeech’ 2008), 1208-1211.

42.    Ishi, C.T., Ishiguro, H., and Hagita, N. (2008). “The roles of breathy/whispery voice qualities in dialogue speech,” Proceedings of Speech Prosody 2008, 45-48.

43.    Ishi, C.T., Haas, J., Wilbers, F.P., Ishiguro, H., and Hagita, N. (2007). “Analysis of head motions and speech, and head motion control in an android,” Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2007), 548-553.

44.    Wilbers, F.P., Ishi, C.T., Ishiguro, H. (2007). “A blendshape model for mapping facial motions to an android,” Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2007), 542-547.

45.    Ishi, C.T., Ishiguro, H., and Hagita, N. (2007). “Analysis of head motions and speech in spoken dialogue,” Proceedings of The 8th Annual Conference of the International Speech Communication Association (Interspeech’ 2007), 670-673.

46.    Ishi, C.T., Ishiguro, H., and Hagita, N. (2007). “Acoustic analysis of pressed phonation,” Proceedings of International Conference on Phonetic Sciences (ICPhS’2007), 2057-2060.

47.    Ishi, C.T., Matsuda, S., Kanda, T., Jitsuhiro, T., Ishiguro, H., Nakamura, S., and Hagita, N. (2006). “Robust speech recognition system for communication robots in real environments,” Proceedings of 2006 IEEE-RAS International Conference on Humanoid Robots (Humanoids’06), 340-345.

48.    Ishi, C.T., Ishiguro, H., and Hagita, N. (2006). “Evaluation of prosodic and voice quality features on automatic extraction of paralinguistic information,” Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2006), 374-379.

49.    Ishi, C.T., Ishiguro, H., and Hagita, N. (2006). “Analysis of prosodic and linguistic cues of phrase finals for turn-taking and dialog acts,” Proceedings of The Ninth International Conference of Speech and Language Processing 2006 (Interspeech’2006 - ICSLP), 2006-2009.

50.    Ishi, C.T., Ishiguro, H., and Hagita, N. (2006). “Using Prosodic and Voice Quality Features for Paralinguistic Information Extraction,” CD-ROM Proceedings of The 3rd International Conference on Speech Prosody (SP2006).

51.    Ishi, C.T., Ishiguro, H., and Hagita, N. (2005). “Proposal of Acoustic Measures for Automatic Detection of Vocal Fry,” Proceedings of The 9th European Conference on Speech Communication and Technology (Interspeech’ 2005 - Eurospeech), 481-484.

52.    Ishi, C.T. (2004). “A New Acoustic Measure for Aspiration Noise Detection,” Proceedings of The 8th International Conference of Speech and Language Processing 2004 (ICSLP 2004), Vol. II, 941-944.

53.    Ishi, C.T. (2004). “Analysis of Autocorrelation-based parameters for Creaky Voice Detection,” Proceedings of The 2nd International Conference on Speech Prosody (SP2004), 643-646.

54.    Ishi, C.T., Mokhtari, P., and Campbell, N. (2003). “Perceptually-related acoustic-prosodic features of phrase finals in spontaneous speech,” Proceedings of The 8th European Conference on Speech Communication and Technology (Eurospeech' 03), 405-408.

55.    Mokhtari, P., Pfitzinger, H. R. and Ishi, C. T. (2003). “Principal components of glottal waveforms: towards parameterisation and manipulation of laryngeal voice-quality,” Proceedings of the ISCA Tutorial and Research Workshop on "Voice Quality: Functions, Analysis and Synthesis" (Voqual'03), 133-138.

56.    Ishi, C.T., Campbell, N. (2002). “Analysis of Acoustic-Prosodic Features of Spontaneous Expressive Speech,” Proceedings of 1st International Congress of Phonetics and Phonology, 19.

57.    Ishi, C.T., Hirose, K., Minematsu, N. (2002). “Using Perceptually-related F0- and Power-based Parameters to Identify Accent Types of Accentual Phrases,” Proceedings of 1st International Conference on Speech Prosody (SP2002), 407-410.

58.    Ishi, C.T., Minematsu, N., Hirose, K., Nishide R. (2001). “Identification of Accent and Intonation in sentences for CALL systems,” Proceedings of The 7th European Conference on Speech Communication and Technology (Eurospeech'01), 2455-2458.

59.    Ishi, C.T., Minematsu, N., Hirose, K. (2001). “Recognition of accent and intonation types of Japanese using F0 parameters related to human pitch perception,” Proceedings of ISCA Tutorial and Research Workshop on Prosody in Speech Recognition and Understanding, 71-76.

60.    Ishi, C.T., Minematsu, N., Hirose, K. (2001). “Investigation on perceived pitch and observed F0 features to represent Japanese pitch accent patterns,” Proceedings of International Conference of Speech Processing, 437-442.

61.    Ishi, C.T., Hirose, K. & Minematsu, N. (2000). “Identification of Japanese Double-Mora Phonemes Considering Speaking Rate for the Use in CALL Systems,” Proceedings of The 6th International Conference of Speech and Language Processing 2000 (ICSLP 2000), Vol. I, 786-789.

62.    Watanabe, M. & Ishi, C.T. (2000). “The distribution of fillers in lectures in the Japanese Language,” Proceedings of The 6th International Conference of Speech and Language Processing 2000 (ICSLP 2000), vol. III, 167-170.

63.    Ishi, C.T. & Hirose, K. (2000). “Influence of speaking rate on segmental duration and its formulation for the use in CALL systems,” Proceedings of Integrating Speech Technology In Language Learning 2000 (InSTiL 2000), 106-108.

64.    Kawai, G. & Ishi, C.T. (1999). “A system for learning the pronunciation of Japanese Pitch Accent,” Proceedings of The 6th European Conference on Speech Communication and Technology (Eurospeech’99), Vol.1, 177-181.

 

 

国内学会・研究会発表論文 (Non-refereed domestic conference and workshop papers; some co-authored papers omitted)

1.        石井カルロス寿憲, 港隆史, 石黒浩, "驚き発話に伴う表情および動作の分析", 日本音響学会2017年春季研究発表会, 343-344, Mar. 2017.

2.        石井カルロス寿憲,Jani Even,萩田紀博. "呼び込み音声の韻律特徴の分析", 日本音響学会2017年春季研究発表会, 315-316, Mar. 2017.

3.        劉超然, 石井カルロス寿憲, 石黒浩, "会話ロボットのための談話機能推定", 日本音響学会2017年春季研究発表会, 153-154, Mar. 2017.

4.        井上昂治, 三村正人, 石井カルロス寿憲, 坂井信輔, 河原達也. "DAEを用いたリアルタイム遠隔音声認識", 日本音響学会2017年春季研究発表会, 99-100, Mar. 2017.

5.        石井カルロス,劉超然,Jani Even (2016) “音環境知能技術を活用した聴覚支援システムの利用効果における予備的評価”,日本音響学会2016年春季研究発表会, 1469-1470, Mar. 2016.

6.        劉超然,石井カルロス,石黒浩 (2016) “言語・韻律情報を用いた話者交替推定の検討”,日本音響学会2016年春季研究発表会, 3-4, Mar. 2016.

7.        波多野博顕,石井カルロス,石黒浩 (2016) “対話相手の違いに応じた発話スタイルの変化:ジェミノイド対話の分析”,日本音響学会2016年春季研究発表会, 343-344, Mar. 2016.

8.        井上昂治, 三村正人, 石井カルロス寿憲, 河原達也 (2016) “自律型アンドロイドERICA のための遠隔音声認識”,日本音響学会2016年春季研究発表会, 1-2, Mar. 2016.

9.        石井カルロス寿憲, 劉超然, Jani Even (2015) “音環境知能技術を活用した聴覚支援システムのプロトタイプの開発”, 第43回人工知能学会AIチャレンジ研究会, Nov. 2015.

10.    境くりま, 港隆史, 石井カルロス寿憲, 石黒浩 (2015) “身体的拘束に基づく音声駆動体幹動作生成システム”, 第43回人工知能学会AIチャレンジ研究会, Nov. 2015.

11.    石井カルロス寿憲, 港隆史, 石黒浩 (2015) “笑い声に伴うアンドロイドロボットの動作生成の検討”, 第33回日本ロボット学会学術講演会.

12.    石井カルロス寿憲, 波多野博顕, 石黒浩 (2015) “笑いの種類と笑いに伴う表情および動作の分析”, 日本音響学会2015年秋季研究発表会.

13.    波多野博顕, 石井カルロス寿憲, 石黒浩 (2015) “相槌の「はい」における丁寧度と音響特徴の関係について”, 日本音響学会2015年秋季研究発表会.

14.    石井カルロス寿憲、Jani Even、萩田紀博 (2015) “音環境知能を利用した家庭内音の識別”,日本音響学会2015年春季研究発表会、Mar. 2015.

15.    波多野博顕, 石井カルロス寿憲, 多胡夏純 (2015) “発話指向性に応じた韻律スタイルの分析−小学校教師の教室発話を対象に−”,日本音響学会2015年春季研究発表会、Mar. 2015.

16.    劉超然,石井カルロス,石黒浩,萩田紀博 (2014) “臨場感の伝わる遠隔操作システムのデザイン:マイクロフォンアレイ処理を用いた音環境の再構築”, 第41回人工知能学会AIチャレンジ研究会, Nov. 2014.

17.    波多野博顕, 石井カルロス寿憲 (2014) “自然対話音声における感動詞先行型質問発話の韻律”, 日本音響学会2014年秋季研究発表会.

18.    石井カルロス寿憲、Jani Even、萩田紀博 (2014) “複数のマイクロホンアレイと人位置情報を組み合わせた音声アクティビティの記録システムの改善”, 第32回日本ロボット学会学術講演会.

19.    渡辺敦志, Jani Even, Luis Yoichi Morales, 石井Carlos 寿憲 (2014) “人間協調型移動ロボットによるコンクリート打音検査記録システム”, 第32回日本ロボット学会学術講演会.

20.    境くりま, 石井カルロス寿憲, 港隆史, 石黒浩 (2014) “発話者の音声に対応する動作生成と遠隔操作ロボットへの動作の付加効果”, 人工知能学会研究会第39回AIチャレンジ研究会 (SIG-Challenge-B303), 7-13, Mar. 2014.

21.    石井カルロス寿憲,波多野博顕,萩田紀博, (2014) "小学校理科室における笑いイベントの分析", 日本音響学会2014年春季研究発表会, 263-264.

22.    石井カルロス寿憲, Jani Even, 塩見昌裕, 萩田紀博, (2013) "複数のマイクロホンアレイを用いた理科室における音源アクティビティの分析", 人工知能学会研究会第38回AIチャレンジ研究会 (SIG-Challenge-B302), 28-33.

23.    石井カルロス寿憲, Jani Even, 塩見昌裕, 小泉智史, 萩田紀博, (2013) "複数のマイクロホンアレイによる音源アクティビティ:小学校理科室におけるデータ分析", 第31回日本ロボット学会学術講演会, RSJ2013AC2D2-01.

24.    石井カルロス寿憲,波多野博顕,萩田紀博,(2013) "「うんうん」と「うーん」の識別における音響特徴の分析", 日本音響学会2013年秋季研究発表会, 265-266.

25.    石井カルロス寿憲、Jani Even、萩田紀博 (2013) “反射音を利用した音源定位と音源の指向性の分析”, 日本音響学会2013年春季研究発表会, Mar. 2013, 887-888.

26.    波多野博顕、新井潤、石井カルロス寿憲 (2013) “自然対話における質問音調の選択に関わる要因の分析”, 日本音響学会2013年春季研究発表会, Mar. 2013, 429-430.

27.    石井カルロス寿憲、Jani Even、萩田紀博 (2012) “複数のマイクロホンアレイおよび空間情報と反射音を利用した音源定位の検討”,人工知能学会AIチャレンジ研究会、Nov. 2012, 64-69.

28.  Ishi, C.T., Ishiguro, H., Hagita, N. (2012) "Types of labels related to the transmission of paralinguistic information in natural dialogue speech," 2012 Autumn Meeting of the Acoustical Society of Japan, 267-268, Sep. 2012.

29.  Hatano, H., Arai, J., Ishi, C.T. (2012) "Toward the construction of speech act labels for natural dialogue speech," 2012 Autumn Meeting of the Acoustical Society of Japan, 265-266, Sep. 2012.

30.  Ishi, C.T., Liu, C., Ishiguro, H., Hagita, N. (2012) "An attempt at lip motion generation based on formants," 2012 Spring Meeting of the Acoustical Society of Japan, 373-374, Mar. 2012.

31.  Ishi, C.T., Ishiguro, H., Hagita, N. (2011) "Real-time sound source localization in 3D space using microphone arrays (demo)," JSAI SIG on AI Challenges, Dec. 2011.

32.  Ishi, C.T., Liu, C., Ishiguro, H., Hagita, N. (2011) "Tele-operating the lip motion of humanoid robots from the operator's voice," 29th Annual Conference of the Robotics Society of Japan, RSJ2011AC1J3-6, Sep. 2011.

33.  Takahashi, T., Nakadai, K., Ishi, C.T., Even, J., Okuno, H.G. (2011) "A study on sound source localization and sound source detection in real environments," 29th Annual Conference of the Robotics Society of Japan, RSJ2011AC1F3-3.

34.  Ishi, C.T., Arai, J., Hagita, N. (2011) "Analysis of voice quality changes within utterances in natural dialogue speech," 2011 Autumn Meeting of the Acoustical Society of Japan, 269-270.

35.  Ishi, C.T., Ishiguro, H., Hagita, N. (2011) "Improvement of the acoustic representation of breathy phonation," 2011 Spring Meeting of the Acoustical Society of Japan, 269-270.

36.  Ishi, C.T., Liang, D., Ishiguro, H., Hagita, N. (2010) "Considerations on pitch extraction in real robot environments," JSAI SIG on AI Challenges (SIG-Challenge-10), 36-40.

37.  Ishi, C.T., Sato, M., Akimoto, T., Hagita, N. (2010) "A study on speech recognition modules for communication intelligence," Annual Conference of the Robotics Society of Japan, RSJ2010AC3P3-6.

38.  Ishi, C.T., Arai, J., Hagita, N. (2010) "An attempt at recognizing the utterance intentions of interjections appearing in dialogue speech," 2010 Autumn Meeting of the Acoustical Society of Japan, Vol. I, 251-252.

39.  Ishi, C.T., Liang, D., Ishiguro, H., Hagita, N. (2010) "A study on speech interval detection of multiple sound sources using sound directivity," 2010 Spring Meeting of the Acoustical Society of Japan, Vol. I, 731-734.

40.  Ishi, C.T., Liang, D., Ishiguro, H., Hagita, N. (2009) "A study on speech interval detection of multiple sound sources using MUSIC spatial spectrograms," 30th JSAI SIG on AI Challenges (SIG-Challenge-09), 8-13.

41.  Ishi, C.T., Ishiguro, H., Hagita, N. (2009) "Analysis of paralinguistic information conveyed by voice quality changes," 2009 Autumn Meeting of the Acoustical Society of Japan, Vol. I, 475-476.

42.  Ishi, C.T., Ishiguro, H., Hagita, N. (2009) "Analysis of acoustic parameters related to voice quality," 2009 Autumn Meeting of the Acoustical Society of Japan, Vol. I, 327-328.

43.  Ishi, C.T., Chatot, O., Ishiguro, H., Hagita, N. (2009) "Evaluation of 3D sound source localization in real environments using the MUSIC method," 28th JSAI SIG on AI Challenges (SIG-Challenge-08).

44.  Ishi, C.T., Chatot, O., Ishiguro, H., Hagita, N. (2009) "Evaluation of 3D sound source direction estimation in real environments and of its real-time performance," 2009 Spring Meeting of the Acoustical Society of Japan, Vol. I, 699-702.

45.  Ishi, C.T., Ishiguro, H., Hagita, N. (2008) "Analysis of speaking styles and functions of interjections appearing in spontaneous speech," 2008 Autumn Meeting of the Acoustical Society of Japan, Vol. I, 269-270.

46.  Ishi, C.T., Ishiguro, H., Hagita, N. (2008) "Acoustic features of breathy/whispery phonation and their roles in speech communication," IEICE Technical Report, Vol. 108, No. 116, 127-132.

47.  Ishi, C.T., Ishiguro, H., Hagita, N. (2008) "The roles of breathy/whispery phonation in speech communication," 2008 Spring Meeting of the Acoustical Society of Japan, Vol. I, 357-358.

48.  Ishi, C.T., Ishiguro, H., Hagita, N. (2007) "Analysis of head motions related to speech and head motion control of an android robot," 26th JSAI SIG on AI Challenges (SIG-Challenge-07), 46-51.

49.  Ishi, C.T., Ishiguro, H., Hagita, N. (2007) "Analysis of head motions accompanying speech utterances," 2007 Autumn Meeting of the Acoustical Society of Japan, Vol. I, 109-110.

50.  Ishi, C.T., Ishiguro, H., Hagita, N. (2007) "Acoustic analysis of 'rikimi' (pressed) phonation using EGG," 2007 Spring Meeting of the Acoustical Society of Japan, Vol. I, 221-222.

51.  Ishi, C.T., Ishiguro, H., Hagita, N. (2006) "Acoustic analysis of pressed voice," Fourth Joint Meeting of the ASA and ASJ, J. Acoust. Soc. Am., Vol. 120, No. 5, Pt. 2, p. 3374, Nov. 2006.

52.  Ishi, C.T., Matsuda, S., Kanda, T., Jitsuhiro, T., Ishiguro, H., Nakamura, S., Hagita, N. (2006) "Evaluation of a communication robot's speech recognition system in real environments," 24th JSAI SIG on AI Challenges (SIG-Challenge-06), 23-28.

53.  Ishi, C.T., Ishiguro, H., Hagita, N. (2006) "Acoustic analysis for automatic detection of 'rikimi' (pressed voice)," IEICE Technical Report, Vol. 106, No. 178, 1-6.

54.  Ishi, C.T., Ishiguro, H., Hagita, N. (2006) "Analysis of acoustic features of laryngeally pressed phonation," 2006 Spring Meeting of the Acoustical Society of Japan, Vol. I, 227-228.

55.  Ishi, C.T., Ishiguro, H., Hagita, N. (2005) "A study on the extraction of paralinguistic information using prosodic and voice quality features in dialogue speech," 22nd JSAI SIG on AI Challenges (SIG-Challenge-05), 71-76.

56.  Ishi, C.T., Ishiguro, H., Hagita, N. (2005) "A study on the extraction of paralinguistic information using acoustic parameters related to prosody and voice quality," 2005 Autumn Meeting of the Acoustical Society of Japan, 233-234.

57.  Ishi, C.T. (2004) "A study of acoustic parameters related to breathiness in vowel segments," 2004 Autumn Meeting of the Acoustical Society of Japan, Vol. I, 295-296.

58.  Ishi, C.T., Campbell, N. (2004) "Functional roles of phrase finals," 2004 Spring Meeting of the Acoustical Society of Japan, Vol. I, 235-236.

59.  Mokhtari, P., Pfitzinger, H.R., Ishi, C.T., Campbell, N. (2004) "Laryngeal voice quality conversion by glottal waveshape PCA," Proceedings of the Spring 2004 Meeting of the Acoustical Society of Japan, Atsugi, Japan, Paper 2-P-6, 341-342.

60.  Ishi, C.T. (2003) "Analysis of acoustic characteristics of creaky phonation," 2003 Autumn Meeting of the Acoustical Society of Japan, Vol. I, 235-236.

61.    Ishi, C.T., Campbell, N. (2003). “Acoustic-Prosodic Analysis of Phrase Finals in Expressive Speech,” Proceedings of The 1st JST/CREST International Workshop on Expressive Speech Processing, 85-88.

62.  Ishi, C.T., Campbell, N. (2003) "Analysis of acoustic-prosodic features of phrase finals in daily conversation," 2003 Spring Meeting of the Acoustical Society of Japan, Vol. I, 311-312.

63.  Ishi, C.T., Campbell, N. (2002) "Analysis of prosodic characteristics of expressive speaking styles," 2002 Autumn Meeting of the Acoustical Society of Japan, Vol. I, 275-276.

64.    Ishi, C.T., Hirose, K., Minematsu, N. (2002). “Investigations on a quantified representation of pitch movements in syllable units,” Proceedings of The 2002 Spring Meeting of the Acoustical Society of Japan, Vol. I, 419-420.

65.  Ishi, C.T., Minematsu, N., Hirose, K. (2001) "Automatic extraction of mora pitch corresponding to pitch perception," Proceedings of the Annual Convention of the Phonetic Society of Japan, 13-18.

66.  Ishi, C.T., Minematsu, N., Hirose, K. (2001) "Automatic extraction of mora pitch corresponding to pitch perception in Japanese accent and intonation," 2001 Autumn Meeting of the Acoustical Society of Japan, Vol. I, 445-446.

67.  Ishi, C.T., Minematsu, N., Hirose, K. (2001) "Accent type determination of continuous Japanese speech considering pitch perception," IEICE Technical Report, Vol. 101, No. 270, 23-30.

68.    Ishi, C.T., Minematsu, N., Hirose, K. (2001). “Relationship between acoustically observed F0 and perceived pitch for Japanese accent and intonation,” Technical Report of Institute of Electronics, Information and Communication Engineers, SP2001-41, 17-22.

69.  Ishi, C.T., Hirose, K., Minematsu, N. (2001) "Automatic classification of intonation for a pronunciation training system," 2001 Spring Meeting of the Acoustical Society of Japan, Vol. I, 327-328.

70.  Nishide, R., Ishi, C.T., Minematsu, N., Hirose, K. (2001) "A study on the construction of a pronunciation training system for Japanese accent," 2001 Spring Meeting of the Acoustical Society of Japan, Vol. I, 269-270.

71.  Ishi, C.T., Nishide, R., Minematsu, N., Hirose, K. (2001) "A study on the construction of a pronunciation training system for Japanese accent and intonation," IEICE Technical Report, Vol. 100, No. 594, 33-40.

72.  Ishi, C.T., Hirose, K., Minematsu, N. (2000) "Considerations on Japanese mora timing from the viewpoint of isochrony," 2000 Autumn Meeting of the Acoustical Society of Japan, Vol. I, 199-200.

73.  Ishi, C.T., Fujimoto, K., Hirose, K. (2000) "Discrimination of Japanese special morae (tokushuhaku) taking speaking rate into account," IEICE Technical Report, Vol. 100, No. 97, 17-24.

74.  Ishi, C.T., Hirose, K., Minematsu, N. (2000) "Analysis of segmental duration changes according to speaking rate," 2000 Spring Meeting of the Acoustical Society of Japan, Vol. I, 235-236.

75.  Ishi, C.T., Kawai, G., Hirose, K. (1999) "A pronunciation learning system for pitch accent types of Japanese words," 1999 Spring Meeting of the Acoustical Society of Japan, Vol. I, 245-246.