Muhammad Umar Farooq

I have been working as an Applied Research Scientist at Emotech Ltd. since July 2024.

In July 2024, I graduated with a PhD from the Department of Computer Science at the University of Sheffield (UK), which I joined in 2021. During my PhD, my research focused on low-resource and multilingual speech recognition at the LivePerson Centre for Speech and Language Technologies (SLT) under the supervision of Prof. Thomas Hain. As part of the LivePerson Centre, I was also associated with the Speech and Hearing (SPANDH) and Machine Intelligence for Natural Interfaces (MINI) research groups. My PhD at the University of Sheffield was fun, with some amazing colleagues and a healthy research environment. I enjoyed my core research work and also had the chance to collaborate with my colleagues on other projects. Our publications are on the Publications page.

Alongside my PhD, I developed a short course on ‘Unsupervised Machine Learning’ for non-experts as a ‘Training Assistant’ at the University of Sheffield, and delivered sessions on Python programming to non-experts (faculty members and PhD students from social science departments). I also worked as a Graduate Teaching Assistant (GTA) at the University of Sheffield, assisting module leaders on multiple modules including ‘Machine Learning and Adaptive Intelligence’ and ‘Data Science with Python’. Please visit the Teaching page for details.

Before enrolling for my PhD, I worked as a pre-doc research intern at the Institute of Formal and Applied Linguistics (UFAL), Charles University, Prague. I primarily worked on speech-related components of the EU Horizon 2020 project European Live Translator (ELITR), supervised by Dr. Ondrej Bojar.

Prior to that, I worked as a speech processing Research Officer at the Center for Language Engineering (CLE), University of Engineering and Technology (UET), Lahore. I joined CLE during my senior year as a speech scientist to develop low-resource language technologies under the supervision of Dr. Sarmad Hussain. I received my M.Sc. and B.Sc. degrees in Electrical Engineering from UET, Lahore in 2019 and 2017, respectively.

Selected Publications

Progressive Unsupervised Domain Adaptation for ASR Using Ensemble Models and Multi-stage Training

Rehan Ahmad, Muhammad Umar Farooq, and Thomas Hain

In IEEE ICASSP, 2024

@inproceedings{rehan24_icassp,
  author = {Ahmad, Rehan and Farooq, Muhammad Umar and Hain, Thomas},
  title = {{Progressive Unsupervised Domain Adaptation for ASR Using Ensemble Models and Multi-stage Training}},
  year = {2024},
  booktitle = {IEEE ICASSP},
}

MUST: A Multilingual Student-Teacher Learning Approach for Low-Resource Speech Recognition

Muhammad Umar Farooq, Rehan Ahmad, and Thomas Hain

In IEEE ASRU, 2023

@inproceedings{farooq23_asru,
  author = {Farooq, Muhammad Umar and Ahmad, Rehan and Hain, Thomas},
  title = {{MUST: A Multilingual Student-Teacher Learning Approach for Low-Resource Speech Recognition}},
  year = {2023},
  booktitle = {IEEE ASRU},
}

Learning Cross-lingual Mappings for Data Augmentation to Improve Low-Resource Speech Recognition

Muhammad Umar Farooq and Thomas Hain

In Proc. INTERSPEECH, 2023

@inproceedings{farooq23_interspeech,
  author = {Farooq, Muhammad Umar and Hain, Thomas},
  title = {{Learning Cross-lingual Mappings for Data Augmentation to Improve Low-Resource Speech Recognition}},
  year = {2023},
  booktitle = {Proc. INTERSPEECH},
  pages = {5072--5076},
  doi = {10.21437/Interspeech.2023-1613},
}

Towards Domain Generalisation in ASR with Elitist Sampling and Ensemble Knowledge Distillation

Rehan Ahmad, Md Asif Jalal, Muhammad Umar Farooq, Anna Ollerenshaw, and Thomas Hain

In IEEE ICASSP, 2023

@inproceedings{10095746,
  author = {Ahmad, Rehan and Jalal, Md Asif and Farooq, Muhammad Umar and Ollerenshaw, Anna and Hain, Thomas},
  title = {{Towards Domain Generalisation in ASR with Elitist Sampling and Ensemble Knowledge Distillation}},
  year = {2023},
  booktitle = {IEEE ICASSP},
  pages = {1--5},
  doi = {10.1109/ICASSP49357.2023.10095746},
}

Investigating the Impact of Crosslingual Acoustic-Phonetic Similarities on Multilingual Speech Recognition

Muhammad Umar Farooq and Thomas Hain

In Proc. Interspeech, 2022

@inproceedings{farooq22_interspeech,
  author = {Farooq, Muhammad Umar and Hain, Thomas},
  title = {{Investigating the Impact of Crosslingual Acoustic-Phonetic Similarities on Multilingual Speech Recognition}},
  year = {2022},
  booktitle = {Proc. Interspeech},
  pages = {3849--3853},
  doi = {10.21437/Interspeech.2022-10916},
}

Non-Linear Pairwise Language Mappings for Low-Resource Multilingual Acoustic Model Fusion

Muhammad Umar Farooq, Darshan Adiga Haniya Narayana, and Thomas Hain

In Proc. Interspeech, 2022

@inproceedings{farooq22b_interspeech,
  author = {Farooq, Muhammad Umar and Narayana, Darshan Adiga Haniya and Hain, Thomas},
  title = {{Non-Linear Pairwise Language Mappings for Low-Resource Multilingual Acoustic Model Fusion}},
  year = {2022},
  booktitle = {Proc. Interspeech},
  pages = {4850--4854},
  doi = {10.21437/Interspeech.2022-11449},
}

Improving Large Vocabulary Urdu Speech Recognition System Using Deep Neural Networks

Muhammad Umar Farooq, Farah Adeeba, Sahar Rauf, and Sarmad Hussain

In Proc. Interspeech, 2019

@inproceedings{farooq19_interspeech,
  author = {Farooq, Muhammad Umar and Adeeba, Farah and Rauf, Sahar and Hussain, Sarmad},
  title = {{Improving Large Vocabulary Urdu Speech Recognition System Using Deep Neural Networks}},
  year = {2019},
  booktitle = {Proc. Interspeech 2019},
  pages = {2978--2982},
  doi = {10.21437/Interspeech.2019-2629},
}