"Science is an error-correcting process." β Charles S. Peirce
I'm a doctoral researcher at Tampere University, specializing in machine learning for audio understanding. I'm passionate about teaching machines to hear, interpret, and respond to sound as humans do.
Before entering research, I spent over seven years as a software engineer, which gave me a strong foundation in building scalable systems and solving real-world problems. I now work at the intersection of ML research, software engineering, and AI-driven audio applications, combining scientific depth with hands-on development skills.
- Machine Learning for Audio Understanding (classification, detection, retrieval, generation)
- Self-Supervised Representation Learning
- Multimodal Learning (audio + text/vision)
- Low-Resource Learning (zero-shot, few-shot)
- Programming: Python, Java, Scala, JavaScript, SQL, C/C++, R, MATLAB, LaTeX, GDScript
- Machine Learning: PyTorch, TensorFlow, scikit-learn, Ray Tune, MLflow, Spark
- Audio & NLP: librosa, torchaudio, NLTK
- Data Analysis: NumPy, SciPy, Pandas, Jupyter, Matplotlib
- Web, Backend & Architectures: Java EE, Spring, Hibernate, Django, Flask, RESTful APIs, Microservices, Event-Driven Architectures
- GUI & Game Development: PySide6, Godot Engine
- Databases & DevOps: MySQL, PostgreSQL, Linux, Docker, Git
- Concurrency & Systems: Multi-threaded Programming, HPC
- Multimodal Audio-Text Retrieval System: developing models that match audio clips with text queries using multimodal learning.
- X-GoBot [WIP]: developing a voice-enabled desktop AI assistant with local processing and contextual awareness.
I'm always open to conversations about audio ML, applied AI, or building smart, sound-aware systems. Whether you're in research, industry, or tinkering with side projects, feel free to reach out!
Email: huang.xie@outlook.com | LinkedIn: linkedin.com/in/huang-xie-28b7872bb