Songyao Jiang

I am a PhD candidate at Northeastern University, where I work on computer vision and machine learning in SmileLab, advised by Dr. Yun (Raymond) Fu.

I am an early member of Giaran, Inc., an AI beauty startup that was acquired by Shiseido Americas in Nov. 2017 (Here's the News).

I received my master's degree from the University of Michigan and my bachelor's degree from The Hong Kong Polytechnic University.

I am also a skilled astronomy and landscape photographer, and here is my Little Gallery.

Email  /  Google Scholar  /  GitHub  /  LinkedIn


I'm interested in computer vision, machine learning, image processing, and computational photography. Much of my research is about human faces and gestures. My current research includes whole-body pose estimation and skeleton-based action recognition.

Skeleton Aware Multi-modal Sign Language Recognition (SAM-SLR)
Songyao Jiang, Bin Sun, Lichen Wang, Yue Bai, Kunpeng Li, and Yun Fu
CVPR21 Challenge on Large Scale Signer Independent Isolated SLR, 2021
Paper / GitHub / Ranked 1st in both RGB / RGBD Tracks

We participated in the CVPR 2021 Challenge on Large Scale Signer Independent Isolated SLR. We proposed a skeleton-aware multi-modal sign language recognition framework (SAM-SLR) that captures information from multiple modalities and ensembles them to further boost performance. Our team ranked 1st in both the RGB and RGB-D tracks of the challenge. Our paper has been accepted to the corresponding workshop at CVPR 2021.

Geometrically Editable Face Image Translation With Adversarial Networks
Songyao Jiang, Zhiqiang Tao, and Yun Fu
IEEE Transactions on Image Processing (TIP), 2021
Paper / GitHub

Existing image-to-image translation methods mainly rely on deep generative models and focus on exploring the bi-directional or multi-directional relationships between specific domains categorized by attribute-level or class-level labels. As a result, existing methods are incapable of editing geometric contents during translation. To address this limitation, we formulate the image translation problem as multi-domain mappings in both geometric and attribute directions and propose a novel Geometrically Editable Generative Adversarial Networks (GEGAN) model to learn such mappings for geometrically editable translation.

SuperFront: From Low-resolution to High-resolution Frontal Face Synthesis
Yu Yin, Joseph P. Robinson, Songyao Jiang, Yue Bai, Can Qin and Yun Fu
Preprint, 2021

Advances in face rotation, along with other face-based generative tasks, have become more frequent as deep learning progresses. Even as impressive milestones are achieved in synthesizing faces, preserving identity remains important in practice and should not be overlooked. Likewise, the difficulty should not increase for data with obscured faces, heavier poses, and lower quality. Existing methods tend to focus on samples with variation in pose, but assume that the data is high in quality. We propose a generative adversarial network (GAN)-based model to generate high-quality, identity-preserving frontal faces from one or multiple low-resolution (LR) faces with extreme poses.

Dual-Attention GAN for Large-Pose Face Frontalization
Yu Yin, Songyao Jiang, Joseph P. Robinson, and Yun Fu
IEEE International Conference on Automatic Face & Gesture Recognition (FG), 2020
Paper / GitHub

Face frontalization provides an effective and efficient way to augment face data and further improves face recognition performance in extreme-pose scenarios. Despite recent advances in deep learning-based face synthesis approaches, this problem is still challenging due to significant pose and illumination discrepancies. In this paper, we present a novel Dual-Attention Generative Adversarial Network (DA-GAN) for photo-realistic face frontalization by capturing both contextual dependencies and local consistency during GAN training.

Machine Learning-aided Quantification of Antibody-based Cancer Immunotherapy by Natural Killer Cells in Microfluidic Droplets
Saheli Sarkar, Wenjing Kang, Songyao Jiang, Kunpeng Li, Somak Ray, Ed Luther, Alexander R Ivanov, Yun Fu, and Tania Konry
Lab on a Chip, 2020

Natural killer (NK) cells have emerged as an effective alternative option to T cell-based immunotherapies, particularly against liquid (hematologic) tumors. However, the effectiveness of NK cell therapy has been less than optimal for solid tumors, partly due to the heterogeneity in target interaction leading to variable anti-tumor cytotoxicity. This paper describes a microfluidic droplet-based cytotoxicity assay for quantitative comparison of immunotherapeutic NK-92 cell interaction with various types of target cells. Machine learning algorithms were developed to assess the dynamics of individual effector-target cell pair conjugation and target death in droplets in a semi-automated manner.

Video-based Multi-person Pose Estimation and Tracking
Songyao Jiang and Yun Fu
Patent Application, 2020

Our invention aims to tackle the video-based multi-person pose estimation problem using a deep learning framework with multi-frame refinement and optimization. In particular, our method inherently tracks the estimated poses and makes the model insensitive to occlusions. Moreover, we introduce a backward reconstruction loop and a temporal consistency term into the objective function, which mitigate inter-frame inconsistency and significantly reduce the shaking and vibration of the estimated pose skeletons in video pose estimation.

Face Recognition and Verification in Low-light Condition
Songyao Jiang, Yue Wu, Zhengming Ding, and Yun Fu

This project aims to solve the problem of recognizing people in low-light conditions, which is useful in security applications. In low-light conditions, we usually use near-IR, mid-range IR, and long-range IR imaging to obtain portrait images of the target person. However, those IR images are very different from the visible images used to train our deep face recognition and verification methods. To apply the knowledge learned from visible images to IR images, we developed semi-supervised and unsupervised transfer learning methods that transfer knowledge from the visible spectrum to the IR spectrum. Based on these methods, we developed our low-light face recognition and verification system.

Spatially Constrained Generative Adversarial Networks for Conditional Image Generation
Songyao Jiang, Hongfu Liu, Yue Wu and Yun Fu
Paper / GitHub

Image generation has attracted tremendous attention in both academia and industry, especially for criminal portraits and fashion design. Current studies usually focus on class labels as the condition, where spatial contents are randomly generated. Edge details and spatial information are usually blurred and difficult to preserve. In light of this, we propose a novel Spatially Constrained Generative Adversarial Network (SCGAN), which decouples the spatial constraints from the latent vector and makes them available as additional controllable signals. Experimentally, we provide both visual and quantitative results, and demonstrate that the proposed SCGAN is very effective in controlling the spatial contents as well as generating high-quality images.

Segmentation Guided Image-to-Image Translation with Adversarial Networks
Songyao Jiang, Zhiqiang Tao and Yun Fu
IEEE International Conference on Automatic Face & Gesture Recognition (FG), 2019
Paper / GitHub / ArXiv

Recent image-to-image translation methods neglect to utilize higher-level and instance-specific information to guide the training process, leading to many unrealistic, low-quality generated images. Existing methods also lack spatial controllability during translation. To address these challenges, we propose a novel Segmentation Guided Generative Adversarial Networks model, which leverages semantic segmentation to further boost generation performance and provide spatial mapping. Experimental results on a multi-domain face image translation task empirically demonstrate its ability to perform spatial modification and its superiority in image quality over several state-of-the-art methods.

Rule-Based Facial Makeup Recommendation System
Taleb Alashkar, Songyao Jiang and Yun Fu
IEEE International Conference on Automatic Face & Gesture Recognition (FG), 2017
Paper / GitHub

Facial makeup style plays a key role in facial appearance, making it more beautiful and attractive. Choosing the best makeup style for a certain face to fit a certain occasion is an art in itself. To solve this problem computationally, an automatic and smart facial makeup recommendation and synthesis system is proposed in this paper. Additionally, an automatic facial makeup synthesis system is developed to apply the recommended style to the facial image. To this end, a new dataset with 961 different female photos was collected and labeled.

Examples-Rules Guided Deep Neural Network for Makeup Recommendation
Taleb Alashkar, Songyao Jiang, Shuyang Wang and Yun Fu
AAAI Conference on Artificial Intelligence (AAAI), 2017
Paper / GitHub

We consider a fully automatic makeup recommendation system and propose a novel examples-rules guided deep neural network approach. The framework consists of three stages. First, makeup-related facial traits are classified into structured coding. Second, these facial traits are fed into an examples-rules guided deep neural recommendation model, which makes joint use of pairwise Before-After images and makeup artist knowledge. Finally, to visualize the recommended makeup style, an automatic makeup synthesis system is developed as well.

This website is generated using source code from Jon Barron.