I received my PhD in Computer Science from Arizona State University in April 2013. My PhD advisor was Dr. Sethuraman Panchanathan (Panch). My research interests include computer vision, machine learning, and assistive technology.


1. Active Learning for Pattern Classification

The rapid escalation of technology and the widespread availability of modern technological equipment have resulted in the generation of large quantities of digital data. This has expanded the possibilities of solving real-world problems using computational learning frameworks. However, while gathering large quantities of unlabeled data is cheap and easy, annotating the data with class labels entails significant human labor. The objective of this project is to develop a batch mode active learning scheme that selects batches of informative samples (for manual annotation) from large quantities of unlabeled data. This dramatically reduces the labeling cost and also exposes the learner to the exemplar instances in the unlabeled set. Our current application is in the domain of image recognition. Due to the high frame rate of modern video cameras, the captured images exhibit high redundancy; active learning is therefore of paramount importance in identifying the promising instances in such a redundant set.
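As a minimal sketch of the general idea, one common batch-selection heuristic is uncertainty sampling by predictive entropy: rank unlabeled samples by how uncertain the current model is about them, and send the top few for manual annotation. The function and variable names below are illustrative assumptions, not the specific criterion developed in this project.

```python
import math

def entropy(probs):
    """Shannon entropy of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_batch(pool_probs, k):
    """Pick the k most uncertain unlabeled samples for annotation.

    pool_probs: {sample_id: [class probabilities]} from the current model.
    Returns the ids of the k samples with the highest predictive entropy.
    """
    ranked = sorted(pool_probs, key=lambda i: entropy(pool_probs[i]), reverse=True)
    return ranked[:k]

# Toy pool: sample "b" is the most ambiguous, "c" the most confident.
pool = {"a": [0.8, 0.1, 0.1], "b": [0.34, 0.33, 0.33], "c": [0.98, 0.01, 0.01]}
print(select_batch(pool, 2))  # -> ['b', 'a']
```

A full batch-mode method would typically also penalize redundancy within the selected batch, so that the k chosen samples are informative and diverse rather than merely uncertain.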

Supervisor: Dr. Sethuraman Panchanathan, Department of Computer Science and Engineering, Arizona State University.

2. Video Summarization

The advent of inexpensive modern video cameras, together with social networking and video sharing websites (like YouTube, Flickr, and Facebook), has led to an unprecedented increase in the generation of digital video data. To cope with this enormous volume of data, video summarization algorithms have been developed, which automatically identify the exemplar samples in large amounts of redundant video data. The objective of this project is to develop novel algorithms addressing three different facets of video summarization: (i) adaptive, where the summary length (number of summary frames) is decided adaptively based on the complexity of the video stream in question; (ii) distributed, where the entire video is not present on a single machine but is distributed across multiple sites; and (iii) online, where the summary is generated on the fly as the video frames arrive sequentially over time.
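The online and adaptive facets can be illustrated with a simple greedy baseline: a new frame joins the summary only if it is sufficiently dissimilar from every frame already selected, so the summary length grows with the video's complexity. This is a toy sketch under assumed names (`summarize_online`, a user-supplied `dist`), not the project's actual algorithms.

```python
def summarize_online(frames, dist, threshold):
    """Greedy online summarization: a new frame joins the summary only if
    it differs enough from every frame already selected. The summary length
    adapts to how varied the stream is."""
    summary = []
    for f in frames:
        if all(dist(f, s) > threshold for s in summary):
            summary.append(f)
    return summary

# Toy 1-D "frames": absolute difference stands in for visual dissimilarity.
frames = [0.0, 0.1, 0.2, 2.0, 2.1, 5.0]
print(summarize_online(frames, lambda a, b: abs(a - b), 1.0))  # -> [0.0, 2.0, 5.0]
```

In a real system, `dist` would compare visual features (e.g., color histograms or deep features) rather than raw scalars, and a distributed variant would have to reconcile summaries computed independently at multiple sites.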

Supervisor: Dr. Ravishankar Iyer, Intel Labs

3. Deep Learning for Multi-modal Emotion Recognition 

Deep learning algorithms learn a highly discriminative set of features for a given machine learning task and have demonstrated commendable performance in a variety of applications. One of their major successes has been in computer vision, where they have achieved state-of-the-art performance in object recognition, image segmentation, and activity recognition, among others. As a part of this project, we recently created the emoFBVP dataset, containing video recordings of 10 actors enacting 23 different emotions, together with their facial expressions, body mannerisms, vocal cues, and physiological signals. We are exploring the use of deep neural networks to recognize emotions from multi-modal cues. We are also studying the effect of transferring emotion-rich features between a source and a target network, in terms of classification accuracy and training time, in a multi-modal setting.

Supervisor: Dr. Sethuraman Panchanathan, Department of Computer Science and Engineering, Arizona State University.

4. Wearable Social Interaction Assistant for the Blind

Effective social interaction is an essential ingredient of healthy living. Studies have shown that about 65% of the information during social interaction is conveyed through non-verbal cues (like eye gaze and body mannerisms), and therefore individuals who are visually challenged face a fundamental limitation in their everyday interactions. The goal of this project is to provide blind individuals with information that can enrich their interaction with their sighted counterparts. The system consists of a camera mounted on the nose-bridge of a pair of glasses, which the user wears. The incoming video stream is analyzed, and the extracted information is delivered to the user to enable them to better understand and interpret their surroundings. This project involves addressing several challenging computer vision problems, like face recognition, emotion recognition, and vision-based object counting, among others.

Supervisor: Dr. Sethuraman Panchanathan, Department of Computer Science and Engineering, Arizona State University.

5. Smart Stadium for Smart Living

Rapid urbanization has led to more people residing in cities than ever before in human history, and projections estimate that 64% of the global population will be urban by 2050. Cities are beginning to explore Smart City initiatives to reduce expenses and complexities while increasing efficiency and quality of life for their citizens. To achieve this goal, advances in technology and policy are needed, together with a rethinking of traditional solutions to transportation, safety, sustainability, security, and citizen engagement, among other priority areas. This project uses the Smart Stadium as a ‘living laboratory’ to identify, deploy, and test Internet of Things technologies and Smart City solutions in an environment small enough to trial practically but large enough to evaluate effectiveness and scalability. The Smart Stadium for Smart Living initiative brings together Arizona State University, Dublin City University, Intel Corporation, the Gaelic Athletic Association, Sun Devil Stadium, and Croke Park to explore smart environment solutions. The objective of this project is to enrich the fan experience by providing access to wait times at restrooms and concession stands via a mobile app. Such a technology will allow fans to spend more time watching and enjoying a game rather than waiting in long lines. We adopt a computer vision based approach to count the number of people in a queue. We assume the presence of cameras at strategic locations in the vicinity of restrooms and concession stands; the video feed from these cameras is analyzed to accurately estimate the count of people in the queues. Once the count is obtained, wait times can be computed from the average service time per person.
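The last step, converting people counts into wait times, can be sketched as follows. Per-frame counts from a vision pipeline are noisy, so a practical system would smooth them before multiplying by the average service time. The `QueueMonitor` class and its parameters are illustrative assumptions, not the deployed system.

```python
from collections import deque

class QueueMonitor:
    """Smooths noisy per-frame people counts and converts them to wait times."""

    def __init__(self, avg_service_seconds, window=10):
        self.avg_service = avg_service_seconds
        self.counts = deque(maxlen=window)  # rolling window of recent counts

    def update(self, frame_count):
        """Record the people count estimated from one video frame."""
        self.counts.append(frame_count)

    def wait_seconds(self, open_stations=1):
        """Estimated wait for a person joining the queue now."""
        if not self.counts:
            return 0.0
        smoothed = sum(self.counts) / len(self.counts)
        return smoothed * self.avg_service / open_stations

m = QueueMonitor(avg_service_seconds=30, window=5)
for c in [10, 12, 11, 13, 14]:  # per-frame counts from the vision pipeline
    m.update(c)
print(m.wait_seconds(open_stations=2))  # -> 180.0
```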

Supervisor: Dr. Sethuraman Panchanathan, Department of Computer Science and Engineering, Arizona State University.


1. Rear Camera Object Detection and Pedestrian Path Prediction for Autonomous Driving

In the U.S., a staggering number of children are injured every year in back-over accidents (drivers not noticing them while reversing their vehicles). The goal of this project was to use the video footage from a camera installed on the back of a vehicle to detect the presence of objects (especially children) and to trigger an alarm in case of a dangerous situation. Such a technology can be immensely useful in reducing child casualties. As a part of this project, we also worked on the problem of pedestrian path prediction: given the location of a pedestrian in an image, the objective was to predict their position at short time intervals (1 or 2 seconds) into the future. This can be tremendously useful to activate and trigger emergency braking in case of a potentially dangerous situation.
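The simplest baseline for short-horizon path prediction is a constant-velocity model: estimate the pedestrian's image-space velocity from recent detections and extrapolate it forward. This is a hedged illustration of the problem setup, not the project's actual prediction model, and the function and parameter names are assumptions.

```python
def predict_position(track, horizon_s, fps=30):
    """Extrapolate a pedestrian's (x, y) image position with a
    constant-velocity model fit to the last two tracked frames.

    track: list of (x, y) pixel positions, one per video frame.
    horizon_s: how far into the future to predict, in seconds.
    """
    (x0, y0), (x1, y1) = track[-2], track[-1]
    vx, vy = (x1 - x0) * fps, (y1 - y0) * fps  # pixels per second
    return (x1 + vx * horizon_s, y1 + vy * horizon_s)

# Pedestrian moving 2 px right and 1 px down per frame at 30 fps.
track = [(100, 50), (102, 51)]
print(predict_position(track, horizon_s=1.0))  # -> (162.0, 81.0)
```

Real systems go well beyond this (using longer histories, dynamics learned from data, and uncertainty estimates), but the constant-velocity baseline is the standard point of comparison for 1-2 second horizons.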

Supervisor: Dr. Vijayakumar Bhagavatula, Department of Electrical and Computer Engineering, Carnegie Mellon University.

2. The Conformal Predictions Framework

The Conformal Predictions (CP) framework is based on the theories of algorithmic randomness, transductive inference, and hypothesis testing. The framework enables the system to compute a measure of confidence, depicting the system's level of certainty, while predicting the class label of a given test point in an online learning scenario. This adds value to the final prediction and can be used for both classification and regression problems. A useful property of the framework is the calibration of errors in an online setting: in the long run, the frequency of errors made by the system is bounded by a significance level set by the user. Current work in this project is focused on extending the framework to online active learning and to learning from multiple sources of information. Ongoing work also includes learning a kernel function to increase the efficiency of the framework.
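The confidence machinery can be sketched in a few lines: each tentative label for a test point gets a p-value, the fraction of previously seen nonconformity scores at least as extreme as the test point's score, and the prediction set contains every label whose p-value exceeds the chosen significance level. This is a bare illustration of the standard CP recipe under assumed names, not the extensions developed in this project.

```python
def conformal_p_value(nonconformity_scores, test_score):
    """p-value of a tentative label: fraction of previous scores
    at least as nonconforming as the test point's score."""
    n = len(nonconformity_scores)
    greater_eq = sum(1 for s in nonconformity_scores if s >= test_score)
    return (greater_eq + 1) / (n + 1)

def predict_set(calibration, candidate_scores, epsilon):
    """Return every label whose p-value exceeds the significance level."""
    return [label for label, score in candidate_scores.items()
            if conformal_p_value(calibration, score) > epsilon]

cal = [0.1, 0.2, 0.3, 0.4, 0.5]    # nonconformity scores of past examples
cands = {"cat": 0.15, "dog": 0.9}  # test point's score under each tentative label
print(predict_set(cal, cands, epsilon=0.2))  # -> ['cat']
```

The calibration guarantee is then that, under exchangeability, the true label falls outside the prediction set with frequency at most epsilon in the long run.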

Supervisor: Dr. Sethuraman Panchanathan, Department of Computer Science and Engineering, Arizona State University.

3. Person Recognition using Multi-modal Biometrics (Face and Speech)

Automated verification of human identity is indispensable in security and surveillance applications. Uni-modal systems, which rely on a single modality for authentication, suffer from several limitations. Multi-modal systems consolidate evidence from multiple sources and are thus more reliable. The main objective of this project is to develop a robust recognition engine based on the face and speech modalities. Biometric authentication schemes based on audio and video modalities are non-intrusive and are therefore more suitable in real-world settings than intrusive methods like fingerprint and retina scans. This is a collaborative project that brings in expertise from two different fields: face-based recognition (at ASU) and speech-based recognition (at Tecnológico de Monterrey, Mexico).
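One standard way to consolidate evidence from two modalities is score-level fusion: each matcher produces a normalized match score, and a weighted sum of the two is compared against a decision threshold. The weights and threshold below are illustrative assumptions, not the project's tuned values.

```python
def fuse_scores(face_score, speech_score, w_face=0.6):
    """Weighted-sum fusion of normalized (0-1) match scores from two modalities."""
    return w_face * face_score + (1 - w_face) * speech_score

def verify(face_score, speech_score, threshold=0.5):
    """Accept the claimed identity if the fused score clears the threshold."""
    return fuse_scores(face_score, speech_score) >= threshold

# A strong face match can compensate for a weak speech match, and vice versa.
print(verify(0.8, 0.3))  # -> True  (fused score 0.6)
print(verify(0.2, 0.3))  # -> False (fused score 0.24)
```

In practice the scores must first be mapped to a common scale (e.g., min-max or z-score normalization), since raw face and speech matchers produce scores with very different ranges.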

Supervisor: Dr. Sethuraman Panchanathan, Department of Computer Science and Engineering, Arizona State University.

4. Automated Malware Classification

The objective of this project was to use machine learning algorithms to classify a given file as clean or malware. 

Supervisor: Dr. Jay Stokes, Security and Privacy Group, Microsoft Research, Redmond.

5. Morphing 3D Objects

This project was aimed at developing an algorithm to iteratively morph between two similar three-dimensional objects. This can be used in scientific applications, for example, to morph the hippocampus of an Alzheimer's patient to that of a healthy individual. The algorithm was keypoint-free: no knowledge about the correspondence of landmark points was assumed. The morphing strategy was based on trilinear maps and was conceptually similar to the Iterative Closest Point (ICP) algorithm.
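To give a flavor of the ICP-style iteration mentioned above: each step matches every source point to its nearest target point and then moves the source to reduce the residuals. The sketch below restricts the update to a pure translation for brevity; the project's trilinear-map morphing is only conceptually similar, and all names here are illustrative.

```python
def icp_translation_step(source, target):
    """One ICP-style iteration restricted to translation: match each source
    point to its nearest target point, then shift the source by the mean
    residual. Full ICP would also estimate rotation (and iterate to
    convergence); this is a single bare illustrative step."""
    def nearest(p):
        return min(target, key=lambda q: sum((a - b) ** 2 for a, b in zip(p, q)))

    matches = [nearest(p) for p in source]
    dims = len(source[0])
    shift = [sum(q[d] - p[d] for p, q in zip(source, matches)) / len(source)
             for d in range(dims)]
    return [tuple(c + shift[d] for d, c in enumerate(p)) for p in source]

src = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
tgt = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(icp_translation_step(src, tgt))  # source shifts halfway toward the target
```

Iterating this match-then-align step, with a richer transform class, drives the source shape toward the target without ever requiring labeled landmark correspondences.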

Supervisor: Dr. Gerald Farin, Department of Computer Science and Engineering, Arizona State University.

6. Fake Document Identification using Image Processing Techniques

With the advent of cheap and sophisticated scanning and printing technologies, generating fraudulent documents using commodity scanners and printers has become alarmingly easy. The purpose of this project was to detect fake documents and also to identify the specific scanners and printers used to produce them. This is critical in preventing the misuse of technology for malicious and unlawful gain. Our experiments corroborated that statistics based on the color and texture parameters of the original and counterfeit images were effective in distinguishing them and also in linking fake documents to the specific devices used in their generation.
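As a toy stand-in for the statistical comparison described above: compute per-channel color statistics for a reference and a suspect image, and flag the suspect when the statistics drift beyond a tolerance (scanners and printers typically introduce characteristic color shifts). The feature set, names, and tolerance here are illustrative assumptions, far simpler than the project's actual color and texture features.

```python
import statistics

def channel_stats(pixels):
    """Per-channel (mean, std dev) for a list of RGB pixels."""
    channels = list(zip(*pixels))
    return [(statistics.mean(c), statistics.pstdev(c)) for c in channels]

def looks_counterfeit(reference, suspect, tol=10.0):
    """Flag the suspect if any channel mean drifts beyond the tolerance."""
    ref, sus = channel_stats(reference), channel_stats(suspect)
    return any(abs(r[0] - s[0]) > tol for r, s in zip(ref, sus))

genuine = [(200, 180, 150), (202, 178, 149)]
reprint = [(220, 160, 150), (221, 161, 152)]  # scanned-and-reprinted color shift
print(looks_counterfeit(genuine, reprint))  # -> True
```

Device identification works the same way in spirit: each scanner or printer leaves a characteristic statistical signature, so a fake document's statistics can be matched against per-device profiles.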

Supervisor: Dr. Chandan Mazumdar, Department of Computer Science and Engineering, Jadavpur University, Kolkata, India.

7. Neural Network Toolbox for Pattern Recognition

Developed a neural network toolbox for pattern classification, for use in bioinformatics applications.

Supervisor: Dr. Rajat K. De, Machine Intelligence Unit (MIU), Indian Statistical Institute (ISI), Kolkata, India.