Hi! I am Ayush, a CS senior at the Indian Institute of Technology Roorkee. I am currently interested in deep learning and am also exploring a range of other areas, from robotics, finance, and blockchain to psychology and philosophy. At IIT Roorkee, I am one of the co-presidents of the Vision and Language Group, a deep-learning-research-centric discussion group; the Chair of the ACM IIT Roorkee Student Chapter; and an executive member of the newly formed Quantum Computing Group under ACM IITR.
I am currently working as an undergraduate research intern at the RVL Lab at the University of Toronto, led by Prof. Florian Shkurti. In the past, I have also had the pleasure of being part of various deep learning research labs, namely the Video Analytics Lab at IISc Bangalore led by Prof. Venkatesh Babu, MIDAS IIITD led by Prof. Rajiv Ratn Shah, and Prof. Katerina Fragkiadaki's lab at MLD, CMU. I also worked as a Data and Applied Scientist Intern at Microsoft IDC Hyderabad in summer 2021.
Professional stuff aside, I am a die-hard anime fan, an avid reader, an occasional writer (check out my Medium account :P), and a lifelong learner who's always up for a good conversation.
B.Tech. in Computer Science, 2022
Indian Institute of Technology Roorkee
In the financial realm, profit generation greatly relies on the complicated task of stock prediction. Lately, neural methods have shown success in exploiting stock-affecting signals from textual data across news and tweets to forecast stock performance. However, the dynamic, stochastic, and variably influential nature of text and prices makes it difficult to train neural stock trading models, limiting predictive performance and profits. To transcend this limitation, we propose a novel multimodal curriculum learning approach: FinCLASS, which evaluates stock-affecting signals via entropy-based heuristics and measures their linguistic and price-based complexities in a time-aware, hierarchical fashion. We show that training financial models can benefit from exposing neural networks to easier examples of stock-affecting signals early during the training phase, before introducing samples having more complex linguistic and price-based temporal variations. Through experiments on benchmark English tweets and Chinese financial news spanning two major indexes and four global markets, we show how FinCLASS outperforms the state-of-the-art across the financial tasks of stock movement prediction, volatility regression, and profit generation. Through ablative and qualitative experiments, we make the case for FinCLASS as a generalizable framework for developing natural language-centric neural models for financial tasks.
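The core curriculum idea, ordering training samples from easy to hard by a combined linguistic and price-based difficulty score, can be sketched as follows. This is a toy illustration, not the actual FinCLASS heuristic; the sample fields, the difficulty function, and the combination of text entropy with return volatility are all illustrative assumptions.

```python
import math

def entropy(probs):
    """Shannon entropy of a discrete distribution (higher = noisier text signal)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def difficulty(sample):
    """Toy difficulty score: text entropy plus price volatility (std of returns).
    This stands in for the paper's entropy-based, time-aware heuristics."""
    returns = sample["returns"]
    mu = sum(returns) / len(returns)
    var = sum((r - mu) ** 2 for r in returns) / len(returns)
    return entropy(sample["text_probs"]) + math.sqrt(var)

def curriculum_order(samples):
    """Expose easier samples first: sort ascending by difficulty."""
    return sorted(samples, key=difficulty)

# Toy samples: a token distribution for the text and a short window of daily returns
samples = [
    {"name": "hard", "text_probs": [0.25, 0.25, 0.25, 0.25],
     "returns": [0.05, -0.04, 0.06]},   # diffuse text, volatile prices
    {"name": "easy", "text_probs": [0.9, 0.1],
     "returns": [0.010, 0.012, 0.011]}, # peaked text, stable prices
]

ordered = curriculum_order(samples)
```

A training loop would then feed batches from `ordered` in sequence, gradually widening the pool as harder samples are admitted.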
We propose an unsupervised method for detecting and tracking moving objects in 3D, in unlabelled RGB-D videos. The method begins with classic handcrafted techniques for segmenting objects using motion cues: we estimate optical flow and camera motion, and conservatively segment regions that appear to be moving independently of the background. Treating these initial segments as pseudo-labels, we learn an ensemble of appearance-based 2D and 3D detectors, under heavy data augmentation. We use this ensemble to detect new instances of the ‘moving’ type, even if they are not moving, and add these as new pseudo-labels. Our method is an expectation-maximization algorithm, where in the expectation step we fire all modules and look for agreement among them, and in the maximization step we re-train the modules to improve this agreement. The constraint of ensemble agreement helps combat contamination of the generated pseudo-labels (during the E step), and data augmentation helps the modules generalize to yet-unlabelled data (during the M step). We compare against existing unsupervised object discovery and tracking methods, using challenging videos from CATER and KITTI, and show strong improvements over the state-of-the-art.
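One EM round of the pseudo-labeling loop described above can be sketched generically: in the E-step, every detector is run and only detections the ensemble agrees on become new pseudo-labels; in the M-step, each detector is retrained on the enlarged label set. The function below is a minimal sketch under assumed interfaces; `detect`, `train`, and `agree` are hypothetical callables, not the paper's actual modules.

```python
def em_round(detectors, unlabelled_clips, pseudo_labels, detect, train, agree):
    """One EM round over an ensemble of detectors.

    detect(d, clip)  -> list of detections proposed by detector d on clip
    agree(box, dets) -> True if box is matched by some detection in dets
    train(d, labels) -> detector d retrained (with augmentation) on labels
    """
    # E-step: fire all detectors and keep only consensus detections
    for clip in unlabelled_clips:
        proposals = [detect(d, clip) for d in detectors]
        consensus = [box for box in proposals[0]
                     if all(agree(box, other) for other in proposals[1:])]
        pseudo_labels.extend((clip, box) for box in consensus)
    # M-step: retrain each detector to better fit the agreed pseudo-labels
    detectors = [train(d, pseudo_labels) for d in detectors]
    return detectors, pseudo_labels

# Toy stand-ins: detections are plain integers, agreement is exact match
dets = ["det2d", "det3d"]
def detect(d, clip): return [5, 7] if d == "det2d" else [5, 9]
def agree(box, boxes): return box in boxes
def train(d, labels): return d  # no-op trainer for the sketch

new_dets, labels = em_round(dets, ["clip0"], [], detect, train, agree)
```

In practice `agree` would be an IoU-based match and `train` a full gradient-descent fit, but the control flow of alternating agreement mining and retraining is the same.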
Stock price movement and volatility prediction aim to predict stocks' future trends to help investors make sound investment decisions and model financial risk. Companies' earnings calls are a rich, underexplored source of multimodal information for financial forecasting. However, existing fintech solutions are not optimized towards harnessing the interplay between the multimodal verbal and vocal cues in earnings calls. In this work, we present a multi-task solution that utilizes domain-specialized textual features and audio-attentive alignment for predictive financial risk and price modeling. Our method advances existing solutions in two aspects: 1) tailoring a deep multimodal text-audio attention model, and 2) optimizing volatility and price movement prediction in a multi-task ensemble formulation. Through quantitative and qualitative analyses, we show the effectiveness of our deep multimodal approach.
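The multi-task formulation above combines a regression objective (volatility) with a classification objective (price movement). A minimal sketch of such a joint loss is below; the weighting scheme and the `alpha` hyperparameter are illustrative assumptions, not the paper's exact objective.

```python
import math

def multitask_loss(vol_pred, vol_true, move_logit, move_label, alpha=0.5):
    """Toy joint objective for multi-task financial forecasting.

    vol_pred/vol_true: predicted and true volatility (regression, MSE term)
    move_logit/move_label: movement logit and binary up/down label (BCE term)
    alpha: hypothetical trade-off weight between the two tasks
    """
    mse = (vol_pred - vol_true) ** 2
    p = 1.0 / (1.0 + math.exp(-move_logit))  # sigmoid over the movement logit
    bce = -(move_label * math.log(p) + (1 - move_label) * math.log(1 - p))
    return alpha * mse + (1 - alpha) * bce

loss = multitask_loss(vol_pred=0.1, vol_true=0.1, move_logit=0.0, move_label=1)
```

Sharing an encoder across both heads while summing the weighted losses is the standard way such a multi-task ensemble is trained end to end.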