Sourin Dey

Columbia, South Carolina · sourin@email.sc.edu

I am pursuing a PhD in Computer Science at the University of South Carolina, specializing in Deep Learning and Data Science.

Internship Experience

Data Science Intern at Hexagon Manufacturing Intelligence (Summer 2025)


Graph Retrieval-Augmented Generation (Graph-RAG) pipeline using open-source and Azure models: I built a Q&A system to process and query 20 years of software manuals. Using the LangChain framework, I converted document knowledge into a Neo4j graph database so that retrieval returns contextually rich information useful to Applications Engineers. Delivered scalable knowledge access: engineers can efficiently search extensive technical documentation and extract answers to queries of varying difficulty (temporal, vague, reasoning).

Data Science Intern at Dow Chemicals (Summer 2024)


Developed a Graph Neural Network (GNN) framework – Built a heterogeneous network graph for product recommendation using node classification. I ensured effective information flow in non-homogeneous graphs to improve recommendation accuracy. Researched Variational Autoencoder (VAE) applications – I explored generative modeling for novel product formulation based on graph structures. I also identified how large, dense graphs bias the latent space and hinder stable VAE training.

Research Interests

Optimization Techniques

Deep Learning & Neural Networks

Reinforcement Learning

Computer Vision

Human Robot Interaction


Education

University of South Carolina

Doctorate - Computer Science
Selected Courses: Data Mining & Warehousing, Computer Processing of Natural Language, Neuromorphic Computing
August 2021 - Present

University of Wyoming

Master's - Computer Science
Selected Courses: Intro to AI, Deep Reinforcement Learning & Control, Randomness in Computation
August 2019 - July 2021

Khulna University of Engineering & Technology

Bachelor of Science - Electrical and Electronic Engineering
Selected Courses: Digital Image Processing, Digital Signal Processing
April 2014 - May 2018

Research Experience

PhD Research

Graduate Research Assistant
  • Equilibrium Matching-based generative model for molecules and materials: After pretraining a VAE, I fine-tune its encoder with a lightweight regression head to predict formation energies, aligning the latent space with the underlying energy landscape. This energy-aware alignment regularizes the latent space, guiding the Equilibrium model toward sampling thermodynamically stable and physically realistic structures.

  • Polyhedron-topology-based mapping algorithm for polymorphic crystal structures – I leveraged polyhedron connectivity to cluster materials across diverse space groups, advancing materials discovery by identifying structural similarities that symmetry-based methods miss.

  • Developed a variant of the Atomistic Line Graph Neural Network (ALIGNN) model that Δ-learns the electronic structure of crystals to predict HSE eigenvalues, a key opto-electronic property. By leveraging inexpensive PBE calculations and orbital projections, I built a highly accurate ML model as a surrogate for costly DFT calculations.

  August 2021 - Present

    MS Research

    Graduate Research Assistant

    I automated the AI-powered Laser-Induced Graphene (LIG) manufacturing process using Bayesian Optimization. The automated system is general and can be deployed to manufacture other materials.

    August 2019 - July 2021

    Undergraduate Thesis

    Formant-based Perceptual Space Classification focuses on detecting Bengali vowels in continuous speech. An SVM classifier with an RBF kernel achieved high accuracy. This work can support research on emotional state recognition in the Bengali language.

    June 2017 - May 2018

    Skills

    Programming Languages & Tools
    • C
    • C++
    • High Performance Computing
    • Shell Scripting
    • Python (PyTorch, PyTorch Geometric, TensorFlow, Deep Graph Library, Gensim, spaCy), Pydantic AI, LangChain, LangGraph, Neo4j
    • R (mlrMBO, mlr, caret)
    • SQL
    • GitHub Copilot
    • Jupyter Notebook
    • VS Code
    • Algorithms & data structures practice on LeetCode
    • Android Studio with Java
    • Linux, Windows, Android
    • Git
    • Microsoft Office, LaTeX

    Publications

    • Facet: highly efficient E(3)-equivariant networks for interatomic potentials
      arXiv preprint arXiv:2509.08418, 2025
      Authors: N Miklaucic, L Wei, R Dong, N Fu, SS Omee, Q Li, S Dey, V Fung, J Hu
    • Data-Driven Topological Analysis of Polymorphic Crystal Structures
      arXiv preprint arXiv:2508.10270, 2025
      Authors: S Dey, N Miklaucic, SS Omee, R Dong, L Wei, Q Li, N Fu, J Hu
    • Polymorphism crystal structure prediction with adaptive space group diversity control
      Advanced Science, 2025
      Authors: SS Omee, L Wei, S Dey, J Hu
    • Polymorphism Crystal Structure Prediction with Adaptive Space Group Diversity Control
      arXiv preprint arXiv:2506.11332, 2025
      Authors: SS Omee, L Wei, S Dey, J Hu
    • Scalable deeper graph neural networks for high-performance materials property prediction
      Patterns 3 (5), 100491, 2022
      Authors: SS Omee, SY Louis, N Fu, L Wei, S Dey, R Dong, Q Li, J Hu
    • DeepXRD, a deep learning model for predicting XRD spectrum from material composition
      ACS Applied Materials & Interfaces 14 (35), 2022
      Authors: R Dong, Y Zhao, Y Song, N Fu, SS Omee, S Dey, Q Li, L Wei, J Hu
    • Optimizing laser-induced graphene production
      PAIS 2022, 2022
      Authors: L Kotthoff, S Dey, J Heil, V Jain, T Muller, A Tyrrell, H Wahab, P Johnson
    • Investigation of Exploration-Exploitation Trade-off of Bayesian Optimization to Optimize the Fully Automated Laser-Induced Graphene Process
      2021
      Authors: S Dey
    • Formant based Bangla vowel perceptual space classification using support vector machine and K-nearest neighbor method
      2018 21st International Conference of Computer and Information Technology, 2018
      Authors: S Dey, MA Alam
    • SAR reduction of bio-tissue in cellular communication
      2017 3rd International Conference on Electrical Information and Communication Technology, 2017
      Authors: S Dey, S Dey, AS Ahmed, MN Mollah

    Extracurriculars

    I am interested in wildlife photography. During my master's studies, I explored rural Wyoming and the countryside of colorful Colorado. Have a look at my stills from these places: Wyoming Days!

    Instructor, Dept. of Chemistry & Biochemistry, University of South Carolina. Taught the Python Programming Summer Camp Workshop.