Sourin Dey

Columbia, South Carolina · sourin@email.sc.edu

I am pursuing a PhD in Computer Science at the University of South Carolina, specializing in Deep Learning and Data Science.

Internship Experience

Data Science Intern

Hexagon Manufacturing Intelligence

Graph Retrieval-Augmented Generation (Graph-RAG) pipeline using open-source and Azure models: I built a Q/A system to process and query 20 years of software manuals. Using the LangChain framework, I converted document knowledge into a Neo4j graph database so that retrieval returns contextually rich information useful to Applications Engineers. Delivered scalable knowledge access: engineers can efficiently search the extensive technical documentation and extract answers to queries of varying difficulty (temporal, vague, reasoning).

Summer 2025

Data Science Intern

Dow Chemicals

Developed a Graph Neural Network (GNN) framework: I built a heterogeneous graph for product recommendation using node classification and ensured effective information flow across the non-homogeneous graph to improve recommendation accuracy. Researched Variational Autoencoder (VAE) applications: I explored generative modeling for novel product formulation based on graph structures and identified how large, dense graphs bias the latent space and hinder stable VAE training.

Summer 2024

Research Experience

PhD Research

Graduate Research Assistant
  • Equilibrium Matching-based generative model for molecules and materials: After pretraining a VAE, I fine-tune its encoder with a lightweight regression head to predict formation energies, aligning the latent space with the underlying energy landscape. This energy-aware alignment regularizes the latent space, guiding the Equilibrium Matching model toward sampling thermodynamically stable and physically realistic structures.
  • Machine learning pipeline for crystal generation
  • Polyhedron-topology-based mapping algorithm for polymorphic crystal structures: I developed a graph topology based on polyhedron connectivity to cluster materials across diverse space groups, improving identification of structural similarities beyond symmetry-based methods. Paper link.
  • Polyhedron topology based mapping visualization
  • Developed a variant of the Atomistic Line Graph Neural Network (ALIGNN) model that applies Δ-learning to the electronic structure of crystals to predict HSE eigenvalues, a key optoelectronic property. By leveraging inexpensive PBE calculations and orbital projections, I built a highly accurate ML surrogate for costly DFT calculations.
August 2021 - Present

    MS Research

    Graduate Research Assistant

    I automated AI-powered Laser-Induced Graphene (LIG) manufacturing using Bayesian Optimization. The automated system is general and can be deployed to manufacture other materials.

    August 2019 - July 2021

    Undergraduate Thesis

    Formant-based perceptual space classification for detecting Bengali vowels in continuous speech. An SVM classifier with an RBF kernel achieved high accuracy. This work can support research on emotional state recognition in the Bengali language.

    June 2017 - May 2018

    Education

    University of South Carolina

    Doctorate - Computer Science
    Selected Courses: Data Mining & Warehousing, Computer Processing of Natural Language, Neuromorphic Computing
    August 2021 - Present

    University of Wyoming

    Master - Computer Science
    Selected Courses: Intro to AI, Deep Reinforcement Learning & Control, Randomness in Computation
    August 2019 - July 2021

    Khulna University of Engineering & Technology

    Bachelor of Science - Electrical and Electronic Engineering
    Selected Courses: Digital Image Processing, Digital Signal Processing
    April 2014 - May 2018

    Skills

    Programming Languages & Tools
    • C
    • C++
    • High Performance Computing
    • Shell Scripting
    • Python (PyTorch, PyTorch Geometric, TensorFlow, Deep Graph Library, Gensim, spaCy), Pydantic AI, LangChain, LangGraph, Neo4j
    • R (mlrMBO, mlr, caret)
    • SQL
    • GitHub Copilot
    • Jupyter Notebook
    • VS Code
    • Algorithm & Data Structure coding on LeetCode
    • Android Studio with Java programming
    • Linux, Windows, Android
    • Git
    • Microsoft Office, LaTeX

    Publications

    • Facet: highly efficient E(3)-equivariant networks for interatomic potentials
      arXiv preprint arXiv:2509.08418, 2025
      Authors: N Miklaucic, L Wei, R Dong, N Fu, SS Omee, Q Li, S Dey, V Fung, J Hu
    • Data-Driven Topological Analysis of Polymorphic Crystal Structures
      arXiv preprint arXiv:2508.10270, 2025
      Authors: S Dey, N Miklaucic, SS Omee, R Dong, L Wei, Q Li, N Fu, J Hu
    • Polymorphism crystal structure prediction with adaptive space group diversity control
      Advanced Science, 2025
      Authors: SS Omee, L Wei, S Dey, J Hu
    • Polymorphism Crystal Structure Prediction with Adaptive Space Group Diversity Control
      arXiv preprint arXiv:2506.11332, 2025
      Authors: S Sadeed Omee, L Wei, S Dey, J Hu
    • Scalable deeper graph neural networks for high-performance materials property prediction
      Patterns 3 (5), 100491, 2022
      Authors: SS Omee, SY Louis, N Fu, L Wei, S Dey, R Dong, Q Li, J Hu
    • DeepXRD, a deep learning model for predicting XRD spectrum from material composition
      ACS Applied Materials & Interfaces 14 (35), 2022
      Authors: R Dong, Y Zhao, Y Song, N Fu, SS Omee, S Dey, Q Li, L Wei, J Hu
    • Optimizing laser-induced graphene production
      PAIS 2022, 2022
      Authors: L Kotthoff, S Dey, J Heil, V Jain, T Muller, A Tyrrell, H Wahab, P Johnson
    • Investigation of Exploration-Exploitation Trade-off of Bayesian Optimization to Optimize the Fully Automated Laser-Induced Graphene Process
      2021
      Authors: S Dey
    • Formant based Bangla vowel perceptual space classification using support vector machine and K-nearest neighbor method
      2018 21st International Conference of Computer and Information Technology, 2018
      Authors: S Dey, MA Alam
    • SAR reduction of bio-tissue in cellular communication
      2017 3rd International Conference on Electrical Information and Communication Technology, 2017
      Authors: S Dey, S Dey, AS Ahmed, MN Mollah

    Extracurriculars

    I am interested in wildlife photography. During my master's studies, I explored rural Wyoming and the countryside of colorful Colorado. Have a look at my stills from those places: Wyoming Days!

    Instructor, Dept. of Chemistry & Biochemistry, University of South Carolina. Worked as an instructor for the Python Programming Summer Camp Workshop.