Projects

PolyDatabase

May 2025 - Nov 2025

A structured, AI-curated dataset of polymer molecular dynamics simulations extracted from scientific literature to support data-driven polymer informatics.

LLMs, Python, Django, Information Extraction, Molecular Dynamics

A curated dataset of polymer molecular dynamics simulations compiled from 823 peer-reviewed papers (1995–2025). Built through automated literature mining, classification, and information extraction, PolyDatabase provides structured simulation metadata for data-driven polymer informatics and molecular modeling.

MonitorMyBug

May 2025 - Nov 2025

An AI- and IoT-based pest monitoring system that uses real-time computer vision and environmental sensing to track ant activity and predict mealybug infestations in citrus orchards with over 90% detection precision.

Object Detection, Edge AI, IoT, Raspberry Pi, Remote Sensing

MonitorMyBug (Smart Ant Monitoring) is an integrated AI- and IoT-based pest surveillance system developed for citrus orchards. It combines real-time computer vision and environmental sensing to monitor ant activity and predict mealybug infestations; the two pests are closely correlated in orchard ecosystems. The system uses a Raspberry Pi 5 and a Google Coral USB Accelerator running YOLOv8-OBB models enhanced with SAHI slicing to achieve over 90% detection precision even under challenging field conditions. The device detects and counts ants every hour (or on a configured schedule) and, at the same time, records temperature, humidity, irrigation status, and soil moisture as inputs for predictive modeling of mealybug presence. All collected data are transmitted automatically to a centralized database every 10 minutes, where a web portal provides live tracking, analytics, and insights for data-driven pest management.
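The slicing step mentioned above can be illustrated with a small sketch. It is not the project's code and does not call the SAHI library. It only shows the general idea of cutting a large frame into overlapping tiles so that small objects such as ants occupy more pixels per inference, then shifting each tile-local box back into full-image coordinates. The function names, the tile size, and the overlap value are assumptions chosen for illustration.

```python
def slice_windows(width, height, tile=640, overlap_px=128):
    """Return (x0, y0, x1, y1) tiles that cover a width x height image.

    Neighbouring tiles share `overlap_px` pixels so that an object cut by
    one tile border is fully visible in the adjacent tile.
    (tile and overlap_px are illustrative defaults, not project settings.)
    """
    if not 0 <= overlap_px < tile:
        raise ValueError("overlap_px must satisfy 0 <= overlap_px < tile")
    stride = tile - overlap_px

    def starts(size):
        # A single tile is enough when the image is not larger than it.
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile, stride))
        positions.append(size - tile)  # last tile sits flush with the border
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


def to_full_image(box, window):
    """Shift a tile-local (x0, y0, x1, y1) box into full-image coordinates."""
    off_x, off_y = window[0], window[1]
    return (box[0] + off_x, box[1] + off_y, box[2] + off_x, box[3] + off_y)


if __name__ == "__main__":
    for window in slice_windows(1000, 640):
        print(window)
```

Detections from overlapping tiles still need to be merged (for example with non-maximum suppression) before counting, since the same ant can appear in two tiles.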

GNN for Molecular Encapsulation Prediction

Oct 2025 - Feb 2026

A GNN-based regression model for molecular encapsulation prediction from coarse-grained (MARTINI) topology graphs to support rapid computational screening of host–guest affinity.

Message Passing Neural Network, Molecular Dynamics, Drug Encapsulation

This project provides a GNN-based regression model for predicting molecular encapsulation from coarse-grained (MARTINI) topology graphs. Molecular structures are built as graphs from GROMACS ITP files: nodes are beads, edges are bonds, node features come from the NBFIX table (LJ parameters, charge, mass), and edge features come from bond properties (length, force constant). A custom message-passing GNN aggregates local structure and combines it with graph-level descriptors (connectivity, charge distribution, bead type diversity) to predict encapsulation values. The model is implemented in PyTorch Geometric and achieves a high R² and low RMSE on held-out compounds. An inference script supports batch predictions on new molecules, enabling fast computational screening of encapsulation candidates without full simulations.
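As a rough illustration of the graph construction step, the sketch below reads the `[ atoms ]` and `[ bonds ]` sections of a GROMACS ITP file into node and edge records using only the standard library. It is not the project's parser: the function name, the feature keys, and the sample topology are invented for the example, and the LJ parameters from the NBFIX table are left out because they live in a separate force-field file.

```python
SAMPLE_ITP = """
[ moleculetype ]
; name  nrexcl
DEMO    1

[ atoms ]
; nr type resnr residue atom cgnr charge mass
1  P4  1  DEMO  A1  1   0.0  72.0
2  C1  1  DEMO  A2  2   0.0  72.0
3  Qa  1  DEMO  A3  3  -1.0  72.0

[ bonds ]
; i  j  funct  length  force
1  2  1  0.47  1250
2  3  1  0.37  1250
"""  # made-up molecule, for illustration only


def parse_itp(text):
    """Return (nodes, edges) parsed from the atoms and bonds sections."""
    nodes, edges, section = {}, [], None
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop ';' comments
        if not line or line.startswith("#"):  # skip blanks and preprocessor lines
            continue
        if line.startswith("["):
            section = line.strip("[] \t").lower()
            continue
        cols = line.split()
        if section == "atoms" and len(cols) >= 7:
            # Columns: nr, type, resnr, residue, atom, cgnr, charge, [mass]
            nodes[int(cols[0])] = {
                "bead_type": cols[1],
                "charge": float(cols[6]),
                "mass": float(cols[7]) if len(cols) > 7 else None,
            }
        elif section == "bonds" and len(cols) >= 5:
            # Columns: i, j, funct, length, force constant
            edges.append(
                (int(cols[0]), int(cols[1]),
                 {"length": float(cols[3]), "force_constant": float(cols[4])})
            )
    return nodes, edges


if __name__ == "__main__":
    nodes, edges = parse_itp(SAMPLE_ITP)
    print(nodes)
    print(edges)
```

The resulting dictionaries can be converted to tensors (node feature matrix, edge index, edge attributes) before being passed to a graph library.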

Active Learning

May 2023 - Aug 2024

An active learning framework for efficiently sampling high-dimensional molecular dynamics data, enabling smarter and more targeted training of atomistic models.

Active Learning, Sampling Methods, Bash, Python, ML Potentials

Our ActiveLearningDeepMD framework is designed to iteratively select informative data points from first-principles molecular dynamics simulations in high-dimensional chemical space. Rather than exhaustively sampling all configurations, it uses uncertainty estimation and query strategies to focus computational effort where the model is least confident. The resulting subset of data is then used to train or refine deep potential models, leading to accurate atomistic representations with far fewer expensive simulation runs.
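One common way to realize this kind of uncertainty-driven selection is ensemble disagreement: several models predict forces for the same configuration, and frames where the predictions diverge are queued for labeling. The sketch below assumes that approach. The description above does not state which uncertainty estimate or query strategy the framework uses, so the function names and the threshold values are hypothetical.

```python
import math


def max_force_deviation(ensemble_forces):
    """Largest per-atom force disagreement across an ensemble.

    ensemble_forces[k][a] is the (fx, fy, fz) force on atom a predicted by
    model k. For each atom, the deviation is the root-mean-square distance
    of the model predictions from their mean; the maximum over atoms is
    returned.
    """
    n_models = len(ensemble_forces)
    n_atoms = len(ensemble_forces[0])
    worst = 0.0
    for a in range(n_atoms):
        mean = [sum(model[a][d] for model in ensemble_forces) / n_models
                for d in range(3)]
        variance = sum(
            sum((model[a][d] - mean[d]) ** 2 for d in range(3))
            for model in ensemble_forces
        ) / n_models
        worst = max(worst, math.sqrt(variance))
    return worst


def select_candidates(frames, lower, upper):
    """Indices of frames whose deviation lies in [lower, upper).

    Frames below `lower` are already well described; frames at or above
    `upper` are treated as likely unphysical and skipped.
    (Both thresholds are illustrative assumptions.)
    """
    return [i for i, frame in enumerate(frames)
            if lower <= max_force_deviation(frame) < upper]


if __name__ == "__main__":
    agree = [[(1.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)]]
    unsure = [[(1.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)]]
    broken = [[(10.0, 0.0, 0.0)], [(-10.0, 0.0, 0.0)]]
    print(select_candidates([agree, unsure, broken], lower=0.1, upper=1.0))
```

Only the selected frames would then be sent for first-principles labeling and added to the training set for the next iteration.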

Piano Music Generation

Oct 2024 - Dec 2024

An AI-based music composition system that generates piano pieces using Transformer networks trained on MIDI data. The model learns note sequences, timing, and dynamics to create original compositions from scratch.

Transformer, Music, Sequence Modeling, PyTorch

Piano Music Generation employs an auto-regressive transformer architecture composed of stacked TransformerEncoder layers to model symbolic piano music. MIDI data are converted into discrete event tokens encoding note on/off, time shifts, and velocity changes. Using multi-head self-attention and positional encoding, the model captures long-range temporal and harmonic dependencies, enabling the generation of musically coherent and expressive piano compositions.
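The event encoding described above can be sketched as follows. This is a minimal illustration, not the project's tokenizer: the token names, the integer time grid, and the `max_shift` cap are assumptions, and reading an actual MIDI file is left out so that the example stays dependency-free.

```python
def tokenize(notes, max_shift=100):
    """Convert notes into a flat list of event tokens.

    notes: iterable of (pitch, velocity, start_step, end_step), with times
    given as integer steps on a fixed grid (the grid size is an assumption).
    Long gaps are split into several TIME_SHIFT tokens of at most max_shift.
    """
    events = []
    for pitch, velocity, start, end in notes:
        events.append((start, 1, pitch, velocity))  # note on
        events.append((end, 0, pitch, 0))           # note off
    # Sorting by (time, is_on) places note-offs before note-ons at equal times.
    events.sort()

    tokens, now, current_velocity = [], 0, None
    for time, is_on, pitch, velocity in events:
        gap = time - now
        while gap > 0:
            step = min(gap, max_shift)
            tokens.append(f"TIME_SHIFT_{step}")
            gap -= step
        now = time
        if is_on:
            # Emit a velocity token only when the velocity changes.
            if velocity != current_velocity:
                tokens.append(f"VELOCITY_{velocity}")
                current_velocity = velocity
            tokens.append(f"NOTE_ON_{pitch}")
        else:
            tokens.append(f"NOTE_OFF_{pitch}")
    return tokens


if __name__ == "__main__":
    melody = [(60, 80, 0, 50), (64, 80, 50, 200)]
    print(tokenize(melody))
```

Each distinct token string would then be mapped to an integer id to form the vocabulary the transformer is trained on.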