SUMMARY

  • Machine Learning Engineer in Biotech
  • MS in Computer Science
  • PhD in Computational Chemical Physics

WORK HISTORY

Computational Protein / Machine Learning Engineer, Aether Biomachines, Menlo Park, CA [2021-Current]

  • Machine Learning for Protein Sequence/Structure/Function Prediction & Generation

    1. Built content-based recommender system: recommended protein/enzyme sequences based on chemical reaction pattern similarity; cut the protein/enzyme selection period from one month to one hour; enabled the launch of five new projects in two weeks (Python, Scikit-Learn, NumPy)
    2. Created prompt-engineered sequence generator with a GPT2 Transformer: trained GPT2 from scratch with a BPE tokenizer (chemical reaction and sequence motif-unit tokens); elicited chemical reaction-prompted, diffused protein/enzyme sequences; expanded the chemical/biological space to explore for protein design without extensive prior structural knowledge (PyTorch, Hugging Face Transformers, ESMFold)
    3. Implemented human-in-the-loop training strategy for Active Learning: improved predictive model accuracy by 10% by redesigning the iterative, interactive training strategy to query low-confidence data first from the unlabeled dataset and retrain on the updated training set (Python, Scikit-Learn, modAL)
  • Software Engineering for Computational Structural Biology Tools

    1. Structured and integrated chem/bio/ML pipelines/toolchains and deployed them in Docker (RDKit, CoLab Hallucination/Diffusion, etc.)
    2. Deployed end-to-end 3D simulation tool driven by simple JSON inputs using AmberTools/OpenMM
    3. Developed automatic catalytic site identification tool in conjunction with ESMFold, AlphaFold2

Postdoctoral Fellow (Computational Simulation/Modeling), Novartis Institutes for BioMedical Research, San Diego, CA [2012-2016]

  • Developed Rational Design Model: developed a simple, generic yet robust biological simulator outperforming non-profit software by accelerating computation time (GPU-enabled simulation), generating/incorporating ensemble features, and benchmarking against a wide range of biologics systems; enabled the rational experimental design to cut time/labor/budget (Python, R, SQLite, GPU-enabled MD suite compile/simulation)

  • Deployed End-to-End Integrated Tool: deployed high-confidence, full de novo sequencing tool for antibody biotherapeutics by utilizing public database and reconstructing short tags by Dynamic Programming; provided fully functional in-house service at no risk of confidential information disclosure by leveraging combined information protocol (genomics/proteomics/informatics) (Python, R (ggplot2), Database Retrieval)

Postdoctoral Fellow (Computational Biophysics), Beckman Research Institute at City of Hope, Duarte, CA [2010-2012]

  • Optimized the high-dimensional biological sampling space with a refinement protocol, enhanced by reusing a robotics locomotion algorithm used at NASA/JPL.

EDUCATION

MS. Computer Science (3.95/4.0), San Jose State University, CA, 2020

PhD. Chemical Physics (Computational Drug Design), The Ohio State University, OH, USA, 2010

BS. Chemistry and Physics, SookMyung Women’s University, Seoul, Korea


COMPUTER SKILLS

  • Programming Languages: [Advanced] Python, Pandas, R, Bash, MySQL and LaTeX; [Skilled] Java, C and Verilog with ModelSim
  • ML / Deep Learning: Scikit-learn, PyTorch; [Deep Reinforcement Learning] Ray RLlib, OpenAI Gym
  • Computational Chemistry/Simulation/Modeling Tools: RDKit, OpenMM, Amber, Schrödinger and ChimeraX
  • Computing Environment: Git, Jupyter, Linux, HPC (MPI), GPU Computing, AWS, Conda and Docker
  • Full Stack Web: [Front] HTML, CSS, Bootstrap, Hugo, JavaScript/jQuery and Ajax; [Server] JSP/Tomcat
  • Social Network: Gephi, iGraph

COURSEWORK in COMPUTER SCIENCE

Graduate Courses

  • Design and Analysis of Algorithms
  • Practical Computer Vision using Convolutional Neural Networks
  • Cryptography and Computer Security
  • Topics in Machine Learning
  • Topics in Wireless Mobile Networking
  • Web Intelligence (Mining of Massive Dataset)
  • Social Network Analysis
  • Applied Probability and Statistics

Undergraduate Courses

  • Discrete Mathematics
  • Introduction to Computer Systems
  • Data Structures and Algorithms
  • Operating Systems
  • Computer Architecture
  • Object-Oriented Design
  • Introduction to Database Management Systems

CS/ML PROJECTS in AI, NLP, SOCIAL NETWORK & WEBAPP

Multi-Agent Deep Reinforcement Learning for Walkers (Ray RLlib, Python)

(1) Found that an optimal minibatch size and sample reuse ratio improve the performance of the PPO algorithm. (2) Proposed efficient DRL training strategies, transfer learning and parameter sharing, for robot-leg walkers carrying a package.

Semantic Textual Similarity using Transfer Learning and Embeddings (TensorFlow, Python)

Built a semantic search engine as a practical application of Semantic Textual Similarity, combined with a faster similarity search method based on approximate k-nearest neighbors: https://bit.ly/2Fjzj4j

Sentiment Analysis using Machine Learning and Deep Learning (TensorFlow, Python, C)

Contrary to expectation, simple TF-IDF with Naive Bayes classification (80%) outperformed the state-of-the-art contextualized embedding NLP model BERT (75%) in capturing the simple polarity of lecture evaluation reviews: https://bit.ly/37zkpTM

Lighter, Faster Semantic Segmentation by Post-Training Quantization and Quantization-Aware Training (TensorFlow, Python)

Reduced neural network model size while retaining inference accuracy using various quantization workflows with the MobileNet-based semantic segmentation framework DeepLabV3+: https://bit.ly/39C1KbM

Deep Reinforcement Learning for Resource Management in 5G (Ray RLlib)

Applied DRL as an intelligent network slicing tool that optimizes service allocation by maximizing the targeted reward function (spectrum efficiency, quality of experience, etc.): https://bit.ly/380ROtB

Power of BTS ARMY for Social Change Envisaged by Twitter Network (Gephi, iGraph, R, Python)

Graph-theory-based structural analysis and semantic-network-based behavioral analysis of 24 hours of the #MatchAMillion Twitter dataset. Bilingual translators play a crucial role as Local Bridges connecting fandom and non-fandom communities: https://bit.ly/3n4cGp6

Grocery Nutrient Tracker (GNT)-Market Full Stack WebApp (MySQL, Bootstrap, Java, JSP)

(1) Built a full-stack WebApp (Bootstrap for front-end, JSP/Tomcat for Web Server, MySQL for back-end) for convenient, healthy grocery list selection. (2) Users can generate a personalized grocery list by selecting a dish menu and specifying dietary restrictions: https://bit.ly/33XhK7O

Data Compression using Dictionary-based Technique LZW and Its Variant to GIF Image Data Compression (Python)

Implemented the LZW compression algorithm, a lossless, greedy method with an adaptive dictionary that takes advantage of repeated sequences in the character stream: https://bit.ly/3n2qlvY
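
For readers unfamiliar with LZW, the greedy, adaptive-dictionary idea can be sketched in a few lines of Python. This is a minimal textbook illustration, not the code from the linked project: the function names are hypothetical, the initial dictionary is assumed to hold the 256 single characters with code points 0-255, and codes are returned as plain integers rather than packed into bits as GIF does.

```python
def lzw_compress(text: str) -> list[int]:
    """Encode text as a list of integer codes (assumes code points < 256)."""
    # Seed the dictionary with every single character.
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            # Greedy step: keep extending the longest known match.
            current = candidate
        else:
            # Emit the longest match and learn the new, longer string.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decompress(codes: list[int]) -> str:
    """Rebuild the text; the dictionary is reconstructed on the fly."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # Special case: the code refers to the entry being defined now.
            entry = previous + previous[0]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        out.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(out)
```

No dictionary is transmitted with the codes: the decoder rebuilds the same entries in the same order as the encoder, which is what makes the scheme adaptive and lossless.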

The Blowfish Block Cipher System – Expensive Key Scheduling (Java)

Implemented one of the Feistel block ciphers, Blowfish, using built-in functionality from cryptoUtil. Setting up the larger subkey space is time-consuming before encryption, but it increases security by making brute-force attempts infeasible: https://bit.ly/3hwLU6H


SELECTED PUBLICATIONS

  • S. Kyriacou, G. Meshulam-Simon, L. Clark, C. Chia and Inhee Park, “Enzyme engineering by means of reaction space indexing coupled with machine learning algorithms”, Society for Industrial Microbiology and Biotechnology (SIMB), Aug 2022, San Francisco
    #enzyme engineering, #graph neural network (gnn), #machine learning, #automation

  • Inhee Park and Teng-Sheng Moh, “Multi-Agent Deep Reinforcement Learning for Walker Systems”, 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA), 2021, pp. 490-495, doi: 10.1109/ICMLA52953.2021.00082.
    #deep reinforcement learning (drl), #ai, #multi-agent

  • Inhee Park, J. Venable, C. Steckler, S. Cellitti, S.A. Lesley, G. Spraggon and A. Brock, “Estimation of Hydrogen Exchange Protection Factors from MD Simulation Based on Amide Hydrogen Bonding Analysis”. J Chem Inf Model. 55(9):1914-25. 2015
    #monte carlo, #mass spectrometry, #proteomics, #molecular simulation, #computational structural biology

  • Inhee Park and A. Brock, “Genomics/Proteomics/Informatics Complementary Approach to High Confident Reconstruction of Antibody V-region Sequence” ACS West Coast Analytical Chemistry Symposium, Apr 2015, San Diego
    #bioinformatics, #genomics, #proteomics, #mass spectrometry, #dynamic programming

  • S. Bhowmik, D. H. Jones, H.-P. Chiu, Inhee Park et al. “Structural and Functional Characterization of BaiA, An Enzyme Involved in Secondary Bile Acid Synthesis in Human Gut Microbe”. Proteins. 82(2):216-29. 2014
    #quantum chemistry, #catalytic activity

  • Inhee Park, V. Gangupomu, J. Wagner, A. Jain and N. Vaidehi, “Structure Refinement of Protein Low Resolution Models Using the GNEIMO Constrained Dynamics Method”. J Phys Chem B. 116(8):2365-75. 2012
    #homology modeling, #sampling, #molecular simulation, #free energy, #computational structural biology

  • A. Jain, Inhee Park and N. Vaidehi, “Equipartition Principle for Internal Coordinate Molecular Dynamics”. J Chem Theory Comput. 8(8):2581-2587. 2012
    #free energy

  • V. Gangupomu, J. Wagner, Inhee Park et al. “Mapping Conformational Dynamics of Proteins using Torsional Dynamics Simulations” Biophys J. 104(9):1999-2008. 2013
    #molecular simulation, #free energy, #computational structural biology

  • G.S. Balaraman, Inhee Park et al. “Folding of Small Proteins Using Constrained Molecular Dynamics” J Phys Chem B. 115(23):7588-96. 2011
    #protein folding, #molecular simulation, #free energy, #computational structural biology

  • W. Harvey, Inhee Park, O. Rubel, V. Pascucci, P.T. Bremer, C. Li and Y. Wang, “A Collaborative Visual Analytics Suite for Protein Folding Research” J Mol Graph Model. 53:59-71. 2014
    #graph, #folding, #molecular simulation, #free energy, #computational structural biology

  • Inhee Park and C. Li, “Dynamic ligand-induced-fit simulation via enhanced conformational samplings and ensemble dockings: a survivin example” J Phys Chem B. 114(15):5144-53. 2010
    #induced-fit-docking, #molecular simulation, #free energy, #enhanced sampling

  • Inhee Park and C. Li, “Characterization of molecular recognition of STAT3 SH2 domain inhibitors through molecular simulation” J Mol Recognition. 24(2):254-65. 2010
    #docking, #molecular simulation, #quantum chemistry, #free energy, #cheminformatics

  • S. Chattier, J. V. Cooley, Inhee Park et al. “Design, Synthesis and Biological Studies of Survivin Dimerization Modulators that Prolong Mitotic Cycle” Bioorg Med Chem Lett. 23(19):5429-33. 2013
    #in silico design, #docking, #molecular simulation, #cheminformatics, #free energy

  • Co-authored with 16 authors, “Impairment of Glioma Stem Cell Survival and Growth by a Novel Inhibitor for Survivin–Ran Protein Complex” Clin Cancer Res. 19(3):631-42. 2013
    #in silico design, #drug repurposing, #translation of breast cancer lead to brain cancer