Hello! I’m Dongyu, currently working at ByteDance Inc. on the AML-Enterprise-Machine Learning Platform team. I earned my PhD in Data Science from Worcester Polytechnic Institute (WPI), where I conducted research in the DAISY Lab under the supervision of Prof. Elke Rundensteiner. My research expertise lies in large language models (LLMs), prompt engineering, and natural language processing (NLP).
Publications
For a list of my publications, please visit this page.
Work Experience
- ByteDance Inc, San Jose, CA
- Jul. 2024 – Present
- LLM Researcher/Engineer
Education
- Worcester Polytechnic Institute (WPI), Worcester, MA, USA
- Aug. 2018 – May 2024
- Ph.D. in Data Science
- The University of Texas at Dallas (UTDallas), Richardson, TX, USA
- Aug. 2016 – May 2018
- M.Science in Business Analytics
- Beijing Forestry University (BJFU), Beijing, China
- Sep. 2012 – Jun. 2016
- B.Economics in Statistics
- B.Science in Computer Science & Technology (Minor)
Research Experience
- Fact: Innovative Big Data Analytics Technology for Microbiological Risk Mitigation Assuring Fresh Produce Safety, WPI & University of Illinois Urbana-Champaign
Sep. 2020 – Present
Advisor: Prof. Elke Rundensteiner & Prof. Hao Feng
- Created Tweet-FID, the first public dataset for multi-task foodborne illness detection.
- Used LLMs with chain-of-thought prompting to generate labels for unlabeled tweet data.
- Developed a multi-task framework to extract information related to foodborne illness incidents.
- Mentored a team of six each year in the Major Qualifying Project.
- Explainable Text Classification with Limited Human Guidance, WPI
Jun. 2020 – Feb. 2021
Advisor: Prof. Elke Rundensteiner
- Introduced the novel problem of text classification with limited human attention supervision.
- Devised a transformer-based multi-task architecture, HELAS, to address this novel problem.
- Time-Aware Network for Clinical Notes Series Prediction, WPI
Sep. 2019 – May 2020
Advisor: Prof. Elke Rundensteiner
- Crafted a transformer-based method to classify a patient’s health state based on clinical note series.
Internship Experience
- Research Scientist Intern, Visa Inc.
May 2022 – Aug. 2022
- Developed a hierarchical transformer approach for transaction sequence representation learning.
- Applied this method to pretraining and fine-tuning for the fraudulent transaction detection task.
- Data Analyst Intern, Shanghai PnR Data Service Co., Ltd.
May 2017 – Jul. 2017
- Optimized a method for classifying post themes and analyzing the sentiment of replies.
- Refined a bank card character extraction algorithm.
- Developed a method for extracting information from national ID cards.
Awards
Academic Excellence for the 2019–2020 Academic Year, Jun. 2020
Useful Links
WPI
DAISY Lab