Hi! This is Yuwei.
A skilled developer with industry experience in programming and data analytics.
I’m passionate about solving complex problems and snowboarding.
I bring precision and resilience to all I do.
Technical Skills
Programming: Python, Java, C++, Shell, HTML/CSS, SQL, Flask, RESTful API, Protobuf
Machine Learning & Data: Scikit-learn, PyTorch, TensorFlow, NLP, MySQL, PostgreSQL, Redis
Technologies: Linux, Docker, AWS, Spark, Slurm, Distributed Computing, Async I/O, Dependency Injection
Education
Duke University
Aug 2020 - May 2022
Courses: Data Engineering Systems, Natural Language Processing, Modeling and Representation of Data, Data Analysis at Scale in Cloud, Principles of Machine Learning, Computer Engineering Machine Learning and Deep Neural Nets
National University of Singapore
Jan 2020 - Jun 2020
Courses: Knowledge Discovery and Data Mining, Neural Network and Deep Learning, Big Data Analytics Technology
Tsinghua University
Aug 2011 - Jul 2015
Courses: Fundamentals of Computer Software Techniques
Experience
Google, Google Photos
Jul 2022 - Present
Project: Movie Editor
- Developed a novel movie editor, enhancing overall user experience.
- Orchestrated the process of content curation, integrating algorithms to autonomously select optimal content for customized movies.
Facebook, Inc., Facebook AI
May 2021 - Aug 2021
Project: Automated Machine Translation Toolkit
- Developed a multilingual machine translation toolkit compatible with various PyTorch models, with command-line tools for data pre-processing, training, and evaluation.
- Sped up the entire pipeline by more than 20x via asynchronous processing.
- Integrated Hydra and OmegaConf to support configuration inheritance, overrides, and sweeps with YAML.
- Instituted a code snapshotting system for consistency in long-term and large-scale distributed training.
- Utilized Slurm to manage weeks-long training jobs covering 300 languages and hundreds of TB of data on large-scale compute clusters.
Dianrong Information Technology Ltd.
Jun 2018 - Dec 2018
Project: PPDAI 3rd Magic Mirror Data Application Contest
- Constructed a deep learning model using TensorFlow for pairwise question classification in a customer chatbot scenario, ranking in the top 29% among 579 teams with accuracy up to 96.8%.
- Raised semantic similarity assessment accuracy from 86% to over 95% by deploying a Siamese network in TensorFlow and fine-tuning it on Linux servers.
Project: Tarot Divination App
Project: Customer Service Inspection System
- Engineered a Smart Inspection System using Flask and Docker for seamless monitoring of customer service calls.
- Implemented text classification and clustering algorithms with scikit-learn, facilitating efficient customer service review.
- Established hierarchical permission levels to streamline team collaboration while ensuring individual team autonomy.
- Implemented comprehensive unit tests and fostered strong collaboration with front-end and testing teams to guarantee system robustness.
Think Tank 2861 Info Tech
Jul 2017 - Apr 2018
Project: Public Opinion Analysis
- Integrated public opinion analysis with resource distribution to evaluate county government performance, resulting in a 90% positive response in user feedback.
- Built and deployed a distributed, multi-threaded web crawler on Alibaba Cloud, scraping 60 million reviews from major social media platforms and storing them in a PostgreSQL database.
- Constructed a dynamic sentiment analysis model using logistic regression, Naïve Bayes, random forest, SVM, k-nearest neighbors, RNN, and other algorithms to optimize performance.
- Devised quantitative indicators for evaluating resource distribution in 2,861 counties using Point-of-Interest (POI) data from open map APIs.
Project: Social Network Analysis
- Established a hierarchical social network of government officials, utilizing data from 70,307 resumes to map out 144,504 interconnections.
- Improved data integrity from 55.2% to 81.5% through advanced data cleaning techniques and an innovative inference strategy.
- Devised a connection-level classification system and built a recommendation engine based on the Six Degrees of Separation theory.