Hi! This is Yuwei.

I’m a developer with industry experience in programming and data analytics.
I’m passionate about solving complex problems and snowboarding.
I bring precision and resilience to all I do.

Technical Skills

Programming:

Python, Java, C++, Shell, HTML/CSS, SQL, Flask, RESTful API, Protobuf

Machine Learning & Data:

Scikit-learn, PyTorch, TensorFlow, NLP, MySQL, PostgreSQL, Redis

Technologies:

Linux, Docker, AWS, Spark, Slurm, Distributed Computing, Async I/O, Dependency Injection

Education

Duke University
Master of Interdisciplinary Data Science

Aug 2020 - May 2022
GPA: 3.83/4.0

Courses: Data Engineering Systems, Natural Language Processing, Modeling and Representation of Data, Data Analysis at Scale in Cloud, Principles of Machine Learning, Computer Engineering Machine Learning and Deep Neural Nets

National University of Singapore
Coursework, Master of Computer Science

Jan 2020 - Jun 2020
GPA: 4.50/5.0

Courses: Knowledge Discovery and Data Mining, Neural Network and Deep Learning, Big Data Analytics Technology

Tsinghua University
Bachelor of Environmental Engineering

Aug 2011 – Jul 2015
GPA: 3.78/4.0

Courses: Fundamentals of Computer Software Techniques

Experience


Google. Google Photos
Software Engineer

Jul 2022 - Present
Los Angeles, CA

Project: Movie Editor

  • Developed a novel movie editor, enhancing overall user experience.
  • Orchestrated the process of content curation, integrating algorithms to autonomously select optimal content for customized movies.

Facebook, Inc. Facebook AI
Software Engineer Intern

May 2021 - Aug 2021
Menlo Park, CA

Project: Automated Machine Translation Toolkit

  • Developed a multilingual machine translation toolkit compatible with various PyTorch models, with command-line tools for data pre-processing, training, and evaluation.
  • Sped up the entire pipeline by over 20 times via asynchronous processing.
  • Integrated Hydra and OmegaConf to support configuration inheritance, overrides, and sweeps with YAML.
  • Instituted a code snapshotting system for consistency in long-term and large-scale distributed training.
  • Utilized Slurm to manage weeks-long training jobs across 300 languages and hundreds of TB of data on large-scale compute clusters.

Dianrong Information Technology Ltd.
      —— top fintech company
Algorithm and R&D Engineer

Jun 2018 – Dec 2018
Chengdu, China

Project: PPDAI 3rd Magic Mirror Data Application Contest

  • Constructed a deep learning model using TensorFlow for pairwise question classification in a customer chatbot scenario, ranking in the top 29% among 579 teams with accuracy up to 96.8%.
  • Raised semantic similarity assessment accuracy from 86% to over 95% by building a Siamese network with TensorFlow and fine-tuning it on Linux servers.

Project: Tarot Divination App

  • Created a tarot divination app, scaling to 120,000 registered users without any marketing expenses.
  • Introduced social features and an in-game currency mechanism to drive user engagement, reaching 2,500 daily unique visitors and 10,000 weekly unique visitors.
  • Protected user data with MD5 hashing and improved database interoperability by using SQLAlchemy for MySQL access.
  • Devised an independent module for operation assistants, providing real-time user data insights.

Project: Customer Service Inspection System

  • Engineered a Smart Inspection System using Flask and Docker for seamless monitoring of customer service calls.
  • Implemented text classification and clustering algorithms with scikit-learn, facilitating efficient customer service review.
  • Established hierarchical permission levels to streamline team collaboration while ensuring individual team autonomy.
  • Implemented comprehensive unit tests and fostered strong collaboration with front-end and testing teams to guarantee system robustness.

Think Tank 2861 Info Tech
  —— governmental IT advisory
Algorithm and R&D Engineer

Jul 2017 – Apr 2018
Chengdu, China

Project: Public Opinion Analysis

  • Integrated public opinion analysis with resource distribution to evaluate county government performance, resulting in a 90% positive response in user feedback.
  • Built and deployed a distributed, multi-threaded web crawler on Alibaba Cloud, scraping 60 million reviews from major social media platforms and storing them in a PostgreSQL database.
  • Constructed a dynamic sentiment analysis model, comparing logistic regression, Naïve Bayes, random forest, SVM, k-nearest neighbors, RNN, and other algorithms to optimize performance.
  • Devised quantitative indicators for evaluating resource distribution in 2,861 counties using Point-of-Interest (POI) data from open map APIs.

Project: Social Network Analysis

  • Established a hierarchical social network of government officials, utilizing data from 70,307 resumes to map out 144,504 interconnections.
  • Improved data integrity from 55.2% to 81.5% through advanced data cleaning techniques and an innovative inference strategy.
  • Devised a connection-level classification system and built a recommendation engine based on the Six Degrees of Separation theory.

Contact Me

Email: thuzhangyw@gmail.com

Phone: 919-939-4132