SENIOR SOFTWARE ENGINEER, SKIA INFRASTRUCTURE AND PERFORMANCE ANALYSIS
9+ years of R&D experience focusing on Big Data Analysis for large distributed system diagnostics and optimization, on various projects for the world's leading IT company.
7 years of Postgraduate Research on High Speed and Optical Networks.
Strong analytical skills for data-driven insights and decision-making. Hands-on experience with system simulations, backend data collection and organization, frontend presentation, and monitoring/alerting pipeline automation.
Experienced in rapid prototyping to test novel ideas and in converting prototypes to production. Able to make complex judgments and develop innovative solutions. Effective in leading technical projects at a strategic level.
Senior Software Engineer with a wide range of applied statistics experience. Quick to adapt to new technologies, systems and tools. Excellent at technical writing, presentations and communication.
Web Frontend/Backend with SQL databases, Google Web Toolkit, JS frontend with Data Visualization tools
Data Mining/Machine Learning Statistical Analysis on Distributed Big Data with MapReduce (Hadoop)
Cloud technologies such as Google's App Engine, BigQuery, Cloud Storage, Cloud Compute, Distributed Database
Effective technical leadership, including leading and managing multidisciplinary teams
Google, Inc.
September 2012 to Current
Senior Software Engineer, Skia Infrastructure and Performance Analysis
Chapel Hill, NC
Skia is Chrome's 2D graphics engine. The Infrastructure Team supports automated code testing with BuildBots, performance analysis and anomaly alerting.
Led the project to analyze speed performance benchmarks to find correlations and representative samples. Used the results to build perf tracking dashboards that gave timely visual feedback.
Developed dashboards using Google Web Toolkit, AppEngine, distributed MySQL, BigQuery, Cloud Storage, Cloud Compute and a customized storage schema for fast large-scale analysis.
Co-designed and developed a novel automated perf alerting system, which uses K-Means clustering (unsupervised machine learning) and StepFit regression to quantify anomaly degrees for alerting.
Provided other infrastructure support for automated data collection on Linux/Windows/Mac/Android/iOS platforms.
Google, Inc.
October 2008 to September 2012
Senior Software Engineer, Search Infrastructure
Mountain View, CA
Analyzed the lower-level web crawl and indexing infrastructure to support launch decisions and find areas for improvement. Tech lead for teams of Engineers and Statisticians that designed and maintained about a dozen metrics and performed numerous research tasks documented in internal reports. Selected projects:
Studied the performance and characteristics of existing Logistic Regression based Index Selection techniques. Found inefficiencies in the model and training process. Later helped the Selection team evaluate and fine-tune a simpler but better prediction model.
Analyzed and helped restrict or remove the semi-manual parts of index selection from various feed sources, to achieve a more automated quality balance with minimal intervention.
Created an Indexing Simulation Framework. It involved 5 large modules and Engineering teams. I initiated numerous discussion sessions to collect and analyze requests and challenges, and chose an approach flexible enough to adapt to all requirements in the comprehensive end-to-end system.
Designed and implemented PageRank metrics, which measure PageRank shifts over time on different verticals and the quality of PageRank calculations. This work corrected language bias in the evaluation process.
Developed metrics and an interactive dashboard to monitor and evaluate the crawl/indexing timeline, which helped identify and debug infrastructure pipeline bottlenecks for better throughput.
Prototyped an Asian News Monitoring system for evaluating discovery and indexing of Asian news results. Later worked with the News team to extend its functionality.
Designed and implemented Crawl Scheduling metrics to track docs in crawl feeds over time, with interactive diagnostics on the system's efficiency.
Studied Chinese language segmentation quality in the search index against major competitors.
Co-developed and maintained a Search Infrastructure metrics dashboard system, which served the whole Search team for data-driven analysis and improvements.
Google, Inc.
September 2005 to September 2008
Software Engineer, Search Quality
Mountain View, CA
Quality analysis and monitoring of Google Web Search results. Selected projects:
Researched historical search freshness data. Used language/encoding detection and similarity quantification (SimHash and alternatives) techniques for accurate automated measurements.
Prototyped the search result freshness evaluation systems, and extended them to support major competitor comparisons on various search languages. Later implemented continuous eval in production.
Researched webpage change rate distributions; the results helped optimize tiered content refresh strategies.
Designed and implemented a system for monitoring duplicate snippets in Chinese search results.
Researched competitor crawl behaviors using public web data, created extensible prototypes, and later productionized them for daily monitoring.
Created Discovery metrics for measuring coverage and latency in discovering new webpages.
As one of a dozen Android Pioneers, wrote a Gomoku game on the pre-launch Android platform.
One-off research on: comparison of ranking formulas (Kendall's Tau based); cache result click rate; search result coverage measurements; social media search coverage; crawl bandwidth limits; use of sitemaps for coverage improvements.
North Carolina State University
2005
Ph.D.: Computer Science
Raleigh, NC, USA
Thesis: Hierarchical Traffic Grooming in Large-Scale WDM Optical Networks
Analyzed the NP-completeness of the Traffic Grooming problem and its bounds. Used CPLEX Integer Linear Programming to solve special cases. Designed and simulated fast heuristic algorithms that generated near-optimal solutions for special and general network topologies.
Chinese Academy of Sciences, Institute of Software
2001
Master of Science: Computer Engineering, Multimedia Communication & Network Engineering Lab
Beijing, China
Thesis: Design and Implementation of Billing Systems in ISBN Broadband Routers
Created a billing system on top of SNMP and embedded Linux for the research lab's proprietary ISBN routers, with a GUI frontend on Windows.
Beijing Jiaotong University
1998
Bachelor of Science: Management Information Systems (MIS)
Beijing, China