About

A mathematics graduate diving into data and computer science. Keen on new technologies and able to pick them up and apply them in a short time. From t-tests to deep learning; from infrastructure to micro-services.

I graduated from the University of Hong Kong with a degree in Mathematics. Soon after graduation, I joined the fantastic team building the cloud infrastructure and data analytics for OpenUp, the first online counselling platform in Hong Kong. I used Chef Infra to set up virtual machines on Azure, covering everything from network routing and port configuration to setting up MongoDB, analytics platforms such as JupyterLab and RStudio Server, and visualisation platforms such as Shiny dashboards. I also performed statistical analysis on the data, applying a zero-truncated one-inflated negative binomial regression to count data to identify repeat users, which helped the service team with manpower allocation.

I then moved to the UK to join Tailify Software. I handled tens of millions of rows of data with PySpark, applied naive Bayes and softmax regression in Keras to build machine learning models that predict audience demographics, and deployed the models via FastAPI. I also applied mathematical knowledge to real-world business cases, for example enhancing an existing cosine similarity matching model by adding a correlation matrix in the middle:

\[\text{similarity} = \mathbf{x}^T K \mathbf{y} = \mathbf{x}^T K^{1/2} K^{1/2} \mathbf{y} = (K^{1/2} \mathbf{x})^T (K^{1/2} \mathbf{y}),\] where \(K\) is the correlation matrix of the features in both \(\mathbf{x}\) and \(\mathbf{y}\) (note that, because \(K\) is positive definite, \(K^{1/2}\) is the unique symmetric positive definite matrix such that \(K^{1/2} K^{1/2} = K\)). This modified matching model no longer assumes that the features are completely independent of each other, and it matched 10-15% more top performers (measured by conversion rate; the figure depends on the brand). I also wired Google Drive, Google Sheets, Elasticsearch, PostgreSQL and Slack together to build micro-services that let the client team look up the above matching score, and gained exposure to dashboard building in Grafana and Google Looker.
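
The matching score above can be illustrated with a minimal NumPy sketch. It is not the production code: the function names, the toy data and the normalised "cosine" variant are assumptions made for illustration. It checks numerically that \(\mathbf{x}^T K \mathbf{y}\) equals the plain dot product of the transformed vectors \(K^{1/2} \mathbf{x}\) and \(K^{1/2} \mathbf{y}\).

```python
import numpy as np


def sqrtm_psd(K):
    """Symmetric square root of a symmetric positive semi-definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(K)
    # Clip tiny negative eigenvalues caused by floating-point error.
    eigvals = np.clip(eigvals, 0.0, None)
    # V diag(sqrt(lambda)) V^T
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T


def correlated_similarity(x, y, K):
    """Bilinear similarity x^T K y from the formula above."""
    return float(x @ K @ y)


def correlated_cosine(x, y, K):
    """Assumed normalised variant: cosine between K^(1/2) x and K^(1/2) y."""
    return correlated_similarity(x, y, K) / np.sqrt(
        correlated_similarity(x, x, K) * correlated_similarity(y, y, K)
    )


# Toy feature table (hypothetical): 200 samples, 3 features,
# with feature 1 deliberately correlated with feature 0.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 3))
features[:, 1] += 0.8 * features[:, 0]
K = np.corrcoef(features, rowvar=False)

x = np.array([1.0, 0.0, 0.5])
y = np.array([0.0, 1.0, 0.5])

K_half = sqrtm_psd(K)
direct = correlated_similarity(x, y, K)
via_root = float((K_half @ x) @ (K_half @ y))

print(direct, via_root)            # the two values agree up to rounding
print(correlated_cosine(x, y, K))  # normalised score
```

With \(K\) set to the identity matrix the score reduces to the ordinary dot product, which is the independence assumption the modified model drops.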



Wilson Yip