Portfolio

4 Jun 2024

Budgeting App

GitHub 🔗 An ongoing project of mine, taking inspiration from Monzo’s budgeting burndown feature. I’m building this web app with Ruby on Rails, and using the GoCardless API to securely link users’ bank accounts to the app. The inspiration for the project: Monzo’s ‘targets’ tab. So far, I am building the MVP, and have implemented functionality to link a bank account with the app and to store the user’s key for accessing their bank account data in server-side sessions, which are more secure than cookie-based session storage because the key never leaves the server.
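
To illustrate why server-side sessions keep the key out of the browser, here is a minimal, language-agnostic sketch written in Python (not the app's actual Rails code). The in-memory store and the function names `create_session`, `load_session` and `destroy_session` are hypothetical; a real app would typically back this with a database or cache.

```python
import secrets
from typing import Dict, Optional

# Hypothetical in-memory session store, standing in for a database or cache.
_SESSIONS: Dict[str, dict] = {}


def create_session(data: dict) -> str:
    """Keep the data on the server and return an opaque ID for the cookie."""
    session_id = secrets.token_urlsafe(32)  # random, reveals nothing about the data
    _SESSIONS[session_id] = dict(data)
    return session_id


def load_session(session_id: str) -> Optional[dict]:
    """Look up the session data from the ID the browser sends back."""
    return _SESSIONS.get(session_id)


def destroy_session(session_id: str) -> None:
    """Remove the session, for example on logout."""
    _SESSIONS.pop(session_id, None)


# The browser only ever sees this random ID, never the bank access key.
cookie_value = create_session({"bank_access_key": "example-key"})
```

The trade-off is that the server has to store and expire sessions itself, whereas a cookie-based store pushes that state to the client.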

4 Jun 2024

Fine-tuning GPT-2

GitHub 🔗 A brief Jupyter notebook I made showing how to fine-tune a model using the 🤗 Transformers library. The example I wrote uses the popular CNN/DailyMail dataset.

3 Jun 2024

Youtube2Summary

GitHub 🔗 🤗 Pipeline to generate summaries of YouTube videos, using Whisper-Small for transcription and BART-LARGE-XSUM for summarisation. BART has been fine-tuned on the popular CNN/Daily Mail dataset, as it lends itself to summarisation tasks. Initially, we attempted to fine-tune GPT-2 for the summarisation task, but found it performed poorly: as a decoder-only generative transformer, it is trained only to predict the next token, whereas BART is an encoder-decoder model that reads the whole input before generating an abstractive summary.
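
The transcribe-then-summarise flow described above might look roughly like the sketch below. It assumes the audio has already been extracted from the video to a local file, and that the Hub checkpoints `openai/whisper-small` and `facebook/bart-large-xsum` match the models named here; the helper names and the 400-word chunk size are illustrative choices, not the project's actual code.

```python
from typing import List


def chunk_words(text: str, max_words: int = 400) -> List[str]:
    """Split a transcript into word chunks short enough for the summariser."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarise_audio(audio_path: str) -> str:
    """Transcribe an audio file, then summarise the transcript chunk by chunk."""
    # Imported lazily so chunk_words can be used without the heavy dependency.
    from transformers import pipeline

    # Assumed checkpoints; chunk_length_s lets Whisper handle long recordings.
    transcriber = pipeline(
        "automatic-speech-recognition",
        model="openai/whisper-small",
        chunk_length_s=30,
    )
    summariser = pipeline("summarization", model="facebook/bart-large-xsum")

    transcript = transcriber(audio_path)["text"]
    parts = [
        summariser(chunk, truncation=True)[0]["summary_text"]
        for chunk in chunk_words(transcript)
    ]
    return " ".join(parts)
```

Chunking is needed because BART accepts a limited input length, so a long transcript cannot be summarised in a single call.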

16 May 2023

Causal Implicit GAN: Data Augmentation for Causal Discovery

GitHub 🔗 My university dissertation research project, where I designed and trained a GAN model for data augmentation (generating new training samples for downstream models). I was very proud to receive a score of 83 on this dissertation (high 1st). A high-level overview of the CIGAN project. The data the GAN generates is intended for use with causal discovery models, an area where quality ground-truth datasets are hard to come by, making data augmentation a valuable technique.