I'm a computer science student at the University of Waterloo
I'm interested in databases, graphics, public transit, and maps
How to reach me: email h99nguye at uwaterloo.ca
Network visualization of the web - Scraped 50,000 blogs from Resonant.live and displayed them as a graph. There are clusters of sites that all link closely together, with topics like rationality, tech, crypto, Canada, and even postgres!
Built with WebGL, TypeScript, and lots of ChatGPT
Some interesting technical challenges I ran into:
Front page of Hacker News [55 comments]: https://news.ycombinator.com/item?id=40136208
Interactive travel time map - A map that shows you how far you can go using public transit. Routing done using Rust on GTFS data, and rendering done using Mapbox GL vector tiles
Resonant.live - A feed and search engine for high quality articles, built with Freeman Jiang.
We built a robust scraping infrastructure to scrape over 500k blogs from 40k different domains. We used embeddings to efficiently search through each article and recommend similar articles for users.
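As a rough illustration of embedding-based recommendation (not the actual Resonant code), the sketch below ranks articles by cosine similarity to a query vector. It assumes each article has already been converted into a vector by some embedding model; the titles, vectors, and function names are made up for the example. A brute-force scan like this only suits small collections, and a system at the scale described above would more plausibly use an approximate nearest-neighbour index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def most_similar(query, articles, k=2):
    """Return the k article titles whose embeddings are closest to the query."""
    ranked = sorted(
        articles,
        key=lambda title: cosine_similarity(query, articles[title]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical 3-dimensional embeddings; real ones have hundreds of dimensions.
ARTICLES = {
    "postgres-indexing": [0.9, 0.1, 0.0],
    "sqlite-internals": [0.8, 0.2, 0.1],
    "toronto-transit": [0.0, 0.1, 0.9],
}
```

Calling `most_similar([1.0, 0.0, 0.0], ARTICLES)` ranks the two database articles ahead of the transit one.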
3D renderer - (demo compiled to WASM) - An image buffer, triangle rasterizer, and 3D renderer written from scratch. Includes shadows, highlights, ambient lighting, and various performance optimizations.
Transit map - (longer video here) - Map visualization of 1000+ Toronto buses travelling in real time. Uses WebGL and some performance optimizations to render that many buses at 60 FPS. A YouTube video shows it better than the GIF.
Full-text search engine - A full-text search engine (like Elasticsearch). Demo includes near-instant textual search of all English Wikibooks articles. Uses db1 as the underlying storage engine
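The core data structure behind most full-text search engines is an inverted index, which maps each term to the set of documents containing it. The sketch below is a minimal in-memory version and is an assumption about the general technique, not a description of this project's code; the `InvertedIndex` class and `tokenize` helper are hypothetical names, and it ignores ranking, stemming, and on-disk storage.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    def __init__(self):
        # term -> set of document ids that contain the term
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every term in the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result
```

A query intersects the posting sets of its terms, so lookup cost depends on how many documents contain each term rather than on the size of the whole collection.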
Custom DNS Nameserver - Authoritative DNS nameserver (based on RFC 1034/1035) written from scratch.
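To give a sense of the wire format a nameserver has to handle, here is a small sketch of the RFC 1035 message layout: a 12-byte header followed by a question whose name is a sequence of length-prefixed labels. The function names are hypothetical and not taken from the project, and the parser deliberately ignores name compression, which a real server must support.

```python
import struct


def encode_name(name):
    """Encode 'example.com' as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("each label must be 1-63 bytes")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(query_id, name, qtype=1, qclass=1):
    """Build a one-question query (qtype 1 = A record, qclass 1 = IN)."""
    # Header fields: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    # Flags 0x0100 sets only the "recursion desired" bit.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, qclass)
    return header + question


def parse_question(message):
    """Read the first question of a message; name compression is not handled."""
    offset = 12  # the question section starts right after the fixed header
    labels = []
    while message[offset] != 0:
        length = message[offset]
        labels.append(message[offset + 1:offset + 1 + length].decode("ascii"))
        offset += 1 + length
    qtype, qclass = struct.unpack("!HH", message[offset + 1:offset + 5])
    return ".".join(labels), qtype, qclass
```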
Coroutines and garbage collector in C -
Userspace context switching, multitasking without threads, or stackful, preemptive multitasking. Implemented with some small inline assembly and register manipulation. Made a fair task scheduler based off the Linux CFS scheduling algorithm.
Mark-and-sweep garbage collector in C. Implemented using pycparser to modify C code, and custom stack frames to find GC roots.
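The mark-and-sweep idea itself can be shown in a few lines. The sketch below is a toy model in Python rather than the C implementation described above: the `Obj` class, the explicit `heap` list, and the `roots` list are all stand-ins for illustration, whereas the real collector has to discover roots by walking stack frames.

```python
class Obj:
    """A toy heap object that can reference other objects."""

    def __init__(self, name):
        self.name = name
        self.refs = []
        self.marked = False


def mark(roots):
    """Mark every object reachable from the roots (iterative, handles cycles)."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj.marked:
            continue
        obj.marked = True
        stack.extend(obj.refs)


def sweep(heap):
    """Keep marked objects, drop the rest, and clear marks for the next cycle."""
    live = [obj for obj in heap if obj.marked]
    for obj in live:
        obj.marked = False
    return live


def collect(heap, roots):
    mark(roots)
    return sweep(heap)
```

Because reachability is computed from the roots, a cycle of objects that nothing else points to is still collected, which is the main advantage over plain reference counting.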
JSON database - A fully transactional, ACID database that can store infinite depth JSON objects. Primitive support for replicated transactions using gRPC.
db1 - My second attempt at building a database. A simpler key-value database with static schema, fixed-size tuples, and columnar compression. Built for use by my search engine
Automatic ATC with A* pathfinding - Routes incoming airplanes to one of two runways while avoiding collisions. Done for a coding assignment.
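For readers unfamiliar with A*, the sketch below shows the search on a small grid. It covers only the pathfinding part: the grid representation, the 4-connected movement, and the Manhattan-distance heuristic are assumptions chosen for the example, and it does not model runway selection or collision avoidance between aircraft.

```python
import heapq


def astar(grid, start, goal):
    """Length of the shortest 4-connected path on a grid of 0 (free) and
    1 (blocked) cells, or None if the goal is unreachable."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance never overestimates on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best = {start: 0}  # cheapest known cost to reach each cell
    frontier = [(heuristic(start), 0, start)]
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            return cost
        if cost > best.get(cell, float("inf")):
            continue  # stale queue entry; a cheaper route was found already
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best.get(nxt, float("inf")):
                    best[nxt] = new_cost
                    heapq.heappush(
                        frontier, (new_cost + heuristic(nxt), new_cost, nxt)
                    )
    return None
```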
Beeeeeep - Experiments with audio processing and modulation to send and receive binary data over sound using my laptop speakers/microphone at 1kbps.
Bridge Static Analysis - Solves forces on bridges and uses random search to minimize cost of bridge.
Interactive ontology visualization - Interactive knowledge graph visualization built using Cytoscape.JS and Python backend. Generated Docker images for easy deployment.
Othello game with AI - An Othello game made for my university class. Implemented a minimax AI algorithm with culling.
Transify - I forked the open source routing engine Graphhopper (Java) for more accurate isochrones on geometries with sparse intersections and batch shortest-path-tree calculations. Decreased runtime of analyzing Brampton transit accessibility from 32 seconds to 2 seconds. This feature is used by transit planners to determine which areas a bus route serves and calculate demographics such as how many jobs or homes are on the route.
Transify - I created a realtime map of bus vehicle locations using Node.JS and React, with automated builds and deployments to GKE. This is useful for transit agencies to see at a glance which buses are delayed and need help along a route.
Dropbase - I built a Pandas-compatible Excel parser optimized for chunked, larger-than-memory reading, based on the open source library Calamine. I then built a Python wrapper around the native Rust library, set up automated GitHub Actions builds, and did extensive testing to make sure the new parser was compliant with Pandas.
Dropbase - When profiling, I found that dateutil was taking up 99% of our task runtimes. This is a very common problem. Since Dropbase tries to accept all CSVs and parse them into a structured format, I created an optimized datetime parser in Rust that handles all types of formats, with or without timezones.
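One common way to avoid the cost of a general-purpose date guesser is to try a short list of explicit formats. The sketch below illustrates that idea in Python with the standard library; it is not the Rust parser described above, and the `FORMATS` list and `parse_datetime` name are assumptions for the example. A production parser would cover far more formats and would typically detect a column's format once and reuse it for every row.

```python
from datetime import datetime

# Hypothetical candidate formats, tried in order.
FORMATS = [
    "%Y-%m-%dT%H:%M:%S%z",  # ISO 8601 with a UTC offset such as +0000
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%d",
    "%m/%d/%Y",
]


def parse_datetime(text):
    """Return a datetime for the first matching format, or None if none match."""
    text = text.strip()
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None
```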
If you're interested, my resume is available, along with my LinkedIn and GitHub. I'd love to chat about anything!