existing courses

applied big data analytics

Everyone wants to plug an LLM into their app, but few know where its training data came from—or how to retrain it responsibly. Everyone wants a "magic" ML model, yet most struggle to transform terabytes of raw logs into usable features. Knowledge graphs are on every roadmap, but almost nobody knows how to model or populate them at scale. Everyone craves real-time dashboards, but wiring stream processors that scale globally is non-trivial. And when it comes to securing data pipelines or turning notebooks into production workflows with repeatable MLOps, even seasoned engineers are guessing.

Applied Big Data Analytics closes those gaps. Each week we start with a real-world scenario—fraud detection at a fintech, content ranking for a social platform, predictive maintenance in energy—then build an end-to-end solution using the right mix of distributed systems, probabilistic data structures, graph technology, real-time stream processing, governance controls, and production-grade MLOps. By the end of the course you will have designed, deployed, and stress-tested industry-grade data products: from data lakes that feed knowledge graphs, to LLM fine-tuning pipelines, to secure streaming inference services. You'll leave not just knowing the buzzwords, but having built the systems behind them.

lectures

  1. Administrivia
  2. Analyzing Petascale Financial Data · analytics · Hadoop/Spark
  3. Storing Petascale Financial Data · warehouses, lakes & meshes
  4. Indexing, Searching & Managing Social Media Data · HLLs, inverted indexes
  5. Detecting Fraud with Connected Data · graph analytics
  6. Making an LLM Smarter · knowledge graphs & GraphRAG
  7. Segmenting E-commerce Users · clustering & dimensionality reduction
  8. Predicting Customer Churn · feature engineering & hyper-parameter tuning
  9. Estimating Real Estate Prices · model drift, MLOps
  10. Forecasting Industrial Machine Failures · time-series · neural nets
  11. Democratizing Healthcare Analytics · meta-learning & distributed AutoML
  12. Analyzing an Infinite IoT Sensor Datastream · real-time stream processing
  13. Interpreting World Development Indicators · explainable AI
  14. Analyzing Data in a Chaotic & Insecure World · privacy-preserving ML

projects

  1. Loan Performance Data Insights
  2. SEC Corporate Filings Insights
  3. NYC Taxi Trip Insights
  4. NYISO Electricity Consumption & Pricing Insights

courses on the anvil

vibe coding

Turn plain-language ideas into production-ready software. Dive into sharp prompting techniques, model fine-tuning, and seamless DevOps flows that let AI generate, refactor, and harden sprawling codebases. Stress-test everything with rigorous test suites, lock down security, and squeeze out every millisecond of performance, so the code your model writes is truly enterprise-grade. Walk away ready to vibe-code complex systems from concept to deployment.

ai is god: philosophy and ethics in the age of ai

Is reality a simulation, and if so, who's in charge? This course tackles AI's toughest thought experiments: the Simulation Argument, Roko's Basilisk, mind uploading, the trolley problem, the Fermi paradox, and the clash between accelerationism and altruism. We'll examine whether machines deserve moral or legal status, how AI warps knowledge and creativity, and what these questions reveal about human identity.