
academic courses

existing courses

applied big data analytics

Everyone wants to plug an LLM into their app, but few know where its training data came from—or how to retrain it responsibly. Everyone wants a "magic" ML model, yet most struggle to transform terabytes of raw logs into usable features. Knowledge graphs are on every roadmap, but almost nobody knows how to model or populate them at scale. Everyone craves real-time dashboards, but wiring stream processors that scale globally is non-trivial. And when it comes to securing data pipelines or turning notebooks into production workflows with repeatable MLOps, even seasoned engineers are guessing.

Applied Big Data Analytics closes those gaps. Each week we start with a real-world scenario—fraud detection at a fintech, content ranking for a social platform, predictive maintenance in energy—then build an end-to-end solution using the right mix of distributed systems, probabilistic data structures, graph technology, real-time stream processing, governance controls, and production-grade MLOps. By the end of the course you will have designed, deployed, and stress-tested industry-grade data products: from data lakes that feed knowledge graphs, to LLM fine-tuning pipelines, to secure streaming inference services. You'll leave not just knowing the buzzwords, but having built the systems behind them.

lectures

  1. Administrivia
  2. Analyzing Petascale Financial Data · analytics · Hadoop/Spark
  3. Storing Petascale Financial Data · warehouses, lakes & meshes
  4. Indexing, Searching & Managing Social Media Data · HLLs, inverted indexes
  5. Detecting Fraud with Connected Data · graph analytics
  6. Making an LLM Smarter · knowledge graphs & GraphRAG
  7. Segmenting E-commerce Users · clustering & dimensionality reduction
  8. Predicting Customer Churn · feature engineering & hyper-parameter tuning
  9. Estimating Real Estate Prices · model drift, MLOps
  10. Forecasting Industrial Machine Failures · time-series · neural nets
  11. Democratising Healthcare Analytics · meta-learning & distributed AutoML
  12. Analyzing an Infinite IoT Sensor Datastream · real-time stream processing
  13. Interpreting World Development Indicators · explainable AI
  14. Analyzing Data in a Chaotic & Unsecure World · privacy-preserving ML

projects

  1. Loan Performance Data Insights
  2. SEC Corporate Filings Insights
  3. NYC Taxi Trip Insights
  4. NYISO Electricity Consumption & Pricing Insights

advanced real-world programming

We have come a long way from the time when it was generally believed that computers themselves would never constitute a scientific field of study, to a time when Computer Science pervades and shapes nearly all aspects of human life. The "smartness" of computers has enabled us to explore the surface of Mars, map the human genome, detect malignant cancer cells, and even discover the "God particle", among much else. At the same time, it is important to remember that a computer by itself is a "dumb" device that needs to be "programmed" to perform actions and tasks.

With all of this in mind, this course has been designed to teach students "to think like a computer scientist". Through a general-purpose programming language, Python, students will be introduced to multiple programming paradigms, including functional, imperative, and object-oriented, with a strong emphasis on writing clean, efficient (in both space and time), and bug-free code. Taking a hands-on approach, we will discuss and analyze how these methodologies map onto real-world problems and algorithms, and students will be expected to get their hands dirty. Importantly, rather than getting into the nitty-gritty of these paradigms, this course aims to equip students with the right skills and tools to venture out into the real world and make an impact in their respective domains.
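As a rough illustration of the three paradigms named above, the sketch below solves one small task (summing the squares of the even numbers in a list) in imperative, functional, and object-oriented style. The task and every name in it are hypothetical and chosen only for illustration; they are not taken from the course material.

```python
from functools import reduce


def sum_even_squares_imperative(numbers):
    """Imperative style: explicit loop that mutates an accumulator."""
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n * n
    return total


def sum_even_squares_functional(numbers):
    """Functional style: compose filter and reduce, no mutable state."""
    evens = filter(lambda n: n % 2 == 0, numbers)
    return reduce(lambda acc, n: acc + n * n, evens, 0)


class EvenSquareSummer:
    """Object-oriented style: the running total lives inside an object."""

    def __init__(self):
        self.total = 0

    def add(self, n):
        # Only even numbers contribute to the total.
        if n % 2 == 0:
            self.total += n * n
        return self  # returning self allows chained calls


data = [1, 2, 3, 4]
summer = EvenSquareSummer()
for value in data:
    summer.add(value)

# All three approaches agree: 2*2 + 4*4 = 20
print(sum_even_squares_imperative(data))
print(sum_even_squares_functional(data))
print(summer.total)
```

All three versions compute the same result. What differs is where the state lives and how the control flow is expressed, which is the kind of trade-off the paradigms invite you to weigh.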

"The computing scientist's main challenge is not to get confused by the complexities of his own making."
— E. W. Dijkstra

Course Material

courses on the anvil

vibe coding

Turn plain-language ideas into production-ready software. Dive into sharp prompting techniques, model fine-tuning, and seamless DevOps flows that let AI generate, refactor, and harden sprawling codebases. Stress-test everything with rigorous test suites, lock down security, and squeeze out every millisecond of performance, so the code your model writes is truly enterprise-grade. Walk away ready to vibe-code complex systems from concept to deployment.

workshops

vibe coding: from thought to construction

There are about 8 billion people in the world.
Roughly 30 million can write software.
More than 1.5 billion can speak English.

For decades, the digital economy has been gated by the second number. Vibe coding asks what happens when the third gets to build too.

what this workshop is about

Vibe coding is a way of building software by describing intent, setting constraints, and iterating through conversation. Instead of translating ideas into code first, it allows ideas to become executable early and to be shaped through use rather than speculation.

This session treats language not as documentation, but as a construction medium.

Workshop Slides