The big idea: You've spent the year learning how to read and understand data. Now you're going to use AI as a coding partner to build an interactive data visualization that goes beyond what pencil and paper can do — one that lets a viewer explore your data, not just look at it. You'll pick a real dataset, ask a question worth answering, and let the data tell its story.
Schedule: How These Days Break Down

  • Phase 1 + 2 (Data & First Build): Monday 5/11 → submit baseline visual
  • Phase 3 (Go Beyond): Thursday 5/14 → submit final + transcripts
  • Present: Friday 5/15

Phase 1: Find Your Dataset

You'll find your dataset at openintro.org/data. Every dataset there is clean, well-documented, and comes with a description of every variable. Browse until something catches your interest.

💡 What to look for: Your data should include at least three variables (more is better!), ideally a mix of categorical and quantitative. Samples should also be large enough to meet the conditions for familiar inference procedures. Each dataset page includes a description, variable list, and a direct CSV download link — use that CSV file in Phase 2.

Explore the list of datasets before picking one, and choose something you're actually curious about!
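
One way to check a CSV against the criteria above before committing to it is a short script. The sketch below is only an illustration, not part of the assignment: it assumes Python 3, treats blank cells and "NA" as missing, and uses a made-up four-row sample with hypothetical column names. It guesses C or Q for each column and reports how many rows and non-missing values there are.

```python
import csv
import io

MISSING = {"", "NA"}  # assumption: how missing values appear in the file


def is_number(text):
    """Return True when the cell parses as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def summarize_variables(csv_text):
    """Return (row count, {column: type guess, non-missing count, distinct count})."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    summary = {}
    for name in reader.fieldnames or []:
        values = [row[name].strip() for row in rows
                  if row[name] is not None and row[name].strip() not in MISSING]
        # Q only if every non-missing value is numeric; otherwise C.
        kind = "Q" if values and all(is_number(v) for v in values) else "C"
        summary[name] = {"type": kind, "n": len(values), "distinct": len(set(values))}
    return len(rows), summary


# Made-up sample data, for illustration only.
sample = """homeownership,annual_income,loan_amount
RENT,52000,10000
OWN,81000,NA
MORTGAGE,67000,15000
RENT,43000,8000
"""

n_rows, summary = summarize_variables(sample)
print(n_rows, "rows")
for name, info in summary.items():
    print(name, info)
```

To run it on your own download, replace `sample` with the text of your file, for example `open("your_file.csv", encoding="utf-8").read()`. Treat the C/Q labels as a first guess: a numeric code such as a ZIP code parses as a number but is still categorical.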

Before Moving On, Answer These Together

These are the questions a statistician asks before touching any tool. Answer them in writing — you'll need them for your presentation.

  • Who are the individuals? Each row in your dataset represents one _____. Be specific — not just "people," but "adult borrowers who applied for a loan through LendingClub."
  • What is the population? Describe the full group your dataset is meant to represent. Who or what are you ultimately trying to understand?
  • Sample or population? Is your dataset the entire population, or a sample drawn from it? If it's a sample, was it collected in a way that makes it representative?
  • What are the variables? List each variable, note whether it's categorical (C) or quantitative (Q), and write one sentence about what it measures. The OpenIntro dataset page lists these — read them carefully.
  • What question do you want to answer? Write one specific question your visualization will help address. Not "what does the data show?" — something like "Do loan approval rates differ by home ownership status, and does income explain that gap?"
Phase 2: Build Your Baseline Visualization

Open a large language model (LLM) of your choice, such as ChatGPT, Gemini, or Claude, and start a new conversation. Note: the minimum age requirement is 13+ for Gemini and ChatGPT and 18+ for Claude. Copy the prompt below, upload your CSV, and let the LLM ask its clarifying questions before it builds anything.

🛠️ Before pasting the prompt, activate the live preview tool in your LLM — this lets you see the visualization as it's built. In ChatGPT or Gemini, look for Canvas in the toolbar. In Claude, look for Artifacts. Turn it on first, then paste the prompt.
📋 Paste this prompt into the LLM to start
You are a web-design expert and co-designer of a data visualization tool that will be submitted as part of a final project in AP Statistics. I can provide statistical insights and questions, and you will provide the code to build a dynamic, interactive visual exploration of a dataset.

Start by asking me to upload my dataset. Once I've shared it, continue with the following questions. Ask each question one at a time, and ask follow-up questions if my answers are not clear:

1. Is this sample or population data? Who or what is the observational unit?
2. Which 2–3 variables am I most interested in exploring?
3. Which AP Statistics chart type do I want as my baseline? (bar chart, mosaic plot, side-by-side bar graph, scatterplot, histogram, dot plot, or stem-and-leaf plot)
4. What question do I want this visualization to help answer?

After I answer, build the baseline chart as a clean, well-labeled, interactive HTML file. Make it look polished: good fonts, thoughtful colors, and clear axis labels. Show me the result and wait for my feedback before adding anything else.

Your Baseline Must Include At Least One of These

  • Bar Chart / Side-by-Side Bars: Comparing counts or proportions across categories
  • Mosaic Plot: Two categorical variables; area = count
  • Histogram: Distribution of one quantitative variable
  • Scatterplot: Relationship between two quantitative variables
  • Dot Plot: Small dataset; individual values visible
  • Stem-and-Leaf Plot: Small dataset; shows shape and individual values

⚠️ Before you submit: AI can make mistakes, and you should check outputs for accuracy. Does the chart type match your data? Are axes labeled with units? Does the title describe what the chart actually shows? These are the same things your teacher checks on a free-response.
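
One way to check a chart's accuracy is to recompute its numbers yourself and compare them with what the chart displays. The sketch below assumes Python 3 and a made-up seven-row sample with hypothetical column names. It tallies a two-way table of counts and the proportions within each row category, which are the numbers a side-by-side bar graph or mosaic plot of two categorical variables should be drawing.

```python
import csv
import io
from collections import Counter


def two_way_counts(csv_text, row_var, col_var):
    """Count rows for every (row_var, col_var) pair, skipping blank cells."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        a, b = row[row_var], row[col_var]
        if a and b:
            counts[(a, b)] += 1
    return counts


def row_proportions(counts):
    """Convert counts to proportions within each row category."""
    totals = Counter()
    for (a, _), n in counts.items():
        totals[a] += n
    return {(a, b): n / totals[a] for (a, b), n in counts.items()}


# Made-up sample data, for illustration only.
sample = """homeownership,approved
RENT,yes
RENT,no
RENT,no
OWN,yes
OWN,yes
MORTGAGE,yes
MORTGAGE,no
"""

counts = two_way_counts(sample, "homeownership", "approved")
props = row_proportions(counts)
for pair in sorted(counts):
    print(pair, counts[pair], round(props[pair], 3))
```

If a bar height or tile in your visualization disagrees with these numbers, that is a specific, checkable problem to bring back to the LLM.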

Submitting Your Baseline

Save your file as teammate1_teammate2.html using both of your last names (e.g., chen_morgan.html). Download it from the LLM's code panel and submit it to Google Classroom by the end of Monday's class. This is your checkpoint — it doesn't need to be finished, but it needs to run correctly in a browser.

Also submit your chat transcript from today's session. Export or copy it as a PDF or shared link and attach it to the same Classroom submission. The transcript is evidence that you were directing the work — it should show you asking questions, pushing back, and making decisions, not just accepting whatever the LLM produced first.

Phase 3: Go Beyond Paper

This is the part that makes your visualization worth presenting. Your job on Thursday is to add at least one interactive layer that paper and pencil simply cannot do — something that lets a viewer explore your data rather than just look at a picture of it.

Before you ask the LLM to build anything, decide with your group what you want and why. The feature should serve the questions you wrote down in Phase 1, not just look impressive. Be ready to explain that connection on Friday.

Some Directions to Consider

  • Drill-down on hover: Hovering a bar or tile reveals a breakdown by a third variable inside it
  • Filter / toggle: Dropdowns or checkboxes let the viewer slice the data and watch the chart update
  • Tooltip storytelling: Hovering a data point shows rich context, not just the value but what it means
  • Animated comparison: A play button steps through time or ordered categories and animates the chart
  • Linked views: Two charts side-by-side that highlight each other when you click
  • Your own idea: If you have something specific in mind, describe it to the LLM and see what's possible

Submitting Your Final File

By the end of Thursday's class, submit your updated teammate1_teammate2.html to Google Classroom. This is the version that will be hosted in the class gallery at mathclass.today, so make sure it opens correctly in a browser with no internet connection required.
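
An LLM-built page may load charting libraries or fonts from another server, and those fail without a connection. A rough check, assuming Python 3, is to list every `src` or `href` that points off the page. The sample page and the `cdn.example.com` address below are made up for illustration. The check does not catch requests made from inside JavaScript or CSS, so still test by opening the file with Wi-Fi turned off.

```python
from html.parser import HTMLParser

EXTERNAL_PREFIXES = ("http://", "https://", "//")


class ExternalResourceFinder(HTMLParser):
    """Collect src/href values that point to another server."""

    def __init__(self):
        super().__init__()
        self.external = []

    def handle_starttag(self, tag, attrs):
        # Ordinary <a> links are skipped: they only matter when clicked.
        if tag == "a":
            return
        for name, value in attrs:
            if name in ("src", "href") and value and value.startswith(EXTERNAL_PREFIXES):
                self.external.append((tag, value))


def find_external_resources(html_text):
    """Return a list of (tag, url) pairs the page would fetch from elsewhere."""
    finder = ExternalResourceFinder()
    finder.feed(html_text)
    return finder.external


# Made-up page, for illustration only.
sample_page = """<html><head>
<script src="https://cdn.example.com/chart.js"></script>
<link rel="stylesheet" href="style.css">
</head><body><a href="https://www.openintro.org/data">source</a></body></html>"""

for tag, url in find_external_resources(sample_page):
    print(tag, url)
```

If the list is not empty, one option is to ask the LLM to inline that library or rewrite the feature without it, then rerun the check.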

Also submit your Thursday chat transcript alongside the file. Together, your Monday and Thursday transcripts are the paper trail for your decisions. If you're asked on Friday why you chose a particular chart type, interaction, or color scheme, your transcripts should back up your answer — they should show you proposing ideas, evaluating options, and steering the tool, not just running the LLM's first suggestion.

📌 If something breaks while iterating, tell the LLM exactly what stopped working and ask it to fix only that part. If the code becomes hard to follow, ask the LLM to rewrite the full file cleanly while keeping all existing features. Save a working copy before making big changes.
Present: Friday Presentation · 5 Minutes

Presentations are on Friday, May 15. Each pair has exactly 5 minutes. You'll open your HTML file in the browser and walk the class through your data story. You are not summarizing your process — you are making an argument about what your data shows.

💭 Before you present: Your presentation needs to demonstrate thinking that is yours, not the AI's. Be ready to explain — without hesitation — why you made the choices you did. Expect questions about any of it.

Your 5 Minutes Should Cover

  • The dataset: Who are the individuals? What is the population? Is this a sample or a census?
  • Your question: The specific statistical question you wrote down in Phase 1. Say it out loud.
  • The baseline chart: Walk us through it the way you would on a free-response — shape, center, spread, or association. What does the standard view show?
  • The interactive layer: Demo it live. Then explain: Why did you add this feature? What does it reveal that the static chart couldn't?
  • Your inference: Run one significance test or construct one confidence interval that directly addresses your question from Phase 1. State your inference methods and hypotheses (if applicable), check conditions briefly, report your result, and interpret it in context. You may embed the result in your visualization, but it must be addressed in your presentation.
  • Your finding: What is the one thing you want us to walk away knowing? Connect the visual and the inference.
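
As an illustration of the inference step, here is a minimal sketch of one possible procedure, a two-sided two-proportion z-test, written in Python 3 with only the standard library. The counts are invented, and the right procedure for your project depends on your question and your variable types. The code checks only the Large Counts condition; random sampling and independence still need your own judgment.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 = p2 against Ha: p1 != p2.

    x1, x2 are success counts; n1, n2 are sample sizes.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # combined proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # P(|Z| >= |z|) for a standard normal variable
    p_value = math.erfc(abs(z) / math.sqrt(2))
    # Large Counts: expected successes and failures in both groups are at least 10
    expected = [n1 * pooled, n1 * (1 - pooled), n2 * pooled, n2 * (1 - pooled)]
    return {
        "p1": p1,
        "p2": p2,
        "z": z,
        "p_value": p_value,
        "large_counts_ok": min(expected) >= 10,
    }


# Invented counts, for illustration only: 120 of 200 versus 90 of 200.
result = two_proportion_z_test(120, 200, 90, 200)
for key, value in result.items():
    print(key, value)
```

Whatever procedure you use, the numbers are only the start: you still need to state hypotheses in terms of your population and interpret the result in the context of your Phase 1 question.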

How You'll Be Assessed

What We're Looking For | What It Means | Pts
Statistical accuracy | Chart type matches data type; labels and scales are correct; sample/population distinction is stated clearly | 2
Interactive layer + decision | Goes meaningfully beyond a paper chart; you can explain why you added it and what it reveals | 3
Inference | Correct procedure chosen; conditions addressed; result interpreted in context of your question | 3
Data story + finding | Visual and inference connect to a clear, specific claim about your data | 1
Polish | Fonts are readable, axis labels are clear and correctly sized, titles are accurate, no typographical errors; the visualization looks finished | 1
Presentation | 5-minute limit respected; both partners speak; demo is live and working | 1
Chat transcripts | All transcripts submitted; they show you directing the AI (proposing ideas, making decisions, and pushing back), not just accepting the first output | 1
📌 Your finished file will be hosted in the class gallery at mathclass.today — no laptop needed on Friday.