PlotMyData

PlotMyData is an agentic data analysis and visualization system. It follows your prompts to drive an R session.

You can start with example datasets, upload your own data, or download data from a URL. If you want to ask about the data or transform it before plotting, just say what you want to do.

Features

Multiple data sources: Use built-in R datasets or user-provided data (currently CSV files are supported)
Interactive analysis: The system uses an R session so variables persist across invocations
Instant visualization: Plots are shown in the chat interface and are downloadable as PNG files

Agents and tools refined through many usage trials

Help tools
- Provide access to help pages for packages and topics
Data agent
- Knows about R datasets and can access uploaded files or URLs
- Data files are automatically summarized for the LLM
- This lets you describe a plot without knowing the exact variable names
Run agent
- Runs R code generated by the LLM
- If you want to run specific code, just send it in a message
- The LLM decides whether to return results visibly or invisibly, depending on what the task requires
Plot agent
- Tools are provided for making plots with base R graphics (default) and ggplot2
- To use ggplot2, just mention "ggplot" or "ggplot2" in your message
Install agent
- Installs CRAN packages to add capabilities to the running application
- Can be called by other agents or requested by the user
- User confirmation is required for installing any packages
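
As a rough illustration of what the Data agent's automatic summary makes possible, the sketch below builds a short text description of a CSV file (row count, column names, and a guessed type per column) that could be handed to an LLM. This is a minimal sketch in Python using only the standard library, not the project's actual implementation; the function names `summarize_csv` and `_is_number`, the `diagnosis` column, and the sample values are assumptions made up for illustration.

```python
import csv
import io


def _is_number(value):
    """Return True if the string parses as a float (hypothetical helper)."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def summarize_csv(text):
    """Build a short plain-text summary: row count, column names, rough types."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    names = reader.fieldnames or []
    lines = [f"{len(rows)} rows, {len(names)} columns"]
    for name in names:
        values = [row[name] for row in rows]
        # A column counts as numeric only if it has values and all of them parse
        is_numeric = bool(values) and all(_is_number(v) for v in values)
        lines.append(f"- {name}: {'numeric' if is_numeric else 'text'}")
    return "\n".join(lines)


# Made-up sample rows, for illustration only
sample = "radius_mean,diagnosis\n17.99,M\n13.54,B\n"
print(summarize_csv(sample))
```

With a summary like this in its context, the LLM can map a loose description such as "mean radius" to the actual column name.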

Running the application

The application can be run with or without a container.

Containerless

Install R and run install.packages(c("ellmer", "mcptools", "readr", "ggplot2", "tidyverse"))
Install Python with packages listed in requirements.txt
Put your OpenAI API key in a file named secret.openai-api-key
Execute run_web.sh to start an R session and launch the ADK web UI
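
Before launching, it can help to confirm that the key file from the steps above is present and not empty. The following is a small sketch in Python; the function name `read_api_key` is hypothetical, and the startup scripts may read the file differently.

```python
from pathlib import Path


def read_api_key(path="secret.openai-api-key"):
    """Return the API key stored in the given file, with whitespace stripped."""
    key_file = Path(path)
    if not key_file.is_file():
        raise FileNotFoundError(f"API key file not found: {key_file}")
    key = key_file.read_text().strip()
    if not key:
        raise ValueError(f"API key file is empty: {key_file}")
    return key
```

Calling `read_api_key()` from the project directory raises an error with a clear message if the file is missing or blank.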

Containerized

First, build the project. This creates a plotmydata Docker Compose project and a plotmydata-app image.

docker compose build

Now run the project. This uses your OpenAI API key (sk-proj-...) from secret.openai-api-key.

docker compose up

Changing the model

If you want to change the remote LLM from the default (gpt-4o), change it in the startup script (run_web.sh or entrypoint.sh).

To use a local LLM, install Docker Model Runner, then run this command.

docker compose -f compose.yaml -f model-runner.yaml up

See model-runner.yaml to change the local LLM used.

Examples

Plot data

Plot radius_worst (y) vs radius_mean (x) from https://github.com/jedick/plotmydata/raw/refs/heads/main/evals/data/breast-cancer.csv. Add a blue 1:1 line and title "Breast Cancer Wisconsin (Diagnostic)".

Plot functions

Plot a Sierpiński Triangle

Screenshot: chat session with the AI agent plotting a Sierpiński Triangle

Interactive analysis

Save 100 random numbers from a normal distribution in x
Run y = x^2
Plot a histogram of y

Evaluations

Most recent eval run: 74% accuracy on 50 cases with GPT-4o.

Evals history

Accuracy = fraction of correct plots. Plot correctness is judged by a human.

Eval set	Size	Agent version	Accuracy	Notes
04	50	1c3f5bd	0.74	More base graphics and add Install agent: corrr, scatterplot3d, nlme, parcoord, kde, and custom plots
03	40	24fb91f	0.75	Model: gpt-4o
03	40	b8e5f8c	0.38	Add agent for loading and summarizing data
03	40	30c22a1	0.50	Handle uploaded CSV files
02	37	e9180aa	0.49	More base graphics: hist, image, lines, matplot, mosaicplot, pairs, rug, spineplot, plot.window
01	27	e9180aa	0.52	Add help tools to get R documentation
01	27	bb4eead	0.41	Mainly base graphics: barplot, boxplot, cdplot, coplot, contour, dotchart, filled.contour, grid (Model: gpt-4o-mini)
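
The accuracy values in the table are the fraction of plots judged correct. As a worked example in Python, 37 correct plots out of 50 gives 0.74. The count of 37 is inferred from the reported accuracy and set size rather than stated in the eval records, and the function name `accuracy` is made up for this sketch.

```python
def accuracy(judgments):
    """Fraction of plots judged correct; judgments is a list of booleans."""
    if not judgments:
        raise ValueError("no judgments to score")
    return sum(judgments) / len(judgments)


# Assumed split for illustration: 37 correct and 13 incorrect out of 50 cases
judgments = [True] * 37 + [False] * 13
print(round(accuracy(judgments), 2))  # 0.74
```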

Evals info

The repo tracks both evaluation sets and prompt sets. For example, the evals/01 directory contains all results for the first evaluation set, run with different prompt sets. Each result file is named with the short commit hash of the prompt set used for that evaluation.

Each eval consists of a query, reference code, and a reference image. Because of their size, reference and generated images are not stored in this repo.

To run evals, copy the latest eval CSV file to evals/evals.csv. Then run, for example, run_eval.sh 1 to run the first eval. This script (1) saves the tool calls, generated code, and current date to the CSV file and (2) saves the generated image to the evals/generated directory.

After running evals, change to the evals directory and run streamlit run view.py to edit the eval CSV file. This app allows:

Choosing an eval to edit
Viewing the reference and generated images side-by-side
Indicating whether the generated plot is correct (True or False)
Editing other eval data (e.g. query, file name for data upload, reference code, notes)
Adding new evals

Architecture

An Agent Development Kit client is connected to an MCP server from the mcptools R package
The startup scripts launch a persistent R session with some preloaded packages and helper functions
Data files are saved in a temporary directory using ADK's artifacts and callbacks
- This is how the R session can access the files
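
The idea behind the temporary directory can be sketched in Python: the uploaded bytes are written to a file on disk, and the resulting path is what a separate process such as the R session can open. This sketch uses only the standard library and does not show ADK's artifacts or callbacks; the function name `save_upload` and the file name `points.csv` are hypothetical.

```python
import tempfile
from pathlib import Path


def save_upload(data: bytes, filename: str, directory: str) -> str:
    """Write uploaded bytes into directory and return the absolute file path."""
    # Keep only the base name so a crafted filename cannot escape the directory
    safe_name = Path(filename).name
    target = Path(directory) / safe_name
    target.write_bytes(data)
    return str(target.resolve())


with tempfile.TemporaryDirectory() as tmp_dir:
    path = save_upload(b"x,y\n1,2\n", "points.csv", tmp_dir)
    # Another process could now read the file at this path
    print(Path(path).name)
```

Passing a file path rather than file contents keeps large datasets out of the LLM's context while still letting the R code read them.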

Container notes:

The Docker image is based on rocker/r-ver and adds R packages and a Python installation
Docker Compose is used for port mapping, secrets, and watching file changes with Compose Watch

Licenses

The code in this repo is licensed under the MIT License
Some examples used in evals are taken from R and are licensed under GPL-2|GPL-3
breast-cancer.csv (from UCI Machine Learning Repository via Kaggle) is licensed under CC BY 4.0
