If you're building an AI agent or analyzing data in Python, your JSON needs go a bit deeper than just saving files. You're probably parsing API responses, sending requests with payloads, handling nested JSON, maybe even streaming or validating data.
Take a look at this AI/data analyst JSON cheat sheet (Python edition).
Load API response into Python dict
```python
import json
import requests

response = requests.get("https://api.example.com/data")
data = response.json()  # already parsed to Python dict
```

If you get raw JSON text instead:

```python
data = json.loads(response.text)
```
Extract nested values (basic & smart ways)
Basic:

```python
value = data["person"]["location"]["city"]
```

Safe version:

```python
value = data.get("person", {}).get("location", {}).get("city")
```
With jsonpath-ng (query-style access):

```bash
pip install jsonpath-ng
```

```python
from jsonpath_ng import parse

expr = parse("$.person.location.city")
matches = [match.value for match in expr.find(data)]
```
Handle dynamic or unknown JSON structure
```python
def walk_json(data):
    if isinstance(data, dict):
        for key, value in data.items():
            walk_json(value)
    elif isinstance(data, list):
        for item in data:
            walk_json(item)
    else:
        print(data)
```
Send JSON in an API request (POST)
```python
payload = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Write a poem about tacos"}],
    "temperature": 0.7
}
response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_KEY"},
    json=payload,
)
```
Stream JSON responses (large LLM output or logs)
Example using response.iter_lines():

```python
with requests.post(url, headers=headers, json=payload, stream=True) as r:
    for line in r.iter_lines():
        if line:
            json_line = json.loads(line.decode("utf-8"))
            print(json_line)
```
Flatten nested JSON (for DataFrame)
```python
from pandas import json_normalize

# Nested JSON example:
data = {
    "user": {
        "id": 1,
        "name": "Beyoncé",
        "location": {"city": "Houston", "state": "TX"}
    }
}
flat = json_normalize(data)
```
Works with lists of records too.
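For instance, a list of record dicts flattens into one row per record, with dotted column names (the data here is illustrative):

```python
from pandas import json_normalize

# A list of records, each with nested structure
records = [
    {"user": {"id": 1, "location": {"city": "Houston"}}},
    {"user": {"id": 2, "location": {"city": "Bridgetown"}}},
]
flat = json_normalize(records)
# Columns become dotted paths: "user.id", "user.location.city"
```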
Convert DataFrame → JSON (for saving or API use)

```python
import pandas as pd

df = pd.DataFrame([
    {"name": "Beyoncé", "age": 42},
    {"name": "Rihanna", "age": 36}
])
# Convert to JSON string
json_string = df.to_json(orient="records", indent=2)
```
Validate if a string is valid JSON
```python
def is_json(string):
    try:
        json.loads(string)
        return True
    except ValueError:
        return False
```
Save model outputs (LLM, predictions, etc.) to JSON
```python
outputs = [{"prompt": "hi", "response": "hey there"}]
with open("outputs.json", "w") as f:
    json.dump(outputs, f, indent=2)
```
Load logs/configs for your agent
```python
with open("config.json") as f:
    config = json.load(f)

agent_name = config.get("agent_name", "DefaultAgent")
```
Handle Decimal or datetime
```python
from decimal import Decimal
from datetime import datetime

data = {
    "score": Decimal("92.4"),
    "timestamp": datetime.now()
}

def custom(obj):
    if isinstance(obj, Decimal):
        return float(obj)
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError("Unhandled type")

json.dumps(data, default=custom)
```
Parse JSON from a file full of JSON objects (LLM logs)
```python
with open("logs.jsonl") as f:
    for line in f:
        data = json.loads(line)
        print(data["output"])
```
Advanced query/validation (with jsonschema):

```bash
pip install jsonschema
```

```python
from jsonschema import validate, ValidationError

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"}
    },
    "required": ["name"]
}
data = {"name": "Beyoncé", "age": 42}

try:
    validate(data, schema)
except ValidationError as e:
    print("Invalid:", e)
```
First, a quick mindset shift: when you're working with JSON as an AI agent builder, you're not just reading and writing files. You're…
- Talking to APIs (input/output = JSON)
- Logging predictions
- Saving user settings
- Parsing chat history
- Streaming data
- Building dynamic prompts or chains
- Extracting or validating structured responses
Let's unpack that real quick, huh?
1. Talking to APIs (input/output = JSON)
Whenever your agent hits the OpenAI API (or any AI/ML API), you're sending and receiving JSON.
Example:
```json
{
  "model": "gpt-4",
  "messages": [{"role": "user", "content": "Summarize this..."}]
}
```
You'll either handcraft these payloads or auto-build them from user inputs.
2. Logging predictions
Anytime your agent gives a response, you'll want to save what was input, what came out, what the timestamp was, maybe confidence scores, and which model/version was used.
Why?
- Reproducibility
- Error fixing
- Comparing responses from different agents
You log this in JSON, often one object per run.
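A minimal sketch of that kind of run log, appending one JSON object per line (the field names are just one reasonable layout):

```python
import json
from datetime import datetime, timezone

def log_prediction(prompt, response, model, path="predictions.jsonl"):
    """Append one JSON object per agent run to a JSONL log."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "prompt": prompt,
        "response": response,
    }
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")
```

Appending (mode "a") keeps earlier runs intact, so the log doubles as a history you can replay later.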
3. Saving user settings
You'll start letting users adjust things like:
- Preferred language
- Response length
- Formal vs casual tone
- "Always summarize, never explain"
Instead of using a database at first, you'll just use a .json file per user.
Simple. Works offline. Easy to edit.
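A sketch of that pattern, assuming a settings directory with one file per user (the directory, file layout, and keys are made up for illustration):

```python
import json
from pathlib import Path

DEFAULTS = {"language": "en", "response_length": "short", "tone": "casual"}

def load_settings(user_id, directory="settings"):
    """Load one user's settings file, falling back to defaults for missing keys."""
    path = Path(directory) / f"{user_id}.json"
    if path.exists():
        # Saved values override the defaults
        return {**DEFAULTS, **json.loads(path.read_text())}
    return dict(DEFAULTS)
```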
4. Parsing chat history
Let's say your agent needs context from earlier in the convo, like tools, names, or a goal you set earlier.
Youâll need to track the chat, usually as a list of JSON message objects.
```json
[
  {"role": "user", "content": "Call me Elle."},
  {"role": "assistant", "content": "Hi Elle!"}
]
```
Then you feed this back into GPT for memory/continuity.
5. Streaming data
Sometimes you want to stream a response token-by-token, especially for long outputs.
JSON comes in when:
- You process partial JSON chunks
- You make sure streaming output is still parsable (OpenAI sometimes sends partial JSON)
- You stitch these chunks into a full object on your end
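One way to stitch text fragments back into a full object is to keep appending until the buffer parses (a simplified sketch; real streaming APIs usually wrap each chunk in their own envelope):

```python
import json

def assemble_json(chunks):
    """Accumulate streamed text fragments until the buffer parses as JSON."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        try:
            return json.loads(buffer)
        except json.JSONDecodeError:
            continue  # not complete yet, keep accumulating
    raise ValueError("stream ended before JSON was complete")

# Fragments arriving one at a time
obj = assemble_json(['{"summ', 'ary": "AI helps', ' with surgery"}'])
```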
6. Building dynamic prompts or chains
Let's say your agent has 3 steps: extract question → fetch context → answer it.
You'll store "prompt templates" in JSON and dynamically fill in the blanks.
This way you don't hardcode text into your Python.
```json
{
  "step1": "Identify key question in: {{text}}",
  "step2": "Find relevant data for: {{question}}",
  "step3": "Answer based on context: {{context}}"
}
```
7. Extracting or validating structured responses
Let's say GPT gives back:

```json
{"action": "summarize", "target": "last paragraph", "language": "en"}
```
You'll validate:
- Did it return real JSON?
- Does it have the keys you expected?
- Is "action" one of your allowed operations?
This prevents your code from crashing when GPT gets weird.
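Those three checks fit in one small guard function (the allowed actions and expected keys here are illustrative):

```python
import json

ALLOWED_ACTIONS = {"summarize", "translate", "extract"}  # your allowed operations

def check_response(raw):
    """Return the parsed dict if the LLM output passes all checks, else None."""
    try:
        parsed = json.loads(raw)  # did it return real JSON?
    except json.JSONDecodeError:
        return None
    if not isinstance(parsed, dict) or not {"action", "target"} <= parsed.keys():
        return None  # missing the keys you expected
    if parsed["action"] not in ALLOWED_ACTIONS:
        return None  # not an allowed operation
    return parsed
```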
Wanna squeeze more out of JSON in Python, starting with simplified examples, then ramping up into a focused mini-lab kit for OpenAI agents?
Great!
A. Store & Read Chat Memory (Agent context history)
```python
chat_history = [
    {"role": "user", "content": "Summarize this article"},
    {"role": "assistant", "content": "Sure. Here's a summary..."}
]

with open("chat_log.json", "w") as f:
    json.dump(chat_history, f, indent=2)

# Later, reload it
with open("chat_log.json") as f:
    history = json.load(f)
```
Use: Your agent can pick up where it left off.
B. Merge JSON into prompts dynamically
```python
user_info = {
    "name": "Tate McRae",
    "style": "casual",
    "topic": "how JSON works"
}
prompt = f"Write a blog post for {user_info['name']} in a {user_info['style']} voice about {user_info['topic']}."
```
Use: Compose smarter, personalized prompts using structured data.
C. Extract exact info from LLM JSON output
Let's say your model returns a structured response like:

```json
{"summary": "AI helps with surgery", "confidence": 0.93}
```

```python
response = json.loads(model_output)
print(response["summary"])
```
Use: Don't just print the raw response; grab the piece you need.
D. Keep multiple logs organized by agent run
```python
log = {"run_id": "a1b2c3", "inputs": "Find trends", "outputs": "Here are 3 trends..."}
filename = f"runs/{log['run_id']}.json"
with open(filename, "w") as f:
    json.dump(log, f, indent=2)
```
Use: One JSON file per run. Clean. Searchable. Reproducible.
E. Use JSON to define test cases for your agent
```python
test_cases = [
    {"input": "What's 2+2?", "expected": "4"},
    {"input": "Who is Carlie Hanson?", "expected": "A pop singer"}
]

for case in test_cases:
    output = your_agent(case["input"])
    if output != case["expected"]:
        print(f"FAIL: {case['input']}")
```
Use: Turn test prompts into JSON, keep test logic separate from code.
F. Batch model inputs using JSONL
```json
{"input": "Summarize this..."}
{"input": "What's next in AI?"}
{"input": "Explain JSON in plain English"}
```
Each line = 1 JSON object.
```python
with open("batch.jsonl") as f:
    for line in f:
        task = json.loads(line)
        print("Processing:", task["input"])
```
Use: Scalable bulk processing. Streamlined agent input.
G. Use JSON to inject tools or agent settings
```python
tools = {
    "tools": ["summarizer", "translator", "sentiment_analyzer"],
    "defaults": {
        "language": "en",
        "max_tokens": 1000
    }
}
agent_config = json.dumps(tools, indent=2)
```
Use: Your agent reads this as part of its setup or self-instruction.
H. Trick: JSON within JSON for nested agents
You can save the actual full prompt or response history as a string inside a JSON:
```python
task = {
    "name": "long_summary",
    "raw_prompt": json.dumps(chat_history),
    "result": "It's a summary of five pages..."
}
```
Use: When nesting structured agent flows.
Part 2: Stuff no one warns you about but you'll 100% be doing
Pre-validating the prompt as a JSON-safe string
Before you stick something into a JSON payload, you'll:
- Escape quotes
- Strip bad characters
- Truncate long strings
- Catch encoding bugs
Otherwise, OpenAI rejects your request or gives bad output.
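In practice json.dumps already escapes quotes and control characters for you, so the manual work is mostly stripping and truncating (a sketch with an arbitrary length cap):

```python
import json

def json_safe(text, max_len=2000):
    """Clean and truncate text before embedding it in a JSON payload."""
    # Drop stray surrogates and other unencodable characters
    text = text.encode("utf-8", errors="ignore").decode("utf-8")
    # Truncate long strings
    text = text[:max_len]
    # json.dumps escapes quotes, backslashes, and newlines for us
    return json.dumps(text)

fragment = json_safe('He said "hi"\nthen left')
```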
Auto-cleaning malformed JSON from LLM output
Sometimes GPT says it'll return valid JSON… then doesn't.
You'll need little auto-fixers like:
- Add missing closing brackets
- Replace single quotes with double quotes
- Strip out rogue \n\n before {
This is the dirty job that saves your app.
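A naive fixer along those lines (pure heuristics, so it won't repair every malformed output, and the quote swap can mangle text containing apostrophes):

```python
import json

def try_fix_json(raw):
    """Apply small repairs before parsing suspect LLM output."""
    text = raw.strip()
    # Strip rogue text (including \n\n) before the first brace
    start = text.find("{")
    if start > 0:
        text = text[start:]
    # Replace single quotes with double quotes (crude heuristic)
    if "'" in text and '"' not in text:
        text = text.replace("'", '"')
    # Add missing closing brackets
    text += "}" * (text.count("{") - text.count("}"))
    return json.loads(text)
```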
Building JSON from scratch, line-by-line
When debugging or prototyping, you'll literally build JSON yourself:

```python
out = {"step": 1}
if user_input:
    out["user_said"] = user_input
if summary:
    out["summary"] = summary
```

Way safer than dumping a whole LLM response into one json.loads() and praying.
Creating your own mini-specs
Eventually, you'll invent your own rules for agent outputs:

```json
{
  "action": "summarize",
  "priority": "high",
  "language": "en",
  "style": "playful"
}
```
You'll document this in a comment or README so other devs know how to format things for your agent. It's like API design but micro.
Compressing big JSONs to keep under token limits
You'll often be like, "Damn, this whole chain + memory + context is too long."
So you'll:
- Remove unnecessary keys
- Truncate text fields
- Convert verbose JSON to compact mode
Useful when passing context into prompts.
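The "compact mode" part is just dropping whitespace with json.dumps's separators argument (the context dict here is illustrative):

```python
import json

context = {"summary": "AI helps with surgery", "tags": ["health", "ai"], "debug_info": "..."}

# Remove unnecessary keys before passing as context
context.pop("debug_info", None)

# Compact encoding: no spaces after commas or colons
compact = json.dumps(context, separators=(",", ":"))
```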
Creating fallback values for missing JSON fields
Agent asks "What's their preferred tone?"
GPT forgot to include it in the JSON? You'll use:

```python
tone = parsed.get("tone", "neutral")
```
This is one of those boring-but-vital things that keeps agents from breaking.
Parsing GPT output that has JSON inside a string
Sometimes GPT returns this:

```json
{"result": "{\"topic\": \"race cars\", \"angle\": \"for beginners\"}"}
```

So you'll do:

```python
outer = json.loads(raw)
inner = json.loads(outer["result"])
```
That double-layer is sneakily common with nested agents.
Storing tiny caches or lookups
Don't want to call GPT for the same thing twice?
Store previous results in a quick JSON cache:
```json
{"summarize:carbon emissions": "Carbon emissions are..."}
```
Saves cost. Saves time. Keeps your agent sharp.
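A throwaway file-backed cache built on that idea (the function and file names are illustrative):

```python
import json
from pathlib import Path

def cached_call(key, compute, cache_path="cache.json"):
    """Return the cached result for key if present, else compute and store it."""
    path = Path(cache_path)
    cache = json.loads(path.read_text()) if path.exists() else {}
    if key not in cache:
        cache[key] = compute()  # e.g. an actual GPT call
        path.write_text(json.dumps(cache, indent=2))
    return cache[key]
```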
Serializing non-serializable stuff
You'll eventually want to save objects like datetime, Path, etc.
Python's json module doesn't like that.
You'll write helpers like:
```python
def fix(obj):
    if isinstance(obj, datetime):
        return obj.isoformat()
    return str(obj)
```

And call: json.dumps(obj, default=fix)
Auto-generating JSON schemas to validate GPT output
Once you know the shape of your responses, use tools like pydantic or jsonschema to validate them.
This protects your code like a bouncer at the door.
GPT can't get in unless it fits the rules.
Injecting tiny functions into JSON chains
Yes, GPT can return JSON with an "action" key… but then you'll trigger Python functions based on it.

```python
if response["action"] == "summarize":
    return summarize(response["text"])
```