May 6, 2026

How to Backtest Pump.fun Trading Strategies with Python

Pump.fun tokens move fast. Most go to zero. A few do 100x. If you want to trade them systematically, you need historical data and a way to test your ideas before risking real capital. In this post we will download Pump.fun swap data, reconstruct token prices from on-chain reserves, and backtest a simple momentum strategy in Python.

What you need

pip install pandas pyarrow requests

Download historical swap data

PumpFunData serves hourly Parquet files for two exchanges. pump_fun covers the bonding curve and pump_amm covers the graduated AMM. Each file contains every swap, token creation, and liquidity event for that hour.

Start by checking what date range is available.

import requests
API_KEY = "pfd_your_key_here"
BASE = "https://api.pumpfundata.com"
# Check available dates
resp = requests.get(f"{BASE}/range", params={"exchange": "pump_fun"}, headers={"X-API-Key": API_KEY})
print(resp.json())
# {"exchange": "pump_fun", "start": "...", "end": "...", "files": ...}

Now download a full day of data. That means 24 hourly files, and each one costs 1 credit.

import os

DATE = "2026-04-15"
OUT_DIR = f"data/{DATE}"
os.makedirs(OUT_DIR, exist_ok=True)

for hour in range(24):
    resp = requests.get(
        f"{BASE}/download",
        params={"exchange": "pump_fun", "date": DATE, "hour": str(hour)},
        headers={"X-API-Key": API_KEY},
        timeout=60,  # avoid hanging forever on a stalled connection
    )
    if resp.status_code == 200:
        path = f"{OUT_DIR}/pump_fun_{DATE}_{hour:02d}.parquet"
        with open(path, "wb") as f:
            f.write(resp.content)
        print(f"Downloaded hour {hour:02d}")
    else:
        # Show the status code so auth or credit errors are not mistaken for missing data
        print(f"Hour {hour:02d} not available (HTTP {resp.status_code})")
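Because each file costs a credit, it can be worth checking what is already on disk before rerunning the loop. Here is a minimal sketch that assumes the same file naming as the download loop. `missing_hours` is a hypothetical helper written for this post, not part of any PumpFunData client.

```python
import os


def missing_hours(out_dir: str, date: str, exchange: str = "pump_fun") -> list[int]:
    """Return the hours (0-23) that have no non-empty Parquet file in out_dir."""
    missing = []
    for hour in range(24):
        path = os.path.join(out_dir, f"{exchange}_{date}_{hour:02d}.parquet")
        # Treat zero-byte files as missing: they may be left by an interrupted download
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(hour)
    return missing
```

Looping over `missing_hours(OUT_DIR, DATE)` instead of `range(24)` should keep a rerun from paying again for hours you already have.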

Load and explore the data

Each Parquet file has a flat schema where every event type shares the same columns, with nulls where a field doesn't apply. The fields you care about most for backtesting are event_type, token_mint, action (buy or sell), timestamp, and the reserve fields.

import pandas as pd
import glob
# Load all hours into one DataFrame
files = sorted(glob.glob(f"data/{DATE}/*.parquet"))
df = pd.concat([pd.read_parquet(f) for f in files], ignore_index=True)
print(f"Total events: {len(df):,}")
print(f"Event types: {df['event_type'].value_counts().to_dict()}")
print(f"Unique tokens: {df['token_mint'].nunique():,}")
# Filter to swaps only
swaps = df[df["event_type"] == "swap"].copy()
print(f"Swaps: {len(swaps):,}")
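To see how the flat schema behaves, it helps to check which columns are null for which event types. The sketch below uses a tiny synthetic frame so it runs on its own. The "create" event type name and the sample values are made up for illustration; only "swap" appears in this post.

```python
import pandas as pd

# Tiny synthetic frame; "create" is a made-up event type name for illustration
events = pd.DataFrame({
    "event_type": ["create", "swap", "swap"],
    "token_mint": ["MintA", "MintA", "MintA"],
    "action": [None, "buy", "sell"],
    "virtual_token_reserve": [None, 1.0e15, 1.1e15],
})

# Fraction of nulls per column, broken out by event type
null_share = (
    events.drop(columns="event_type")
    .isna()
    .groupby(events["event_type"])
    .mean()
)
print(null_share)
```

Swapping `events` for the real `df` gives a quick map of which fields are populated for each event type before you build strategy logic on them.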

Reconstruct token prices from reserves

Pump.fun's bonding curve doesn't have a traditional order book. Instead, each swap event includes the pool's virtual_lamports_reserve and virtual_token_reserve after the trade. You can get the price per token in SOL by dividing those two values. These virtual reserve fields only exist on pump_fun data, not pump_amm, so make sure you downloaded the right exchange.

LAMPORTS_PER_SOL = 1_000_000_000

def calc_price_sol(row):
    """Price per token in SOL from bonding curve reserves."""
    if row["virtual_token_reserve"] == 0:
        return None
    return (
        row["virtual_lamports_reserve"] / row["virtual_token_reserve"]
    ) / LAMPORTS_PER_SOL

swaps["price_sol"] = swaps.apply(calc_price_sol, axis=1)
swaps["time"] = pd.to_datetime(swaps["timestamp"], unit="s")

Now each swap row has a price_sol column you can use for charting or strategy logic.
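For charting, swap-level prices are usually easier to read as candles. Here is a minimal sketch that resamples one token's `price_sol` into 1-minute open/high/low/close bars. `ohlc_candles` is a hypothetical helper, and the demo rows are synthetic. It assumes the `time` and `price_sol` columns built in the previous step.

```python
import pandas as pd


def ohlc_candles(token_swaps: pd.DataFrame, freq: str = "1min") -> pd.DataFrame:
    """Resample one token's swap prices into open/high/low/close candles."""
    prices = (
        token_swaps.dropna(subset=["price_sol"])
        .sort_values("time")
        .set_index("time")["price_sol"]
    )
    candles = prices.resample(freq).ohlc()
    # Intervals with no swaps come back as NaN; carry the last close forward
    candles["close"] = candles["close"].ffill()
    for col in ("open", "high", "low"):
        candles[col] = candles[col].fillna(candles["close"])
    return candles


# Synthetic swaps for one token (timestamps in seconds)
demo = pd.DataFrame({
    "timestamp": [0, 20, 70, 200],
    "price_sol": [1.0, 1.5, 1.2, 2.0],
})
demo["time"] = pd.to_datetime(demo["timestamp"], unit="s")
print(ohlc_candles(demo))
```

On real data you would call it per token, for example on `swaps[swaps["token_mint"] == some_mint]`.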

Backtest a momentum strategy

Here's a simple momentum strategy. Buy a token when its price rises 50% above its first-seen price within the first 10 minutes of trading. Sell when the price reaches 2x the entry price or falls 50% from its peak since entry, whichever comes first.

from dataclasses import dataclass

ENTRY_PUMP = 0.5       # buy after a 50% rise from the first price
TAKE_PROFIT = 2.0      # sell at 2x entry price
STOP_LOSS = 0.5        # sell at a 50% drop from the peak since entry
WINDOW_SECONDS = 600   # look for an entry in the first 10 minutes


@dataclass
class Trade:
    token: str
    entry_price: float
    exit_price: float
    entry_time: float
    exit_time: float
    pnl_pct: float  # stored as a fraction: 0.5 means +50%


def backtest_token(token_swaps: pd.DataFrame) -> Trade | None:
    """Run momentum strategy on a single token's swap history."""
    # Missing prices are NaN after the apply step, so drop them up front
    # instead of comparing against None
    token_swaps = token_swaps.dropna(subset=["price_sol"]).sort_values("timestamp")
    if token_swaps.empty:
        return None

    first_price = token_swaps.iloc[0]["price_sol"]
    first_time = token_swaps.iloc[0]["timestamp"]
    token = token_swaps.iloc[0]["token_mint"]
    if first_price == 0:
        return None

    entry_price = None
    entry_time = None
    peak_price = 0.0

    for _, row in token_swaps.iterrows():
        price = row["price_sol"]
        ts = row["timestamp"]

        # Phase 1: looking for entry
        if entry_price is None:
            if ts - first_time > WINDOW_SECONDS:
                return None  # no entry signal in window
            if price >= first_price * (1 + ENTRY_PUMP):
                entry_price = price
                entry_time = ts
                peak_price = price
            continue

        # Phase 2: manage position
        peak_price = max(peak_price, price)

        # Take profit (TAKE_PROFIT is a multiple of the entry price)
        if price >= entry_price * TAKE_PROFIT:
            return Trade(token, entry_price, price, entry_time, ts,
                         (price - entry_price) / entry_price)

        # Stop loss (drawdown from peak)
        if price <= peak_price * (1 - STOP_LOSS):
            return Trade(token, entry_price, price, entry_time, ts,
                         (price - entry_price) / entry_price)

    # Still holding at end of data: mark the position at the last price
    if entry_price is not None:
        last_price = token_swaps.iloc[-1]["price_sol"]
        return Trade(token, entry_price, last_price, entry_time,
                     token_swaps.iloc[-1]["timestamp"],
                     (last_price - entry_price) / entry_price)
    return None

Run the backtest and analyze results

# Run on all tokens with enough swaps
trades = []
for token, group in swaps.groupby("token_mint"):
    if len(group) < 10:
        continue
    result = backtest_token(group)
    if result is not None:
        trades.append(result)

print(f"Tokens traded: {len(trades)}")

# Compute stats (pnl_pct is a fraction, so multiply by 100 for display)
if trades:
    pnls = [t.pnl_pct for t in trades]
    winners = [p for p in pnls if p > 0]
    losers = [p for p in pnls if p <= 0]
    print(f"Win rate: {len(winners)/len(pnls)*100:.1f}%")
    print(f"Avg win: {sum(winners)/len(winners)*100:.1f}%" if winners else "No winners")
    print(f"Avg loss: {sum(losers)/len(losers)*100:.1f}%" if losers else "No losers")
    # This is the mean return per trade, not a compounded total
    print(f"Mean PnL per trade (equal-weight): {sum(pnls)/len(pnls)*100:.1f}%")

This gives you a quick read on whether the strategy has any edge. From here you can tweak the parameters or add filters like only entering tokens where the creator has launched successfully before. You can also layer in pump_amm data to track what happens after tokens graduate to the AMM.
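A single mean PnL number says little about whether the edge is real or just noise from a handful of big winners. One rough check is a percentile bootstrap on the per-trade returns. This is a sketch, not a rigorous test: `bootstrap_mean_ci` is a hypothetical helper, the sample PnLs are made up, and it treats trades as independent, which same-day Pump.fun trades may not be.

```python
import numpy as np


def bootstrap_mean_ci(pnls, n_resamples=2_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean per-trade PnL."""
    pnls = np.asarray(pnls, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample trades with replacement and record the mean of each resample
    idx = rng.integers(0, len(pnls), size=(n_resamples, len(pnls)))
    means = pnls[idx].mean(axis=1)
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


# Made-up PnL fractions, purely to show the call
sample_pnls = [-0.5, -0.4, 1.0, -0.3, 0.2, -0.6, 1.0, -0.5]
lo, hi = bootstrap_mean_ci(sample_pnls)
print(f"95% CI for mean PnL: {lo*100:.1f}% to {hi*100:.1f}%")
```

With real results you would pass the `pnls` list from the stats step. If the interval comfortably straddles zero, the backtest has not shown an edge yet.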

Ideas for extending the backtest

  • Volume filters. Only enter tokens that have a minimum number of unique wallets or SOL volume in the first few minutes.
  • Creator analysis. Use the token_creator field to track which wallets launch tokens that perform well. Then filter future entries by creator track record.
  • AMM graduation. Combine pump_fun data with pump_amm data to backtest strategies that hold through the bonding curve graduation.
  • Multi-day runs. Download a week or month of data so your results rest on a larger sample than a single day. A full week is 168 credits per exchange at 24 files per day.
  • Slippage modeling. Use the reserve fields to estimate how much slippage your order size would actually incur on the bonding curve.
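As a starting point for the slippage idea, here is a minimal sketch that estimates the cost of a buy from the reserve fields. It assumes the bonding curve behaves as a constant-product pool (x * y = k) on the virtual reserves and ignores fees. Both are assumptions to verify against the protocol before relying on the numbers. `buy_slippage` is a hypothetical helper and the reserve values are illustrative only.

```python
def buy_slippage(virtual_lamports_reserve: float, virtual_token_reserve: float,
                 lamports_in: float) -> tuple[float, float]:
    """Return (tokens_out, slippage) for a buy on a constant-product curve.

    Assumes x * y = k on the virtual reserves and ignores fees.
    Slippage is the average fill price relative to the pre-trade spot price.
    """
    k = virtual_lamports_reserve * virtual_token_reserve
    new_token_reserve = k / (virtual_lamports_reserve + lamports_in)
    tokens_out = virtual_token_reserve - new_token_reserve
    spot_price = virtual_lamports_reserve / virtual_token_reserve
    avg_price = lamports_in / tokens_out
    return tokens_out, avg_price / spot_price - 1


# Illustrative reserves only: 30 SOL and 1e15 raw token units, buying with 1 SOL
LAMPORTS_PER_SOL = 1_000_000_000
tokens, slip = buy_slippage(30 * LAMPORTS_PER_SOL, 1e15, 1 * LAMPORTS_PER_SOL)
print(f"tokens out: {tokens:.3e}, slippage vs spot: {slip:.2%}")
```

Under these assumptions the slippage on a buy works out to the order size divided by the SOL reserve, so small pools punish large orders quickly. The backtest in this post fills at the observed swap price, so subtracting an estimate like this from each entry gives a more conservative PnL.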

Ready to build your own strategy?

PumpFunData has every Pump.fun and Pump.fun AMM swap since February 2026, in hourly Parquet files.