May 6, 2026
How to Backtest Pump.fun Trading Strategies with Python
Pump.fun tokens move fast. Most go to zero. A few do 100x. If you want to trade them systematically, you need historical data and a way to test your ideas before risking real capital. In this post we will download Pump.fun swap data, reconstruct token prices from on-chain reserves, and backtest a simple momentum strategy in Python.
What you need
- Python 3.10+ with pandas, pyarrow, and requests
- A PumpFunData API key

pip install pandas pyarrow requests

Download historical swap data
PumpFunData serves hourly Parquet files for two exchanges. pump_fun covers the bonding curve and pump_amm covers the graduated AMM. Each file contains every swap, token creation, and liquidity event for that hour.
Start by checking what date range is available.
import requests

API_KEY = "pfd_your_key_here"
BASE = "https://api.pumpfundata.com"

# Check available dates
resp = requests.get(f"{BASE}/range", params={"exchange": "pump_fun"}, headers={"X-API-Key": API_KEY})
print(resp.json())
# {"exchange": "pump_fun", "start": "...", "end": "...", "files": ...}

Now download a full day of data. That means 24 hourly files, and each one costs 1 credit.
import os

DATE = "2026-04-15"
OUT_DIR = f"data/{DATE}"
os.makedirs(OUT_DIR, exist_ok=True)

for hour in range(24):
    resp = requests.get(
        f"{BASE}/download",
        params={"exchange": "pump_fun", "date": DATE, "hour": str(hour)},
        headers={"X-API-Key": API_KEY},
    )
    if resp.status_code == 200:
        path = f"{OUT_DIR}/pump_fun_{DATE}_{hour:02d}.parquet"
        with open(path, "wb") as f:
            f.write(resp.content)
        print(f"Downloaded hour {hour:02d}")
    else:
        print(f"Hour {hour:02d} not available")

Load and explore the data
Each Parquet file has a flat schema where every event type shares the same columns, with nulls where a field doesn't apply. The fields you care about most for backtesting are event_type, token_mint, action (buy or sell), timestamp, and the reserve fields.
import pandas as pd
import glob

# Load all hours into one DataFrame
files = sorted(glob.glob(f"data/{DATE}/*.parquet"))
df = pd.concat([pd.read_parquet(f) for f in files], ignore_index=True)

print(f"Total events: {len(df):,}")
print(f"Event types: {df['event_type'].value_counts().to_dict()}")
print(f"Unique tokens: {df['token_mint'].nunique():,}")

# Filter to swaps only
swaps = df[df["event_type"] == "swap"].copy()
print(f"Swaps: {len(swaps):,}")

Reconstruct token prices from reserves
Pump.fun's bonding curve doesn't have a traditional order book. Instead, each swap event includes the pool's virtual_lamports_reserve and virtual_token_reserve after the trade. Dividing the lamport reserve by the token reserve gives the price per token in lamports, and dividing that by 1,000,000,000 (the number of lamports in one SOL) converts it to SOL. These virtual reserve fields only exist on pump_fun data, not pump_amm, so make sure you downloaded the right exchange.
LAMPORTS_PER_SOL = 1_000_000_000

def calc_price_sol(row):
    """Price per token in SOL from bonding curve reserves."""
    if row["virtual_token_reserve"] == 0:
        return None
    return (
        row["virtual_lamports_reserve"] / row["virtual_token_reserve"]
    ) / LAMPORTS_PER_SOL

swaps["price_sol"] = swaps.apply(calc_price_sol, axis=1)
swaps["time"] = pd.to_datetime(swaps["timestamp"], unit="s")

Now each swap row has a price_sol column you can use for charting or strategy logic.
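For charting, one option is to roll the per-swap prices up into candles. The sketch below assumes the swaps DataFrame has the token_mint, time, and price_sol columns built above; the to_candles helper and the tiny demo frame are hypothetical and use made-up numbers, not real Pump.fun data.

```python
import pandas as pd

def to_candles(swaps: pd.DataFrame, token_mint: str, freq: str = "1min") -> pd.DataFrame:
    """Resample one token's swap prices into OHLC candles."""
    one = swaps[swaps["token_mint"] == token_mint].sort_values("time")
    candles = one.set_index("time")["price_sol"].resample(freq).ohlc()
    # Intervals with no swaps come back as all-NaN rows, so drop them
    return candles.dropna()

# Tiny synthetic example (made-up numbers)
demo = pd.DataFrame({
    "token_mint": ["A", "A", "A", "A"],
    "time": pd.to_datetime([0, 20, 50, 70], unit="s"),
    "price_sol": [1.0, 3.0, 2.0, 4.0],
})
demo_candles = to_candles(demo, "A")
print(demo_candles)
```

If several swaps share the same second, the open and close within that second depend on row order, so treat one-second precision as approximate.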
Backtest a momentum strategy
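Here's a simple momentum strategy. Buy a token when its price rises 50% above its first-seen price within the first 10 minutes of trading. Sell when the price reaches 2x the entry price or falls 50% from its peak since entry, whichever comes first. If neither exit triggers, the position is closed at the last price in the data.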
from dataclasses import dataclass

ENTRY_PUMP = 0.5       # buy after a 50% rise from the first price
TAKE_PROFIT = 2.0      # sell at 2x the entry price (a multiple of entry)
STOP_LOSS = 0.5        # sell after a 50% drop from the peak since entry
WINDOW_SECONDS = 600   # look for an entry in the first 10 minutes

@dataclass
class Trade:
    token: str
    entry_price: float
    exit_price: float
    entry_time: float
    exit_time: float
    pnl_pct: float

def backtest_token(token_swaps: pd.DataFrame) -> Trade | None:
    """Run momentum strategy on a single token's swap history."""
    token_swaps = token_swaps.sort_values("timestamp")
    first_price = token_swaps.iloc[0]["price_sol"]
    first_time = token_swaps.iloc[0]["timestamp"]
    token = token_swaps.iloc[0]["token_mint"]

    # pandas stores a missing price as NaN, so test with pd.isna rather than "is None"
    if pd.isna(first_price) or first_price == 0:
        return None

    entry_price = None
    entry_time = None
    peak_price = 0.0

    for _, row in token_swaps.iterrows():
        price = row["price_sol"]
        ts = row["timestamp"]
        if pd.isna(price):
            continue

        # Phase 1: looking for entry
        if entry_price is None:
            if ts - first_time > WINDOW_SECONDS:
                return None  # no entry signal in window
            if price >= first_price * (1 + ENTRY_PUMP):
                entry_price = price
                entry_time = ts
                peak_price = price
            continue

        # Phase 2: manage position
        peak_price = max(peak_price, price)

        # Take profit (TAKE_PROFIT is a multiple of the entry price)
        if price >= entry_price * TAKE_PROFIT:
            return Trade(token, entry_price, price, entry_time, ts,
                         (price - entry_price) / entry_price)

        # Stop loss (drawdown from peak)
        if price <= peak_price * (1 - STOP_LOSS):
            return Trade(token, entry_price, price, entry_time, ts,
                         (price - entry_price) / entry_price)

    # Still holding at end of data
    if entry_price is not None:
        last_price = token_swaps.iloc[-1]["price_sol"]
        return Trade(token, entry_price, last_price, entry_time,
                     token_swaps.iloc[-1]["timestamp"],
                     (last_price - entry_price) / entry_price)
    return None

Run the backtest and analyze results
# Run on all tokens with enough swaps
trades = []
for token, group in swaps.groupby("token_mint"):
    if len(group) < 10:
        continue
    result = backtest_token(group)
    if result is not None:
        trades.append(result)

print(f"Tokens traded: {len(trades)}")

# Compute stats
if trades:
    pnls = [t.pnl_pct for t in trades]
    winners = [p for p in pnls if p > 0]
    losers = [p for p in pnls if p <= 0]

    print(f"Win rate: {len(winners)/len(pnls)*100:.1f}%")
    print(f"Avg win: {sum(winners)/len(winners)*100:.1f}%" if winners else "No winners")
    print(f"Avg loss: {sum(losers)/len(losers)*100:.1f}%" if losers else "No losers")
    # This is the mean return per trade, i.e. an equal-weight portfolio of all trades
    print(f"Avg PnL per trade (equal-weight): {sum(pnls)/len(pnls)*100:.1f}%")

This gives you a quick read on whether the strategy has any edge. From here you can tweak the parameters or add filters like only entering tokens where the creator has launched successfully before. You can also layer in pump_amm data to track what happens after tokens graduate to the AMM.
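When a few large winners sit next to many small losers, the mean alone can be misleading, so it may help to look at the median and the profit factor as well. The summarize helper below is a hypothetical sketch that takes a plain list of fractional returns (such as the pnl_pct values from the trades above); the demo numbers are made up.

```python
from statistics import mean, median

def summarize(pnls: list[float]) -> dict:
    """Summary statistics for a list of fractional trade returns."""
    if not pnls:
        raise ValueError("need at least one trade")
    wins = [p for p in pnls if p > 0]
    losses = [p for p in pnls if p <= 0]
    gross_loss = -sum(losses)
    return {
        "trades": len(pnls),
        "win_rate": len(wins) / len(pnls),
        "mean_pnl": mean(pnls),
        "median_pnl": median(pnls),
        # Gross profit divided by gross loss; infinite if nothing was lost
        "profit_factor": sum(wins) / gross_loss if gross_loss > 0 else float("inf"),
    }

# Made-up returns: one double, two 50% losses, one small win
demo_stats = summarize([1.0, -0.5, -0.5, 0.25])
print(demo_stats)
```

With real results you would call summarize([t.pnl_pct for t in trades]). A positive mean paired with a negative median suggests the result depends on a handful of outliers.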
Ideas for extending the backtest
- Volume filters. Only enter tokens that have a minimum number of unique wallets or SOL volume in the first few minutes.
- Creator analysis. Use the token_creator field to track which wallets launch tokens that perform well. Then filter future entries by creator track record.
- AMM graduation. Combine pump_fun data with pump_amm data to backtest strategies that hold through the bonding curve graduation.
- Multi-day runs. Download a week or month of data to get statistically significant results. A full week is 168 credits at 24 files per day.
- Slippage modeling. Use the reserve fields to estimate how much slippage your order size would actually incur on the bonding curve.
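As a starting point for the slippage idea, here is a minimal sketch that assumes the bonding curve behaves like a constant-product pool on the virtual reserves (lamports times tokens stays constant) and ignores fees. Both are assumptions to verify against the protocol before relying on the numbers. The estimate_buy helper and the reserve values are hypothetical.

```python
def estimate_buy(virtual_lamports: int, virtual_tokens: int, lamports_in: int) -> tuple[float, float]:
    """Tokens received and slippage for a buy on a constant-product curve.

    Assumes x * y = k on the virtual reserves and no fees.
    """
    new_lamports = virtual_lamports + lamports_in
    # Keeping k constant means the pool pays out T * d / (L + d) tokens
    tokens_out = virtual_tokens * lamports_in / new_lamports
    spot_price = virtual_lamports / virtual_tokens   # lamports per token before the trade
    avg_price = lamports_in / tokens_out             # lamports per token actually paid
    slippage = avg_price / spot_price - 1
    return tokens_out, slippage

# Illustrative reserves (made-up): 30 SOL of virtual lamports, buying with 1 SOL
tokens_out, slippage = estimate_buy(30_000_000_000, 1_000_000_000_000_000, 1_000_000_000)
print(f"Tokens out: {tokens_out:,.0f}, slippage: {slippage:.2%}")
```

Under these assumptions slippage works out to the order size divided by the virtual lamport reserve, so the same order costs more early in the curve, when reserves are small.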
Ready to build your own strategy?
PumpFunData has every Pump.fun and Pump.fun AMM swap since February 2026, in hourly Parquet files.