
Capturing RF Signals with RTL-SDR and Python

A practical guide to capturing real-world RF signals from CubeSatSim with Python and an RTL-SDR dongle, for use in machine learning classification.

Tags: SDR, Python, RF, Signal Processing, CubeSat

Overview

This system captures IQ (In-phase/Quadrature) samples from an RTL-SDR dongle, specifically targeting three signal types transmitted by CubeSatSim:

  • APRS - Automatic Packet Reporting System
  • FSK - Frequency Shift Keying
  • SSTV - Slow Scan Television

Hardware Requirements

  • RTL-SDR Dongle (RTL2832U chipset)
  • Antenna (appropriate for target frequency)
  • CubeSatSim transmitter (or access to real signals)
  • USB connection to computer

Software Dependencies

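pip install pyrtlsdr numpy scipy matplotlib

scipy and matplotlib are only needed for the verification plots later in this guide. pyrtlsdr also relies on the librtlsdr driver library being installed on the system.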

Key Parameters

Signal Configuration

CENTER_FREQ = 434.9e6      # 434.9 MHz - CubeSatSim frequency
SAMPLE_RATE = 2.4e6        # 2.4 MHz sampling rate
CHUNK_SIZE = 256_000       # Read samples in chunks

Signal-Specific Settings

Each signal type has different characteristics. "Target Samples" is the number of capture files I collected for that signal type, not the number of IQ samples in a file.

Signal   Duration   Target Samples   Post-Capture Delay
APRS     2.0 s      40               30.0 s
FSK      2.0 s      40               0.5 s
SSTV     15.0 s     15               1.0 s

Why different durations?

  • APRS/FSK: Short packet bursts (2 seconds captures the full transmission)
  • SSTV: Slow scan images take ~15 seconds to transmit

Why different delays?

  • APRS: Waits 30s for next packet (they're infrequent)
  • FSK: Continuous beacon, only 0.5s pause

  • SSTV: Images sent less frequently, 1s between captures
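
As a rough sketch of how these settings translate into data volume, the table can be written as a dictionary and multiplied out. The names CAPTURE_PLAN and capture_budget are illustrative, and the 8 bytes per sample assumes captures are stored as complex64.

```python
SAMPLE_RATE = 2.4e6      # samples per second (same value as in Signal Configuration)
BYTES_PER_SAMPLE = 8     # assumption: IQ stored as complex64 (4-byte I + 4-byte Q)

# Hypothetical dictionary form of the settings table
CAPTURE_PLAN = {
    "aprs": {"duration_s": 2.0, "target_captures": 40, "post_delay_s": 30.0},
    "fsk": {"duration_s": 2.0, "target_captures": 40, "post_delay_s": 0.5},
    "sstv": {"duration_s": 15.0, "target_captures": 15, "post_delay_s": 1.0},
}


def capture_budget(plan, sample_rate=SAMPLE_RATE):
    """Return IQ samples per capture and approximate storage per signal type."""
    budget = {}
    for name, cfg in plan.items():
        iq_samples = int(sample_rate * cfg["duration_s"])
        file_mb = iq_samples * BYTES_PER_SAMPLE / 1e6
        budget[name] = {
            "iq_samples": iq_samples,
            "file_mb": file_mb,
            "total_mb": file_mb * cfg["target_captures"],
        }
    return budget


if __name__ == "__main__":
    for name, row in capture_budget(CAPTURE_PLAN).items():
        print(f"{name}: {row['iq_samples']:,} IQ samples, "
              f"{row['file_mb']:.1f} MB per file, {row['total_mb']:.0f} MB total")
```

Under these assumptions a 15-second SSTV capture holds 36 million IQ samples, so it is worth checking free disk space before a long session.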

System Architecture

┌─────────────────┐
│  RTL-SDR Dongle │
│  (434.9 MHz)    │
└────────┬────────┘
         │ USB
         ↓
┌─────────────────────────────────┐
│  Python Script                  │
│  ┌───────────────────────────┐ │
│  │ 1. Initialize SDR         │ │
│  │ 2. Configure parameters   │ │
│  │ 3. Read IQ samples        │ │
│  │ 4. Save as .npy file      │ │
│  └───────────────────────────┘ │
└────────┬────────────────────────┘
         ↓
┌─────────────────────────────────┐
│  Saved Dataset                  │
│  └── aprs_2026-02-22.npy       │
│  └── fsk_2026-02-22.npy        │
│  └── sstv_2026-02-22.npy       │
└─────────────────────────────────┘

Complete Implementation

Step 1: Initialize SDR

from rtlsdr import RtlSdr
 
# Create SDR object
sdr = RtlSdr()
 
# Configure
sdr.sample_rate = 2.4e6      # 2.4 MHz
sdr.center_freq = 434.9e6    # 434.9 MHz
sdr.gain = 30                # Manual gain setting

Key Parameters:

  • sample_rate: Samples per second. With complex IQ sampling this equals the captured bandwidth, so it must be at least as wide as the signal of interest
  • center_freq: Frequency to tune to, in Hz
  • gain: Tuner gain in dB (roughly 0-50, or 'auto'). Increase it if the signal is too weak
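
To make the relationship between center_freq and sample_rate concrete, here is a small sketch that computes the slice of spectrum a complex IQ capture covers. The helper names tuned_span and in_span are made up for this example.

```python
def tuned_span(center_freq, sample_rate):
    """Return the (low, high) RF edges in Hz covered by a complex IQ capture."""
    half = sample_rate / 2
    return center_freq - half, center_freq + half


def in_span(freq, center_freq, sample_rate):
    """True if `freq` falls inside the captured slice of spectrum."""
    low, high = tuned_span(center_freq, sample_rate)
    return low <= freq <= high


low, high = tuned_span(434.9e6, 2.4e6)
print(f"Captured span: {low / 1e6:.1f} to {high / 1e6:.1f} MHz")
print("434.9 MHz visible:", in_span(434.9e6, 434.9e6, 2.4e6))
print("144.39 MHz visible:", in_span(144.39e6, 434.9e6, 2.4e6))
```

Anything outside that span is not in the recording at all, so a transmitter on another band needs a different center_freq.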

Step 2: Read IQ Samples in Chunks

import numpy as np

def read_iq_samples(sdr, total_samples):
    """
    Read IQ samples in chunks to keep each USB read small
    
    Args:
        sdr: RTL-SDR object
        total_samples: Total number of samples to capture
    
    Returns:
        numpy array of complex64 IQ samples
    """
    samples = []
    remaining = total_samples
    
    while remaining > 0:
        # Read in chunks of CHUNK_SIZE (256k) samples
        n = min(CHUNK_SIZE, remaining)
        chunk = sdr.read_samples(n)
        # pyrtlsdr returns double-precision complex values;
        # complex64 halves the memory and file size
        samples.append(chunk.astype(np.complex64))
        remaining -= n
    
    # Combine all chunks
    return np.concatenate(samples)

Why chunks?

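  • Large captures (15 s × 2.4 MS/s = 36M samples) are read in 256k-sample pieces instead of one huge USB read
  • Keeps each read buffer small (the full capture is still held in RAM once the chunks are combined)
  • More stable on slower systems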
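
If peak memory matters, one possible variation is to write each chunk straight into a preallocated complex64 buffer, so the capture is never held twice (once as a list of chunks and once as the combined array). This is only a sketch: FakeSdr is a stand-in that returns random numbers so the example runs without hardware, and read_iq_preallocated is a hypothetical name.

```python
import numpy as np

CHUNK_SIZE = 256_000


class FakeSdr:
    """Stand-in for an RTL-SDR object so the sketch runs without hardware."""

    def __init__(self, seed=0):
        self.rng = np.random.default_rng(seed)
        self.calls = 0

    def read_samples(self, n):
        self.calls += 1
        return 0.1 * (self.rng.standard_normal(n) + 1j * self.rng.standard_normal(n))


def read_iq_preallocated(sdr, total_samples, chunk_size=CHUNK_SIZE):
    """Fill one complex64 buffer chunk by chunk instead of growing a list."""
    out = np.empty(total_samples, dtype=np.complex64)
    pos = 0
    while pos < total_samples:
        n = min(chunk_size, total_samples - pos)
        out[pos:pos + n] = sdr.read_samples(n)  # cast to complex64 on assignment
        pos += n
    return out


fake = FakeSdr()
iq = read_iq_preallocated(fake, total_samples=600_000)
print(iq.dtype, len(iq), "samples in", fake.calls, "reads")
```

With a real dongle, the same function should work by passing the RtlSdr object in place of FakeSdr.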

Step 3: Save as NumPy Array

import os
import numpy as np
from datetime import datetime
 
# Calculate samples needed
duration = 2.0  # seconds
total_samples = int(SAMPLE_RATE * duration)
 
# Capture
iq_samples = read_iq_samples(sdr, total_samples)
 
# Create filename with timestamp
timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
filename = f"data/aprs_{timestamp}.npy"
 
# Save (create the output folder first if it does not exist)
os.makedirs("data", exist_ok=True)
np.save(filename, iq_samples)

File format:

  • .npy = NumPy binary format
  • Contains complex numbers (I + jQ)
  • Efficient storage and fast loading
  • Example: aprs_2026-02-22_14-30-45.npy
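
For large files it can help to memory-map the .npy file rather than load it whole. The sketch below round-trips a small synthetic array through np.save and np.load(..., mmap_mode="r"); the random data stands in for a real capture.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "aprs_2026-02-22_14-30-45.npy")

    # Synthetic stand-in for a capture
    rng = np.random.default_rng(0)
    iq = (rng.standard_normal(1000) + 1j * rng.standard_normal(1000)).astype(np.complex64)
    np.save(path, iq)

    # mmap_mode="r" maps the file instead of reading it all into RAM
    loaded = np.load(path, mmap_mode="r")
    first_block = np.array(loaded[:100])   # copy only the slice that is needed
    roundtrip_ok = bool(np.array_equal(loaded, iq))
    loaded_dtype = loaded.dtype
    del loaded   # release the memory map before the temp directory is removed

print(loaded_dtype, first_block.shape, roundtrip_ok)
```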

Step 4: Interactive Capture Loop

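import glob
import time
from datetime import datetime

# Per-signal settings from the "Signal-Specific Settings" table
DURATION_MAP = {"aprs": 2.0, "fsk": 2.0, "sstv": 15.0}        # seconds
TARGET_COUNT = {"aprs": 40, "fsk": 40, "sstv": 15}            # capture files
POST_CAPTURE_SLEEP = {"aprs": 30.0, "fsk": 0.5, "sstv": 1.0}  # seconds

def count_existing(signal_type):
    """Count captures already saved for this signal type."""
    return len(glob.glob(f"data/{signal_type}_*.npy"))

while True:
    # Show current progress
    for name, target in TARGET_COUNT.items():
        print(f"{name.upper()}: {count_existing(name)}/{target} samples")
    
    # User selects signal type (or quits)
    signal_type = input("Select signal (aprs/fsk/sstv, q to quit): ").strip().lower()
    if signal_type == "q":
        break
    if signal_type not in DURATION_MAP:
        print("Unknown signal type, try again")
        continue
    
    # Wait for user to set CubeSatSim to correct mode
    input("Press ENTER when CubeSatSim is transmitting...")
    
    # Capture
    total_samples = int(SAMPLE_RATE * DURATION_MAP[signal_type])
    iq_samples = read_iq_samples(sdr, total_samples)
    
    # Save with a fresh timestamp so files never overwrite each other
    timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    np.save(f"data/{signal_type}_{timestamp}.npy", iq_samples)
    
    # Wait for next transmission
    time.sleep(POST_CAPTURE_SLEEP[signal_type])

sdr.close()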

Understanding IQ Samples

What are IQ Samples?

IQ samples are complex numbers representing radio signals:

IQ Sample = I + jQ

Where:
  I = In-phase component (real part)
  Q = Quadrature component (imaginary part)

Visual representation:

     Q (Imaginary)
        ↑
        │    • Sample point
        │   (I + jQ)
        │  /
        │ /
        │/
────────┼────────→ I (Real)
       0│
        │

Why Complex Numbers?

  • Captures both amplitude and phase information
  • Allows frequency shifting in software
  • Essential for digital signal processing
  • Standard in SDR systems
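
The "frequency shifting in software" point can be illustrated with a synthetic tone. In this sketch, peak_frequency and shift_frequency are illustrative helpers, and the 100 kHz and 300 kHz offsets are arbitrary values picked for the demo.

```python
import numpy as np

FS = 2.4e6        # sample rate in Hz
N = 2400          # 1 ms of samples, so FFT bins are 1 kHz apart


def peak_frequency(iq, fs):
    """Return the frequency (Hz, relative to center) of the strongest FFT bin."""
    spectrum = np.abs(np.fft.fft(iq))
    freqs = np.fft.fftfreq(len(iq), d=1 / fs)
    return freqs[np.argmax(spectrum)]


def shift_frequency(iq, offset_hz, fs):
    """Shift the whole spectrum by offset_hz by multiplying with a complex tone."""
    t = np.arange(len(iq)) / fs
    return iq * np.exp(2j * np.pi * offset_hz * t)


t = np.arange(N) / FS
tone = np.exp(2j * np.pi * 100e3 * t)          # tone 100 kHz above center
mirror = np.exp(-2j * np.pi * 300e3 * t)       # tone 300 kHz below center
shifted = shift_frequency(tone, -100e3, FS)    # move the first tone down to 0 Hz

print("tone:   ", peak_frequency(tone, FS))
print("mirror: ", peak_frequency(mirror, FS))
print("shifted:", peak_frequency(shifted, FS))
```

A real-valued recording could not tell the tone above center from the one below it. The complex representation keeps the sign of the offset, which is what makes this kind of retuning possible after the capture.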

Example IQ Sample:

# Load captured signal
samples = np.load("data/aprs_2026-02-22_14-30-45.npy")
 
# First sample
print(samples[0])
# Example output (values vary per capture): (0.015625+0.03125j)
 
# Extract components
i_component = np.real(samples)  # Real part
q_component = np.imag(samples)  # Imaginary part
 
# Magnitude and phase
magnitude = np.abs(samples)
phase = np.angle(samples)

Signal Characteristics

APRS (Automatic Packet Reporting System)

Modulation: AFSK (Audio FSK)
Frequency: 434.9 MHz on CubeSatSim (terrestrial APRS in North America uses 144.390 MHz)
Baud Rate: 1200 baud
Packet Duration: ~0.5-2 seconds
Transmission Pattern: Periodic bursts


FSK (Frequency Shift Keying)

Modulation: FSK
Frequency: 434.9 MHz (example)
Shift: ±5 kHz typical
Pattern: Continuous or burst
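
To show what a ±5 kHz shift looks like in IQ data, here is a toy example that builds a synthetic 2-FSK signal and recovers the bits from the phase difference between neighbouring samples. The sample rate and the 1200 baud symbol rate are assumptions chosen for the demo, not measured CubeSatSim values.

```python
import numpy as np

FS = 240_000       # assumed sample rate for this toy example (Hz)
BAUD = 1200        # assumed symbol rate
DEVIATION = 5_000  # +/- 5 kHz shift, as listed above
SPB = FS // BAUD   # samples per bit


def fsk_modulate(bits):
    """Map 1 -> +DEVIATION and 0 -> -DEVIATION, then integrate frequency into phase."""
    freq = np.repeat(np.where(np.asarray(bits) == 1, DEVIATION, -DEVIATION), SPB)
    phase = 2 * np.pi * np.cumsum(freq) / FS
    return np.exp(1j * phase)


def instantaneous_frequency(iq):
    """Phase difference between neighbouring samples, scaled to Hz."""
    return np.angle(iq[1:] * np.conj(iq[:-1])) * FS / (2 * np.pi)


def fsk_demodulate(iq, n_bits):
    """Decide each bit from the sign of the mean frequency over its bit period."""
    inst = instantaneous_frequency(iq)
    return [int(inst[i * SPB:(i + 1) * SPB].mean() > 0) for i in range(n_bits)]


bits = [1, 0, 1, 1, 0, 0, 1]
recovered = fsk_demodulate(fsk_modulate(bits), len(bits))
print("sent:     ", bits)
print("recovered:", recovered)
```

Real captures add noise, frequency offset and unknown bit timing, so this only illustrates the principle.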


SSTV (Slow Scan Television)

Modulation: FM (frequency modulated audio)
Duration: 8-15 seconds per image
Pattern: Slow frequency sweeps
Contains: Image data as audio tones


Optimal Settings

Gain Settings

# Automatic gain (easiest)
sdr.gain = 'auto'
 
# Manual gain (better control)
sdr.gain = 30  # Range: 0-50
 
# Find optimal gain for your setup:
for gain in [10, 20, 30, 40]:
    sdr.gain = gain
    samples = sdr.read_samples(10000)
    print(f"Gain {gain}: avg power = {np.mean(np.abs(samples)**2)}")

Guidelines:

  • Start with gain = 30
  • Too low: Weak signal, lots of noise
  • Too high: Saturation, distortion
  • Optimal: Strong signal without clipping
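
Average power alone does not reveal saturation, so it can be useful to also count samples that sit at the edge of the ADC range. The sketch below assumes I and Q are scaled to roughly -1 to +1. The gain_report helper and its 0.99 threshold are illustrative choices, and the test data is synthetic.

```python
import numpy as np


def gain_report(iq, full_scale=1.0, clip_level=0.99):
    """Summarise average power and the fraction of samples near full scale."""
    limit = clip_level * full_scale
    near_limit = (np.abs(np.real(iq)) >= limit) | (np.abs(np.imag(iq)) >= limit)
    return {
        "avg_power": float(np.mean(np.abs(iq) ** 2)),
        "clipped_fraction": float(np.mean(near_limit)),
    }


rng = np.random.default_rng(1)
clean = 0.1 * (rng.standard_normal(10_000) + 1j * rng.standard_normal(10_000))
# Simulate too much gain: amplify by 20x and clip to the ADC range
hot = np.clip(clean.real * 20, -1, 1) + 1j * np.clip(clean.imag * 20, -1, 1)

print("clean:", gain_report(clean))
print("hot:  ", gain_report(hot))
```

A clipped fraction well above zero suggests lowering the gain, even when the average power looks healthy.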

Sample Rate Selection

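# With complex IQ sampling, the sample rate equals the captured bandwidth,
# so the minimum rate is the signal bandwidth (not 2× as for real-valued sampling)
# For 200 kHz bandwidth → at least 200 kS/s, plus margin for filtering and tuning offset
 
# Common rates (pick one):
sdr.sample_rate = 1.0e6   # 1 MS/s
sdr.sample_rate = 2.0e6   # 2 MS/s
sdr.sample_rate = 2.4e6   # 2.4 MS/s (good balance)
sdr.sample_rate = 3.2e6   # 3.2 MS/s (RTL-SDR maximum; may drop samples on some systems)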

Tradeoffs:

  • Higher rate = more detail but larger files
  • Lower rate = smaller files but might miss signal features
  • 2.4 MHz is a sweet spot for most signals
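
If files captured at 2.4 MS/s turn out larger than needed, they can be reduced afterwards. The sketch below uses a deliberately crude method (average groups of samples, keep one value per group) on a synthetic tone. decimate_boxcar is a hypothetical helper, and a proper anti-alias filter such as scipy.signal.decimate would normally be a better choice.

```python
import numpy as np


def decimate_boxcar(iq, factor):
    """Average each group of `factor` samples (crude low-pass) and keep one value per group."""
    usable = (len(iq) // factor) * factor
    return iq[:usable].reshape(-1, factor).mean(axis=1)


FS = 2.4e6
t = np.arange(24_000) / FS
iq = np.exp(2j * np.pi * 10e3 * t).astype(np.complex64)   # tone 10 kHz above center

narrow = decimate_boxcar(iq, factor=10)     # new rate: 240 kS/s
new_fs = FS / 10
freqs = np.fft.fftfreq(len(narrow), d=1 / new_fs)
peak = freqs[np.argmax(np.abs(np.fft.fft(narrow)))]

print(len(iq), "->", len(narrow), "samples; strongest bin at", peak, "Hz")
```

The file shrinks by the decimation factor, while a narrowband signal near the center frequency stays where it was.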

Troubleshooting

Problem: "No devices found"

# Check if dongle is detected
lsusb | grep Realtek
 
# Expected output:
Bus 001 Device 005: ID 0bda:2838 Realtek RTL2838 DVB-T
 
# If not found:
sudo apt-get install rtl-sdr

Then unplug and replug the dongle.

Problem: No signal captured

Checklist:

  1. Antenna connected?
  2. Correct frequency?
  3. CubeSatSim transmitting? (The same applies to any other transmitter.)
  4. Gain setting appropriate?

Debug capture:

# Capture short sample
samples = sdr.read_samples(10000)
 
# Check if signal present
power = np.mean(np.abs(samples)**2)
print(f"Average power: {power}")
 
# Rough guide (exact values depend on gain and antenna):
#   > 0.001  → signal likely present
#   < 0.0001 → probably noise only

Problem: USB disconnects

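# Add error handling: on failure, close the device, reopen it, and restore the settings
try:
    samples = sdr.read_samples(total_samples)
except Exception as e:
    print(f"Error: {e}")
    sdr.close()
    sdr = RtlSdr()  # Reinitialize
    sdr.sample_rate = SAMPLE_RATE
    sdr.center_freq = CENTER_FREQ
    sdr.gain = 30   # Restore the gain used in Step 1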
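
The same idea can be wrapped in a small retry helper. This is a sketch under assumptions: read_with_retry is a made-up name, and open_sdr is any function you write that creates and configures the device (for example one that calls RtlSdr() and sets sample_rate, center_freq and gain).

```python
import time


def read_with_retry(open_sdr, num_samples, max_attempts=3, wait_s=1.0):
    """Open the device, read, and close it; reopen and retry if the read fails."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        sdr = None
        try:
            sdr = open_sdr()
            return sdr.read_samples(num_samples)
        except Exception as exc:  # broad on purpose: USB errors vary by platform
            last_error = exc
            print(f"Attempt {attempt}/{max_attempts} failed: {exc}")
            time.sleep(wait_s)
        finally:
            if sdr is not None:
                try:
                    sdr.close()
                except Exception:
                    pass  # the device may already be gone
    raise RuntimeError(f"Read failed after {max_attempts} attempts") from last_error


# Usage sketch (needs hardware, so it is left commented out):
# def open_sdr():
#     sdr = RtlSdr()
#     sdr.sample_rate = SAMPLE_RATE
#     sdr.center_freq = CENTER_FREQ
#     sdr.gain = 30
#     return sdr
#
# samples = read_with_retry(open_sdr, 256_000)
```

Reopening the device for every read is slow, so this pattern probably suits occasional captures better than continuous streaming.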

Verifying Captured Data

Check File Integrity

import os
import numpy as np
import matplotlib.pyplot as plt
 
SAMPLE_RATE = 2.4e6  # Must match the rate used for the capture
 
# Load file
path = "data/aprs_2026-02-22_14-30-45.npy"
samples = np.load(path)
 
print(f"Sample count: {len(samples)}")
print(f"Data type: {samples.dtype}")
print(f"Duration: {len(samples) / SAMPLE_RATE:.2f} seconds")
print(f"File size: {os.path.getsize(path) / 1e6:.1f} MB")
 
# Plot first 1000 samples
plt.figure(figsize=(12, 4))
plt.plot(np.real(samples[:1000]), label='I (Real)', alpha=0.7)
plt.plot(np.imag(samples[:1000]), label='Q (Imag)', alpha=0.7)
plt.legend()
plt.title('IQ Samples - Time Domain')
plt.xlabel('Sample Number')
plt.ylabel('Amplitude')
plt.grid(True)
plt.show()

Expected output:

Sample count: 4800000
Data type: complex64
Duration: 2.00 seconds
File size: 38.4 MB
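
With dozens of files, the same checks can be run automatically. The sketch below is one possible approach, not the method used for the original dataset: check_capture is a hypothetical helper, and the two files it inspects are tiny synthetic arrays written to a temporary folder.

```python
import os
import tempfile

import numpy as np


def check_capture(path, sample_rate, expected_duration_s, tolerance_s=0.01):
    """Return a list of problems found in a saved capture (empty list = looks fine)."""
    problems = []
    iq = np.load(path)
    if not np.iscomplexobj(iq):
        problems.append(f"not complex: {iq.dtype}")
    duration = len(iq) / sample_rate
    if abs(duration - expected_duration_s) > tolerance_s:
        problems.append(f"duration {duration:.4f}s, expected {expected_duration_s:.4f}s")
    if not np.all(np.isfinite(iq)):
        problems.append("contains NaN or Inf")
    if len(iq) and np.mean(np.abs(iq) ** 2) == 0:
        problems.append("all samples are zero")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.npy")
    bad = os.path.join(tmp, "bad.npy")
    np.save(good, np.ones(2400, dtype=np.complex64))    # 1 ms of non-zero data
    np.save(bad, np.zeros(1200, dtype=np.complex64))    # too short and empty
    good_problems = check_capture(good, 2.4e6, 0.001, tolerance_s=1e-4)
    bad_problems = check_capture(bad, 2.4e6, 0.001, tolerance_s=1e-4)

print("good:", good_problems)
print("bad: ", bad_problems)
```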

Quick Spectrogram Check

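from scipy import signal
 
# Create spectrogram (complex IQ input gives a two-sided spectrum)
f, t, Sxx = signal.spectrogram(
    samples,
    fs=SAMPLE_RATE,
    nperseg=512,
    noverlap=256,
    return_onesided=False
)
 
# Reorder so frequencies run from -fs/2 to +fs/2 around the center frequency
f = np.fft.fftshift(f)
Sxx = np.fft.fftshift(Sxx, axes=0)
 
# Plot (the small constant avoids log10(0))
plt.figure(figsize=(12, 6))
plt.pcolormesh(t, f/1e6, 10*np.log10(Sxx + 1e-12), shading='gouraud', cmap='viridis')
plt.ylabel('Frequency offset from center (MHz)')
plt.xlabel('Time (sec)')
plt.title('Quick Spectrogram Check')
plt.colorbar(label='Power (dB)')
plt.show()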

What to look for:

  • Clear signal patterns (not just noise)
  • Expected frequency range
  • Signal visible above noise floor
  • Duration matches expected
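
"Visible above the noise floor" can also be turned into a rough number. One simple estimate, offered here as a sketch rather than a calibrated SNR measurement, compares the strongest bin of an averaged spectrum with the median bin. peak_to_median_db is a hypothetical helper and both test signals are synthetic.

```python
import numpy as np


def peak_to_median_db(iq, nfft=1024):
    """Average |FFT|^2 over consecutive segments; compare the peak bin to the median bin."""
    n_seg = len(iq) // nfft
    segments = iq[:n_seg * nfft].reshape(n_seg, nfft)
    psd = np.mean(np.abs(np.fft.fft(segments, axis=1)) ** 2, axis=0)
    return 10 * np.log10(psd.max() / np.median(psd))


rng = np.random.default_rng(0)
n = 64 * 1024
noise = 0.01 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
tone = 0.05 * np.exp(2j * np.pi * 0.1 * np.arange(n))   # tone at 0.1 cycles/sample

print(f"noise only:   {peak_to_median_db(noise):.1f} dB")
print(f"tone + noise: {peak_to_median_db(noise + tone):.1f} dB")
```

Noise-only captures should give only a few dB, while a capture with a real narrowband signal should stand out clearly. The threshold separating the two has to be chosen for your own setup.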

Dataset Organization

Recommended Structure

data/
├── real_signals/
│   ├── aprs_2024-12-22_14-30-00.npy
│   ├── aprs_2024-12-22_14-32-00.npy
│   ├── ...
│   ├── fsk_2024-12-22_14-35-00.npy
│   ├── fsk_2024-12-22_14-35-30.npy
│   ├── ...
│   ├── sstv_2024-12-22_14-40-00.npy
│   └── sstv_2024-12-22_14-40-15.npy
└── metadata.txt  ← Record capture conditions

Metadata Logging

# Log capture conditions
# iq_samples is the array returned by read_iq_samples() for this capture
with open("data/metadata.txt", "a") as f:
    f.write(f"{timestamp} | {signal_type} | "
            f"Freq: {CENTER_FREQ/1e6}MHz | "
            f"Gain: {sdr.gain} | "
            f"Samples: {len(iq_samples)}\n")
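
If the log will be read back by a training script, a machine-readable format may be easier to work with than free text. The sketch below writes one JSON object per line. The file name metadata.jsonl, the field names and the example values are all assumptions made for illustration.

```python
import json
import os
import tempfile


def append_metadata(path, record):
    """Append one capture record as a single JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def read_metadata(path):
    """Read all records back as a list of dictionaries."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "metadata.jsonl")
    append_metadata(log_path, {
        "file": "aprs_2026-02-22_14-30-00.npy",
        "signal_type": "aprs",
        "center_freq_hz": 434.9e6,
        "sample_rate_hz": 2.4e6,
        "gain_db": 30,
        "num_samples": 4_800_000,
    })
    records = read_metadata(log_path)

print(records[0]["signal_type"], records[0]["num_samples"])
```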

Best Practices

1. Label Data Clearly

# Good naming:
"aprs_2026-02-22_14-30-00.npy"  
 
# Bad naming:
"signal1.npy"  
"test.npy"     

2. Capture Variety

  • Different times of day
  • Different transmit powers
  • Different noise conditions
  • Different signal strengths

3. Quality Over Quantity

  • 40 good samples > 100 poor samples
  • Verify each capture visually
  • Delete corrupted/empty captures
  • Keep consistent labeling
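
Consistent labeling is easier to keep when file names are built and parsed by code rather than typed by hand. A minimal sketch, assuming the signal_timestamp.npy naming used throughout this guide (the helper names are made up):

```python
from datetime import datetime
from pathlib import Path

TIMESTAMP_FORMAT = "%Y-%m-%d_%H-%M-%S"


def build_capture_name(signal_type, when):
    """Return a file name such as 'aprs_2026-02-22_14-30-00.npy'."""
    return f"{signal_type}_{when.strftime(TIMESTAMP_FORMAT)}.npy"


def parse_capture_name(path):
    """Split a capture file name into (label, capture time)."""
    label, stamp = Path(path).stem.split("_", 1)
    return label, datetime.strptime(stamp, TIMESTAMP_FORMAT)


name = build_capture_name("aprs", datetime(2026, 2, 22, 14, 30, 0))
label, when = parse_capture_name("data/real_signals/" + name)
print(name)
print(label, when.isoformat())
```

The parsed label can then serve directly as the class label when the dataset is assembled for training.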

4. Document Everything

# At start of script
"""
CAPTURE SESSION: 2025-12-22
CubeSatSim Settings:
  - Mode: APRS
  - Power: 10 dBm
  - Frequency: 434.9 MHz
 
SDR Settings:
  - Gain: 30
  - Sample Rate: 2.4 MHz
  - Antenna: Dipole
 
Environment:
  - Location: Indoor lab
  - Distance: 3 meters
  - Interference: Minimal
"""

Next Steps

After capturing data:

  1. Convert to Spectrograms
     python src/2_create_spectrograms_real.py
  2. Train CNN Model
     python src/4_train_real.py
  3. Evaluate Performance
     python src/5_evaluate_real.py

Key Takeaways

  1. IQ samples capture complete signal information (amplitude + phase)
  2. Chunked reading prevents memory issues with large captures
  3. Signal-specific parameters optimize for different transmission types
  4. Proper labeling is critical for machine learning
  5. Verification ensures data quality before training

Author: Himanshu Suri
Date: Feb 2026
