Capturing RF Signals with RTL-SDR and Python
A practical guide to capturing real RF signals from CubeSatSim for machine learning classification.
Overview
This system captures IQ (In-phase/Quadrature) samples from an RTL-SDR dongle, targeting three signal types transmitted by CubeSatSim:
- APRS - Automatic Packet Reporting System
- FSK - Frequency Shift Keying
- SSTV - Slow Scan Television
Hardware Requirements
- RTL-SDR Dongle (RTL2832U chipset)
- Antenna (appropriate for target frequency)
- CubeSatSim transmitter (or access to real signals)
- USB connection to computer
Software Dependencies
pip install pyrtlsdr numpy
The verification examples later in this guide also use matplotlib and scipy:
pip install matplotlib scipy
Key Parameters
Signal Configuration
CENTER_FREQ = 434.9e6 # 434.9 MHz - CubeSatSim frequency
SAMPLE_RATE = 2.4e6 # 2.4 MHz sampling rate
CHUNK_SIZE = 256_000 # Number of samples per read call
Signal-Specific Settings
Each signal type has different characteristics. ("Target Samples" is the number of captures I collected for each type, not the number of IQ samples per capture.)
| Signal | Duration | Target Samples | Post-Capture Delay |
|---|---|---|---|
| APRS | 2.0s | 40 | 30.0s |
| FSK | 2.0s | 40 | 0.5s |
| SSTV | 15.0s | 15 | 1.0s |
Why different durations?
- APRS/FSK: Short packet bursts (2 seconds captures the full transmission)
- SSTV: Slow scan images take ~15 seconds to transmit
Why different delays?
- APRS: Waits 30s for the next packet (they're infrequent)
- FSK: Continuous beacon, only a 0.5s pause
- SSTV: Images sent less frequently, 1s between captures
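The table above can be written down as a small configuration mapping. This is a minimal sketch: the names DURATION_MAP and POST_CAPTURE_SLEEP are the ones used in the capture loop later in this guide, but their definitions are not shown there, so the dictionary layout below is an assumption, and TARGET_CAPTURES and samples_per_capture are hypothetical names introduced only for illustration.

```python
SAMPLE_RATE = 2.4e6  # samples per second, as in Signal Configuration

# Values copied from the table; the dictionary layout itself is an assumption.
DURATION_MAP = {"aprs": 2.0, "fsk": 2.0, "sstv": 15.0}        # seconds per capture
POST_CAPTURE_SLEEP = {"aprs": 30.0, "fsk": 0.5, "sstv": 1.0}  # seconds to wait afterwards
TARGET_CAPTURES = {"aprs": 40, "fsk": 40, "sstv": 15}         # hypothetical name: files per type


def samples_per_capture(signal_type):
    """Number of IQ samples in one capture of the given signal type."""
    return int(SAMPLE_RATE * DURATION_MAP[signal_type])


for name in DURATION_MAP:
    print(f"{name}: {samples_per_capture(name):,} IQ samples per capture, "
          f"{TARGET_CAPTURES[name]} captures planned")
```

Keeping the per-signal settings in one place means the capture loop only needs the signal name to look up its duration and delay.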
System Architecture
┌─────────────────┐
│ RTL-SDR Dongle  │
│   (434.9 MHz)   │
└────────┬────────┘
         │ USB
         ↓
┌─────────────────────────────────┐
│          Python Script          │
│  ┌───────────────────────────┐  │
│  │ 1. Initialize SDR         │  │
│  │ 2. Configure parameters   │  │
│  │ 3. Read IQ samples        │  │
│  │ 4. Save as .npy file      │  │
│  └───────────────────────────┘  │
└────────┬────────────────────────┘
         ↓
┌─────────────────────────────────┐
│          Saved Dataset          │
│  ├── aprs_2026-02-22.npy        │
│  ├── fsk_2026-02-22.npy         │
│  └── sstv_2026-02-22.npy        │
└─────────────────────────────────┘
Complete Implementation
Step 1: Initialize SDR
from rtlsdr import RtlSdr
# Create SDR object
sdr = RtlSdr()
# Configure
sdr.sample_rate = 2.4e6 # 2.4 MHz
sdr.center_freq = 434.9e6 # 434.9 MHz
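sdr.gain = 30 # Manual gain setting
Key Parameters:
- sample_rate: How many samples per second. With complex IQ sampling the captured band spans ±sample_rate/2 around center_freq, so the rate must be at least 2× the largest frequency offset from center that you want to capture.
- center_freq: The frequency to tune to.
- gain: Amplification (0-50, or 'auto'). Increase the gain if the signal is too weak.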
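The initialization above leaves the device open if a later read fails. The sketch below wraps the same three settings in a helper and closes the dongle in a finally block. The function names configure_sdr and capture_once are hypothetical; only RtlSdr, read_samples, close, and the three attributes come from pyrtlsdr as used in this guide.

```python
def configure_sdr(sdr, sample_rate=2.4e6, center_freq=434.9e6, gain=30):
    """Apply the capture settings to an already-open device object."""
    sdr.sample_rate = sample_rate   # samples per second
    sdr.center_freq = center_freq   # Hz
    sdr.gain = gain                 # dB, or 'auto'
    return sdr


def capture_once(num_samples=256_000):
    """Open the dongle, read one block of IQ samples, and always close the device."""
    # Imported here so configure_sdr can be used (and tested) without pyrtlsdr installed.
    from rtlsdr import RtlSdr

    sdr = RtlSdr()
    try:
        configure_sdr(sdr)
        return sdr.read_samples(num_samples)
    finally:
        sdr.close()  # release the USB device even if the read fails
```

Calling capture_once() requires a connected dongle; configure_sdr works on any object that accepts those attributes.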
Step 2: Read IQ Samples in Chunks
import numpy as np

def read_iq_samples(sdr, total_samples):
    """
    Read IQ samples in chunks to avoid memory issues
    Args:
        sdr: RTL-SDR object
        total_samples: Total number of samples to capture
    Returns:
        numpy array of complex IQ samples
    """
    samples = []
    remaining = total_samples
    while remaining > 0:
        # Read in chunks of CHUNK_SIZE (256k) samples
        n = min(CHUNK_SIZE, remaining)
        chunk = sdr.read_samples(n)
        # Store as complex64 (8 bytes per sample), matching the file sizes shown later
        samples.append(np.asarray(chunk, dtype=np.complex64))
        remaining -= n
    # Combine all chunks
    return np.concatenate(samples)
Why chunks?
- Large captures (15s × 2.4MHz = 36M samples) need chunking
- Prevents memory overflow
- More stable on slower systems
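To make the chunking arithmetic concrete, the sketch below works out how many read calls a capture needs and how large the final, partial chunk is. The helper name chunk_plan is hypothetical; the constants are the ones from Signal Configuration.

```python
import math

SAMPLE_RATE = 2.4e6   # samples per second
CHUNK_SIZE = 256_000  # samples per read call


def chunk_plan(duration_s, sample_rate=SAMPLE_RATE, chunk_size=CHUNK_SIZE):
    """Return (total samples, number of reads, size of the last read)."""
    total = int(sample_rate * duration_s)
    n_reads = math.ceil(total / chunk_size)
    last = total - (n_reads - 1) * chunk_size
    return total, n_reads, last


for duration in (2.0, 15.0):
    total, n_reads, last = chunk_plan(duration)
    print(f"{duration:>4} s: {total:,} samples in {n_reads} reads (last read: {last:,})")
```

For the 15-second SSTV capture this gives 36,000,000 samples spread over 141 reads, with 160,000 samples in the last one.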
Step 3: Save as NumPy Array
import numpy as np
from datetime import datetime
# Calculate samples needed
duration = 2.0 # seconds
total_samples = int(SAMPLE_RATE * duration)
# Capture
iq_samples = read_iq_samples(sdr, total_samples)
# Create filename with timestamp
timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
filename = f"data/aprs_{timestamp}.npy"
# Save
np.save(filename, iq_samples)
File format:
- .npy = NumPy binary format
- Contains complex numbers (I + jQ)
- Efficient storage and fast loading
- Example: aprs_2026-02-22_14-30-45.npy
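The save step can be checked without hardware by round-tripping synthetic data. This is a sketch under stated assumptions: save_capture is a hypothetical helper, the complex64 cast is a choice made here to keep files small, and the random array only stands in for a real capture.

```python
import os
import tempfile
from datetime import datetime

import numpy as np


def save_capture(iq_samples, signal_type, out_dir):
    """Save IQ samples as <signal_type>_<timestamp>.npy and return the path."""
    os.makedirs(out_dir, exist_ok=True)  # np.save does not create missing directories
    timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    path = os.path.join(out_dir, f"{signal_type}_{timestamp}.npy")
    np.save(path, np.asarray(iq_samples, dtype=np.complex64))
    return path


# Round trip with synthetic data standing in for a real capture
rng = np.random.default_rng(0)
fake_iq = (rng.standard_normal(1000) + 1j * rng.standard_normal(1000)).astype(np.complex64)
with tempfile.TemporaryDirectory() as tmp:
    path = save_capture(fake_iq, "aprs", tmp)
    loaded = np.load(path)

print(os.path.basename(path), loaded.dtype, len(loaded))
```

The loaded array keeps both the dtype and the exact sample values, which is what makes .npy convenient for later training steps.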
Step 4: Interactive Capture Loop
import time

while True:
    # Show current progress
    print(f"APRS: {count_existing('aprs')}/40 samples")
    print(f"FSK: {count_existing('fsk')}/40 samples")
    print(f"SSTV: {count_existing('sstv')}/15 samples")
    # User selects signal type
    signal_type = input("Select signal (aprs/fsk/sstv): ").strip().lower()
    if signal_type not in DURATION_MAP:
        print("Unknown signal type, try again")
        continue
    # Wait for user to set CubeSatSim to correct mode
    input("Press ENTER when CubeSatSim is transmitting...")
    # Capture
    duration = DURATION_MAP[signal_type]
    total_samples = int(SAMPLE_RATE * duration)
    iq_samples = read_iq_samples(sdr, total_samples)
    # Save with a fresh timestamp so earlier captures are not overwritten
    timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    filename = f"{signal_type}_{timestamp}.npy"
    np.save(filename, iq_samples)
    # Wait for next transmission
    time.sleep(POST_CAPTURE_SLEEP[signal_type])
Understanding IQ Samples
What are IQ Samples?
IQ samples are complex numbers representing radio signals:
IQ Sample = I + jQ
Where:
I = In-phase component (real part)
Q = Quadrature component (imaginary part)
Visual representation:
        Q (Imaginary)
        ↑
        │      • Sample point
        │     (I + jQ)
        │    /
        │  /
        │/
────────┼────────→ I (Real)
       0│
        │
Why Complex Numbers?
- Captures both amplitude and phase information
- Allows frequency shifting in software
- Essential for digital signal processing
- Standard in SDR systems
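As an illustration of frequency shifting in software, the sketch below multiplies a synthetic IQ tone by a complex exponential and checks where the spectral peak ends up. The function names frequency_shift and peak_frequency are hypothetical, and the 100 kHz tone is made-up test data, not a CubeSatSim signal.

```python
import numpy as np


def frequency_shift(iq, shift_hz, sample_rate):
    """Shift the whole spectrum of iq by shift_hz (negative values shift downward)."""
    n = np.arange(len(iq))
    return iq * np.exp(2j * np.pi * shift_hz * n / sample_rate)


def peak_frequency(iq, sample_rate):
    """Frequency (Hz, relative to center) of the strongest FFT bin."""
    spectrum = np.fft.fftshift(np.fft.fft(iq))
    freqs = np.fft.fftshift(np.fft.fftfreq(len(iq), d=1 / sample_rate))
    return freqs[np.argmax(np.abs(spectrum))]


fs = 2.4e6
n = np.arange(24_000)
tone = np.exp(2j * np.pi * 100e3 * n / fs)   # synthetic tone 100 kHz above center
shifted = frequency_shift(tone, -100e3, fs)  # move it down to 0 Hz

print(f"before: {peak_frequency(tone, fs) / 1e3:.1f} kHz")
print(f"after:  {peak_frequency(shifted, fs) / 1e3:.1f} kHz")
```

With real-valued samples the positive and negative offsets would be indistinguishable; the complex representation is what lets the sign of the shift matter.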
Example IQ Sample:
import numpy as np
# Load captured signal
samples = np.load("aprs_2024-12-22.npy")
# First sample
print(samples[0])
# Example output (actual values vary): (0.015625+0.03125j)
# Extract components
i_component = np.real(samples) # Real part
q_component = np.imag(samples) # Imaginary part
# Magnitude and phase
magnitude = np.abs(samples)
phase = np.angle(samples) # In radians
Signal Characteristics
APRS (Automatic Packet Reporting System)
Modulation: AFSK (Audio FSK)
Frequency: 144.390 MHz (or your CubeSatSim freq)
Baud Rate: 1200 baud
Packet Duration: ~0.5-2 seconds
Transmission Pattern: Periodic bursts
FSK (Frequency Shift Keying)
Modulation: FSK
Frequency: 434.9 MHz (example)
Shift: ±5 kHz typical
Pattern: Continuous or burst
SSTV (Slow Scan Television)
Modulation: FM (frequency modulated audio)
Duration: 8-15 seconds per image
Pattern: Slow frequency sweeps
Contains: Image data as audio tones
Optimal Settings
Gain Settings
# Automatic gain (easiest)
sdr.gain = 'auto'
# Manual gain (better control)
sdr.gain = 30 # Range: 0-50
# Find optimal gain for your setup:
for gain in [10, 20, 30, 40]:
    sdr.gain = gain
    samples = sdr.read_samples(10000)
    print(f"Gain {gain}: avg power = {np.mean(np.abs(samples)**2)}")
Guidelines:
- Start with gain = 30
- Too low: Weak signal, lots of noise
- Too high: Saturation, distortion
- Optimal: Strong signal without clipping
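Average power alone does not reveal saturation, so a rough clipping check can complement the gain sweep. The sketch below assumes the samples are scaled so that I and Q each lie in roughly [-1, 1], as pyrtlsdr does; the 0.98 threshold and the function name clipping_fraction are assumptions, and the data is synthetic.

```python
import numpy as np


def clipping_fraction(iq, threshold=0.98):
    """Fraction of samples whose I or Q component sits at or near full scale."""
    i_clip = np.abs(np.real(iq)) >= threshold
    q_clip = np.abs(np.imag(iq)) >= threshold
    return float(np.mean(i_clip | q_clip))


rng = np.random.default_rng(1)
clean = 0.1 * (rng.standard_normal(10_000) + 1j * rng.standard_normal(10_000))
# Simulate too much gain: amplify, then clip each component to the ADC range
overdriven = np.clip(20 * clean.real, -1, 1) + 1j * np.clip(20 * clean.imag, -1, 1)

print(f"clean:      {clipping_fraction(clean):.1%} of samples near full scale")
print(f"overdriven: {clipping_fraction(overdriven):.1%} of samples near full scale")
```

If more than a small fraction of samples sit near full scale, lowering the gain is usually the first thing to try.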
Sample Rate Selection
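# With complex IQ sampling the usable bandwidth equals the sample rate
# (±sample_rate/2 around center_freq).
# For a 200 kHz wide signal → 200 kHz minimum, plus margin for filter roll-off and tuning offset
# Common rates (pick one):
sdr.sample_rate = 1.0e6 # 1 MHz
sdr.sample_rate = 2.0e6 # 2 MHz
sdr.sample_rate = 2.4e6 # 2.4 MHz (good balance)
sdr.sample_rate = 3.2e6 # 3.2 MHz (max for RTL-SDR; may drop samples on some systems)
Tradeoffs: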
- Higher rate = more detail but larger files
- Lower rate = smaller files but might miss signal features
- 2.4 MHz is a sweet spot for most signals
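The file-size side of this tradeoff is simple arithmetic. The sketch below assumes captures are stored as complex64 (8 bytes per sample), which is consistent with the 38.4 MB figure quoted for a 2-second capture in the verification section; capture_size_mb is a hypothetical helper name.

```python
BYTES_PER_SAMPLE = 8  # assumes complex64 storage: 4-byte I + 4-byte Q


def capture_size_mb(sample_rate, duration_s, bytes_per_sample=BYTES_PER_SAMPLE):
    """Approximate size of one capture in megabytes (1 MB = 1e6 bytes)."""
    return sample_rate * duration_s * bytes_per_sample / 1e6


for rate in (1.0e6, 2.0e6, 2.4e6):
    short = capture_size_mb(rate, 2.0)    # APRS/FSK capture length
    long_ = capture_size_mb(rate, 15.0)   # SSTV capture length
    print(f"{rate / 1e6:.1f} MHz: 2 s = {short:.1f} MB, 15 s = {long_:.1f} MB")
```

At 2.4 MHz a single 15-second SSTV capture is about 288 MB, so the sample rate has a direct effect on how much disk space the full dataset needs.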
Troubleshooting
Problem: "No devices found"
# Check if dongle is detected
lsusb | grep Realtek
# Expected output:
Bus 001 Device 005: ID 0bda:2838 Realtek RTL2838 DVB-T
# If not found:
sudo apt-get install rtl-sdr
# Then unplug and replug the dongle
Problem: No signal captured
Checklist:
1. Antenna connected?
2. Correct frequency?
3. CubeSatSim transmitting? (The same applies to any other transmitter.)
4. Gain setting appropriate?
Debug capture:
# Capture short sample
samples = sdr.read_samples(10000)
# Check if signal present
power = np.mean(np.abs(samples)**2)
print(f"Average power: {power}")
# Rough guide (the exact values depend on gain and setup):
# > 0.001 suggests a signal is present
# < 0.0001 suggests no signal
Problem: USB disconnects
# Add error handling
try:
    samples = sdr.read_samples(total_samples)
except Exception as e:
    print(f"Error: {e}")
    sdr.close()
    sdr = RtlSdr() # Reinitialize
    sdr.sample_rate = SAMPLE_RATE
    sdr.center_freq = CENTER_FREQ
    sdr.gain = 30 # Restore the gain too; a new object starts with default settings
Verifying Captured Data
Check File Integrity
import os
import numpy as np
import matplotlib.pyplot as plt
SAMPLE_RATE = 2.4e6 # Must match the rate used during capture
# Load file
samples = np.load("data/aprs_2024-12-22.npy")
print(f"Sample count: {len(samples)}")
print(f"Data type: {samples.dtype}")
print(f"Duration: {len(samples) / SAMPLE_RATE:.2f} seconds")
print(f"File size: {os.path.getsize('data/aprs_2024-12-22.npy') / 1e6:.1f} MB")
# Plot first 1000 samples
plt.figure(figsize=(12, 4))
plt.plot(np.real(samples[:1000]), label='I (Real)', alpha=0.7)
plt.plot(np.imag(samples[:1000]), label='Q (Imag)', alpha=0.7)
plt.legend()
plt.title('IQ Samples - Time Domain')
plt.xlabel('Sample Number')
plt.ylabel('Amplitude')
plt.grid(True)
plt.show()
Expected output:
Sample count: 4800000
Data type: complex64
Duration: 2.00 seconds
File size: 38.4 MB
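The same checks can be automated so that every capture is tested before it goes into the dataset. The sketch below is one possible set of sanity checks, not the checks used to build this dataset: the function name check_capture, the tolerance, and the specific conditions are all assumptions, and it runs here on synthetic arrays.

```python
import numpy as np


def check_capture(samples, sample_rate, expected_duration_s, tolerance_s=0.01):
    """Return a list of problems found in a capture (an empty list means it looks sane)."""
    problems = []
    if not np.iscomplexobj(samples):
        problems.append("samples are not complex")
    if len(samples) == 0:
        problems.append("capture is empty")
        return problems
    if not np.all(np.isfinite(samples)):
        problems.append("capture contains NaN or inf")
    duration = len(samples) / sample_rate
    if abs(duration - expected_duration_s) > tolerance_s:
        problems.append(f"duration {duration:.2f}s differs from expected {expected_duration_s:.2f}s")
    if np.all(samples == samples[0]):
        problems.append("all samples are identical")
    return problems


rng = np.random.default_rng(3)
good = (rng.standard_normal(4800) + 1j * rng.standard_normal(4800)).astype(np.complex64)
dead = np.zeros(4800, dtype=np.complex64)

print("good:", check_capture(good, sample_rate=2400, expected_duration_s=2.0))
print("dead:", check_capture(dead, sample_rate=2400, expected_duration_s=2.0))
```

For a real file, load it with np.load and pass the capture's actual sample rate and expected duration.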
Quick Spectrogram Check
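from scipy import signal
# Create spectrogram (two-sided, because IQ samples are complex)
f, t, Sxx = signal.spectrogram(
    samples,
    fs=SAMPLE_RATE,
    nperseg=512,
    noverlap=256,
    return_onesided=False
)
# Reorder so frequencies run from -fs/2 to +fs/2 instead of FFT order
f = np.fft.fftshift(f)
Sxx = np.fft.fftshift(Sxx, axes=0)
# Plot (the small offset avoids log10(0))
plt.figure(figsize=(12, 6))
plt.pcolormesh(t, f/1e6, 10*np.log10(Sxx + 1e-12), shading='gouraud', cmap='viridis')
plt.ylabel('Frequency offset from center (MHz)')
plt.xlabel('Time (sec)')
plt.title('Quick Spectrogram Check')
plt.colorbar(label='Power (dB)')
plt.show()
What to look for: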
- Clear signal patterns (not just noise)
- Expected frequency range
- Signal visible above noise floor
- Duration matches expected
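"Signal visible above noise floor" can also be estimated numerically. The sketch below compares the strongest bin of an averaged power spectrum with the median bin, which serves as a rough noise-floor estimate. The function name peak_over_floor_db is hypothetical, the data is synthetic, and any pass/fail threshold would have to be chosen for your own setup.

```python
import numpy as np


def peak_over_floor_db(iq, nfft=1024):
    """Ratio in dB of the strongest bin to the median bin of an averaged power spectrum."""
    n_segments = len(iq) // nfft
    segments = np.reshape(iq[: n_segments * nfft], (n_segments, nfft))
    window = np.hanning(nfft)
    # Average the windowed power spectra of all segments to smooth out noise
    psd = np.mean(np.abs(np.fft.fft(segments * window, axis=1)) ** 2, axis=0)
    return 10 * np.log10(np.max(psd) / np.median(psd))


rng = np.random.default_rng(2)
n = np.arange(102_400)
noise = 0.05 * (rng.standard_normal(n.size) + 1j * rng.standard_normal(n.size))
tone = 0.5 * np.exp(2j * np.pi * 0.1 * n)  # synthetic narrowband signal

print(f"noise only:   {peak_over_floor_db(noise):.1f} dB")
print(f"tone + noise: {peak_over_floor_db(noise + tone):.1f} dB")
```

A capture that contains only noise gives a value of a few dB at most, while a capture with a clear narrowband signal gives a much larger one.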
Dataset Organization
Recommended Structure
data/
├── real_signals/
│ ├── aprs_2024-12-22_14-30-00.npy
│ ├── aprs_2024-12-22_14-32-00.npy
│ ├── ...
│ ├── fsk_2024-12-22_14-35-00.npy
│ ├── fsk_2024-12-22_14-35-30.npy
│ ├── ...
│ ├── sstv_2024-12-22_14-40-00.npy
│ └── sstv_2024-12-22_14-40-15.npy
└── metadata.txt ← Record capture conditions
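With this naming scheme, the number of captures per signal type can be read straight from the filenames. The sketch below plays a similar role to the count_existing helper used in the capture loop, whose implementation is not shown in this guide; count_captures and SIGNAL_TYPES are hypothetical names, and the demo uses empty placeholder files in a temporary directory.

```python
import os
import tempfile

SIGNAL_TYPES = ("aprs", "fsk", "sstv")


def count_captures(directory):
    """Count .npy files per signal type, based on the filename prefix before the first '_'."""
    counts = {name: 0 for name in SIGNAL_TYPES}
    for filename in os.listdir(directory):
        prefix = filename.split("_", 1)[0]
        if prefix in counts and filename.endswith(".npy"):
            counts[prefix] += 1
    return counts


# Demo with empty placeholder files standing in for real captures
with tempfile.TemporaryDirectory() as tmp:
    for name in ("aprs_2024-12-22_14-30-00.npy", "aprs_2024-12-22_14-32-00.npy",
                 "fsk_2024-12-22_14-35-00.npy", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    demo_counts = count_captures(tmp)

print(demo_counts)
```

Files that do not follow the naming scheme are ignored, which is another reason to keep the labels consistent.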
Metadata Logging
# Log capture conditions
with open("data/metadata.txt", "a") as f:
f.write(f"{timestamp} | {signal_type} | "
f"Freq: {CENTER_FREQ/1e6}MHz | "
f"Gain: {sdr.gain} | "
f"Samples: {len(samples)}\n")Best Practices
1. Label Data Clearly
# Good naming:
"aprs_2026-02-22_14-30-00.npy"
# Bad naming:
"signal1.npy"
"test.npy"
2. Capture Variety
- Different times of day
- Different transmit powers
- Different noise conditions
- Different signal strengths
3. Quality Over Quantity
- 40 good samples > 100 poor samples
- Verify each capture visually
- Delete corrupted/empty captures
- Keep consistent labeling
4. Document Everything
# At start of script
"""
CAPTURE SESSION: 2025-12-22
CubeSatSim Settings:
- Mode: APRS
- Power: 10 dBm
- Frequency: 434.9 MHz
SDR Settings:
- Gain: 30
- Sample Rate: 2.4 MHz
- Antenna: Dipole
Environment:
- Location: Indoor lab
- Distance: 3 meters
- Interference: Minimal
"""
Next Steps
After capturing data:
- Convert to Spectrograms
python src/2_create_spectrograms_real.py
- Train CNN Model
python src/4_train_real.py
- Evaluate Performance
python src/5_evaluate_real.py
Additional Resources
- pyrtlsdr Documentation: https://pyrtlsdr.readthedocs.io/
- RTL-SDR Blog: https://www.rtl-sdr.com/
- DSP Guide: http://www.dspguide.com/
- GNU Radio Tutorials: https://wiki.gnuradio.org/
Key Takeaways
- IQ samples capture complete signal information (amplitude + phase)
- Chunked reading prevents memory issues with large captures
- Signal-specific parameters optimize for different transmission types
- Proper labeling is critical for machine learning
- Verification ensures data quality before training
Author: Himanshu Suri
Date: Feb 2026