Crypto Trader — Analysis Module¶

Machine learning and market analytics engine¶

The Analysis module powers Crypto Trader’s price prediction and model training. It ingests market data from PostgreSQL, engineers features, trains LSTM‑based models (including multi‑layer variants), generates predictions, and reports results back to the platform.

Important: Past results and backtests do not guarantee future performance. Always start in paper mode. Liability is your own.

⭐️ What this module is¶

A Python analytics service used by Crypto Trader’s core platform.
Strategic ML engine: LSTM, complex LSTM, and multi‑layer LSTM variants.
Feature engineering pipeline with robust scaling and time‑windowed sequences.
Training, prediction, and reporting utilities that integrate with the Django API.

🧭 Key capabilities¶

Data access
Pulls market snapshots from PostgreSQL via SQLAlchemy; bulk COPY for speed.
Supports multiple query modes: current_price, historical, historical_spaced.
Feature engineering
Sliding time windows (configurable sequence_length).
Engineered channels: past trends, key timestamps, correlation with target returns.
Multi‑scale preprocessing for multi‑layer models (short/medium/long horizons).
Models and training
LSTM models with Keras/TensorFlow; early stopping, LR scheduling, checkpoints.
Multi‑layer models combine short, medium, and long sequences.
GPU execution when available; CPU fallback supported.
Predictions and reporting
Generates next‑step price predictions for a target currency.
Posts predictions and training session summaries to the platform API.
TensorBoard logs for model inspection.

🔗 How it fits into Crypto Trader¶

Input: Market snapshot data from the platform database (PostgreSQL).
Output: Trained model artifacts under models/..., predicted prices, metrics.
Integration: Sends structured payloads to the Django API.

🧰 Operation¶

This module is operated by project owners alongside the Django API and two GPUs for training.

Environment and data
Database: PostgreSQL
Market snapshots table must be populated; currencies.json is used for price column discovery.
GPU setup
Two‑GPU operation is supported. Utilities in the codebase and the scripts under gpu/ demonstrate splitting currency workloads across devices.
Example GPU launcher scripts live under gpu/ (e.g., gpu/run_gpu_zero.py).
Common operator actions
Train a single‑currency LSTM model
- Entry: src/apps/learning/models/training/training_session.py
Train a multi‑layer model (short/medium/long sequences)
- Entry: src/apps/learning/models/training/training_session.py
Make a one‑off prediction for a currency
- Entry: src/apps/learning/models/training/training_session.py
Prediction and reporting lifecycle
- Entry: src/apps/learning/models/training/training_session.py
- Note: TrainingSession.train() will compute a prediction after training and report it via the platform integration.
- Standalone utilities for sending predictions also exist under apps/learning/models/prediction/predictions.py if needed.
Run a GPU‑oriented launcher
- Scripts under gpu/ (e.g., gpu/run_gpu_zero.py, gpu/run_gpu_one.py).

Notes: - Training hyperparameters can be chosen via TrainingType (src/apps/learning/models/training/training_type.py) or by building a TrainingModel. - Multi‑layer models live under src/apps/learning/models/ai/lstm/layered/....

Configuration¶

1. 🏆 Optimal Configuration (Maximum Accuracy)¶

# Optimal Configuration for High-End Systems
# Maximizes learning capacity for 440-column feature vectors
dataset_size: MASSIVE          # 1,500,000 records (Current maximum enum value)
epochs: COMPLETE_ANALYSIS      # 200 epochs
batch_size: GENERALIZED        # 128 (Better for large feature sets on 16GB VRAM)
sequence_length: 20            # Increased for better temporal context
query_type: HISTORICAL_PRICE
model_type: COMPLEX_MULTI_LAYER
use_currency_generator: true
generator_use_cache: true
split_currencies_by_gpu: true  # Highly recommended for dual GPU setup

2. ⚡ High Performance / Balanced Configuration¶

# Balanced Configuration
# Faster training cycles with high accuracy
dataset_size: EXTRA_LARGE      # 600,000 records
epochs: STANDARD_ANALYSIS      # 50 epochs
batch_size: BALANCED           # 64
sequence_length: 15
query_type: HISTORICAL_PRICE
model_type: MULTI_LAYER
use_currency_generator: true
generator_use_cache: true
split_currencies_by_gpu: true

3. 🧪 Rapid Iteration / Testing Configuration¶

# Fast Testing Configuration
# Quick feedback on model convergence
dataset_size: STANDARD         # 225,000 records
epochs: SMALL_ANALYSIS         # 10 epochs
batch_size: DETAILED           # 32
sequence_length: 10
query_type: HISTORICAL_PRICE
model_type: LSTM
use_currency_generator: false
target_currency: BTC           # Focus on one currency for testing

🔒 Safety, privacy, and control¶

This module does not manage exchange API keys directly; it trains/predicts from database data.
Start with small datasets and paper trading.
Guardrails (e.g., position sizing, stop loss) are enforced by the trading engine, not this module.

🛠️ Technology in this module¶

Python 3
TensorFlow + Keras
NumPy, Pandas, scikit‑learn
SQLAlchemy (COPY to CSV optimization)
Requests (HTTP)
attrs / typing_extensions

❓ Questions or help¶

Email Oliver Lear Sigwarth (@theoliverlear): sigwarthsoftware@gmail.com

📄 License¶

See LICENSE.md in the repository root.