AI for Healthcare & Finance: From Medical Imaging to Algorithmic Trading

Overview

Artificial intelligence is driving unprecedented transformation in two of the most consequential industries: healthcare and finance. From deep learning models that assist radiologists in reading medical images, to algorithmic trading systems that execute thousands of transactions per second, AI has become core infrastructure in both fields. This guide explores key applications in each domain with practical, production-oriented code examples.


Part 1: Healthcare AI

1. Medical Imaging AI

Medical image analysis is one of the highest-impact areas for deep learning. CheXNet (Stanford) demonstrated radiologist-level pneumonia detection from chest X-rays, establishing a benchmark for AI-assisted diagnostics that has driven extensive follow-on research.

Processing DICOM Images

DICOM is the standard format for medical imaging. The following pipeline loads and preprocesses a DICOM file for inference.

import pydicom
import numpy as np
from torchvision import transforms
import torch

# Load DICOM image
ds = pydicom.dcmread('chest_xray.dcm')
img_array = ds.pixel_array.astype(np.float32)

# Apply rescale slope/intercept when present (maps stored values to output units)
slope = float(getattr(ds, 'RescaleSlope', 1))
intercept = float(getattr(ds, 'RescaleIntercept', 0))
img_array = img_array * slope + intercept

# Min-max normalize to [0, 1]
img_normalized = (img_array - img_array.min()) / (img_array.max() - img_array.min())

# Preprocessing for model inference
transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485], [0.229])
])

tensor = transform(img_normalized).unsqueeze(0)

Medical Image Segmentation with U-Net

U-Net is the dominant architecture for segmenting anatomical structures and pathologies in MRI and CT scans.

import torch
import torch.nn as nn

class DoubleConv(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True)
        )
    def forward(self, x):
        return self.conv(x)

class UNet(nn.Module):
    def __init__(self, in_channels=1, num_classes=2):
        super().__init__()
        self.enc1 = DoubleConv(in_channels, 64)
        self.enc2 = DoubleConv(64, 128)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = DoubleConv(128, 256)
        self.up1 = nn.ConvTranspose2d(256, 128, 2, stride=2)
        self.dec1 = DoubleConv(256, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = DoubleConv(128, 64)
        self.out_conv = nn.Conv2d(64, num_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d1 = self.dec1(torch.cat([self.up1(b), e2], dim=1))
        d2 = self.dec2(torch.cat([self.up2(d1), e1], dim=1))
        return self.out_conv(d2)

Evaluation Metrics: Dice Score and IoU (Intersection over Union) are the primary metrics for medical segmentation. Dice Score measures the overlap between prediction and ground truth, and is far more informative than accuracy when dealing with the severe foreground-background imbalance typical in medical images.
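
To make the contrast with accuracy concrete, both metrics can be computed directly from binary masks. A minimal NumPy sketch (function names are ours):

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice = 2|A∩B| / (|A| + |B|) for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def iou_score(pred, target, eps=1e-7):
    """IoU = |A∩B| / |A∪B| for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (intersection + eps) / (union + eps)

# A prediction that misses half the foreground: Dice penalizes the miss,
# while pixel accuracy on a mostly-background image would barely move
pred = np.array([[1, 1, 0, 0]])
target = np.array([[1, 1, 1, 1]])
print(dice_score(pred, target))  # ≈ 0.667
print(iou_score(pred, target))   # ≈ 0.5
```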

Whole Slide Images (WSI): Pathology slides can be gigapixel images. The standard approach is multi-instance learning (MIL) with patch extraction — the slide is divided into tiles, each tile is embedded by a CNN, and the tile embeddings are aggregated (e.g., attention-based pooling) to produce a slide-level prediction.
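
The attention-based aggregation step can be sketched as a small PyTorch module, a stripped-down version of attention MIL pooling; all dimensions below are illustrative:

```python
import torch
import torch.nn as nn

class AttentionMILPooling(nn.Module):
    """Aggregate N tile embeddings into one slide-level embedding
    using a learned attention weight per tile."""
    def __init__(self, embed_dim=512, attn_dim=128):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(embed_dim, attn_dim),
            nn.Tanh(),
            nn.Linear(attn_dim, 1)
        )

    def forward(self, tile_embeddings):  # (N, embed_dim)
        # Softmax over tiles: high-weight tiles dominate the slide embedding
        weights = torch.softmax(self.attention(tile_embeddings), dim=0)  # (N, 1)
        return (weights * tile_embeddings).sum(dim=0)  # (embed_dim,)

tiles = torch.randn(1000, 512)  # 1000 tile embeddings from a CNN backbone
slide_embedding = AttentionMILPooling()(tiles)
print(slide_embedding.shape)  # torch.Size([512])
```

A slide-level classifier head would then operate on `slide_embedding`; the attention weights also provide a rough interpretability signal showing which tiles drove the prediction.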


2. Clinical NLP

Electronic Health Records (EHR) contain vast amounts of unstructured text. NLP models extract clinically meaningful information from physician notes, discharge summaries, and radiology reports.

Medical Named Entity Recognition

from transformers import pipeline

# BioBERT-based NER pipeline
# Note: dmis-lab/biobert-v1.1 is a base (pre-trained) checkpoint; for
# meaningful entity labels, load a checkpoint fine-tuned for token
# classification on a biomedical NER corpus (e.g. BC5CDR or NCBI-disease)
ner_pipeline = pipeline(
    "ner",
    model="dmis-lab/biobert-v1.1",
    tokenizer="dmis-lab/biobert-v1.1",
    aggregation_strategy="simple"
)

clinical_text = """
Patient presents with acute myocardial infarction.
Administered aspirin 325mg and clopidogrel 600mg.
ECG shows ST elevation in leads V1-V4.
"""

entities = ner_pipeline(clinical_text)
for ent in entities:
    print(f"Entity: {ent['word']}, Label: {ent['entity_group']}, Score: {ent['score']:.3f}")

Clinical Text Summarization

Med-PaLM 2 (Google) and BioMedLM (Stanford) are domain-specific large language models fine-tuned on clinical and biomedical text. They significantly outperform general-purpose models on tasks such as clinical question answering, discharge summary generation, and medical knowledge retrieval.
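
Neither Med-PaLM 2 nor its weights are publicly downloadable, but the summarization workflow can be sketched with a small general-purpose checkpoint standing in; the model name below is a common open baseline, not a clinical system, and a real deployment would swap in a model fine-tuned on clinical text:

```python
from transformers import pipeline

# Placeholder model: general-purpose, NOT validated for clinical use
summarizer = pipeline("summarization", model="t5-small")

note = (
    "Patient is a 67-year-old male admitted with chest pain radiating to the "
    "left arm. Troponin elevated. ECG shows ST elevation in leads V1-V4. "
    "Cardiac catheterization revealed 90% occlusion of the LAD; drug-eluting "
    "stent placed. Discharged on aspirin, clopidogrel, and atorvastatin."
)

summary = summarizer(note, max_length=60, min_length=10, do_sample=False)
print(summary[0]['summary_text'])
```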

PhysioNet & MIMIC-III: The MIMIC-III Clinical Database (available at PhysioNet: https://physionet.org/content/mimiciii/) is the standard benchmark for clinical NLP research. It contains de-identified EHR data including discharge summaries, radiology reports, and nursing notes from over 40,000 ICU patients.


3. AI-Accelerated Drug Discovery

Traditional drug development takes 10-15 years and costs billions of dollars. AI is accelerating multiple stages of this pipeline.

Molecular Property Prediction with RDKit

from rdkit import Chem
from rdkit.Chem import Descriptors
import pandas as pd

def compute_molecular_features(smiles_list):
    features = []
    for smiles in smiles_list:
        mol = Chem.MolFromSmiles(smiles)
        if mol:
            features.append({
                'MW': Descriptors.MolWt(mol),
                'LogP': Descriptors.MolLogP(mol),
                'HBD': Descriptors.NumHDonors(mol),
                'HBA': Descriptors.NumHAcceptors(mol),
                'TPSA': Descriptors.TPSA(mol)
            })
    return pd.DataFrame(features)

def lipinski_filter(df):
    """Lipinski's Rule of Five: filter for oral bioavailability"""
    return df[
        (df['MW'] <= 500) &
        (df['LogP'] <= 5) &
        (df['HBD'] <= 5) &
        (df['HBA'] <= 10)
    ]

smiles_list = [
    'CC(=O)Oc1ccccc1C(=O)O',  # Aspirin
    'CC12CCC3C(C1CCC2O)CCC4=CC(=O)CCC34C',  # Testosterone
]
df = compute_molecular_features(smiles_list)
drug_candidates = lipinski_filter(df)

AlphaFold2 and Protein Structure Prediction

DeepMind's AlphaFold2 (https://alphafold.ebi.ac.uk/) predicts 3D protein structures from amino acid sequences with near-experimental accuracy. It has fundamentally changed structural biology by making protein structure prediction, a grand challenge that resisted solution for over 50 years, largely tractable.

# ColabFold (AlphaFold2 interface) usage sketch
# pip install colabfold
# Note: run()'s keyword arguments vary across ColabFold versions; this call is
# illustrative, so check the documentation for the version you install.

from colabfold.batch import run

queries = [("target_protein", "MKTIIALSYIFCLVFA")]

results = run(
    queries=queries,
    result_dir="./alphafold_results",
    use_templates=False,
    num_recycles=3,
    model_type="auto"
)

DeepMind has publicly released over 200 million predicted protein structures. Drug developers use these structures for virtual screening and structure-based drug design, dramatically reducing the time needed to identify promising binding sites.

Graph Neural Networks for Molecular Generation

GNNs represent molecules as graphs — atoms as nodes, bonds as edges — and learn chemical patterns for property prediction and de novo molecular generation.

import torch
from torch_geometric.nn import GCNConv, global_mean_pool

class MoleculeGNN(torch.nn.Module):
    def __init__(self, num_features, hidden_dim, num_classes):
        super().__init__()
        self.conv1 = GCNConv(num_features, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)
        self.classifier = torch.nn.Linear(hidden_dim, num_classes)
        self.relu = torch.nn.ReLU()

    def forward(self, data):
        x, edge_index, batch = data.x, data.edge_index, data.batch
        x = self.relu(self.conv1(x, edge_index))
        x = self.relu(self.conv2(x, edge_index))
        x = global_mean_pool(x, batch)
        return self.classifier(x)

4. Wearables and AI

Consumer wearables generate continuous streams of physiological data that enable real-time health monitoring at scale.

ECG Arrhythmia Classification with 1D CNN

import torch
import torch.nn as nn

class ECGClassifier(nn.Module):
    """1D CNN for ECG arrhythmia classification"""
    def __init__(self, num_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=11, padding=5),
            nn.BatchNorm1d(32),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=7, padding=3),
            nn.BatchNorm1d(64),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(64, 128, kernel_size=5, padding=2),
            nn.BatchNorm1d(128),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(32)
        )
        self.classifier = nn.Sequential(
            nn.Linear(128 * 32, 256),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(256, num_classes)
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)
        return self.classifier(x)

# PhysioNet MIT-BIH Arrhythmia Database
# Reference: https://physionet.org/content/mitdb/1.0.0/

Human Activity Recognition (HAR): Accelerometer and gyroscope data from wearables can classify activities such as walking, running, and climbing stairs. Transformer-based models have recently surpassed CNN/LSTM baselines on HAR benchmarks.
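
A minimal transformer-style HAR classifier over fixed-length windows of 3-axis accelerometer samples might look like this; every dimension and hyperparameter below is illustrative:

```python
import torch
import torch.nn as nn

class HARTransformer(nn.Module):
    """Classify windows of (accel_x, accel_y, accel_z) samples into activities."""
    def __init__(self, num_channels=3, d_model=64, num_classes=6, seq_len=128):
        super().__init__()
        self.input_proj = nn.Linear(num_channels, d_model)
        # Learned positional embedding, one vector per time step
        self.pos_embed = nn.Parameter(torch.zeros(1, seq_len, d_model))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, dim_feedforward=128, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, x):                  # x: (batch, seq_len, channels)
        h = self.input_proj(x) + self.pos_embed
        h = self.encoder(h)
        return self.head(h.mean(dim=1))    # mean-pool over time, then classify

model = HARTransformer()
logits = model(torch.randn(8, 128, 3))     # 8 windows of 128 samples each
print(logits.shape)  # torch.Size([8, 6])
```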


Part 2: Finance AI

5. Stock Price Prediction

Stock price prediction is among the most studied and most difficult problems in finance. The Efficient Market Hypothesis (EMH) posits that public information is already reflected in prices, making consistent alpha generation from price data alone extremely challenging.

Feature Engineering with Technical Indicators

import yfinance as yf
import pandas as pd
import numpy as np

# Download historical price data
# Reference: https://pypi.org/project/yfinance/
ticker = yf.Ticker("AAPL")
df = ticker.history(period="5y")

def compute_rsi(prices, window=14):
    delta = prices.diff()
    gain = (delta.where(delta > 0, 0)).rolling(window).mean()
    loss = (-delta.where(delta < 0, 0)).rolling(window).mean()
    rs = gain / loss
    return 100 - (100 / (1 + rs))

def add_technical_indicators(df):
    df['SMA20'] = df['Close'].rolling(20).mean()
    df['SMA50'] = df['Close'].rolling(50).mean()
    df['EMA12'] = df['Close'].ewm(span=12).mean()
    df['EMA26'] = df['Close'].ewm(span=26).mean()
    df['MACD'] = df['EMA12'] - df['EMA26']
    df['Signal'] = df['MACD'].ewm(span=9).mean()
    df['RSI'] = compute_rsi(df['Close'], 14)
    df['BB_upper'] = df['SMA20'] + 2 * df['Close'].rolling(20).std()
    df['BB_lower'] = df['SMA20'] - 2 * df['Close'].rolling(20).std()
    df['Volatility'] = df['Close'].pct_change().rolling(20).std()
    df['Volume_MA'] = df['Volume'].rolling(20).mean()
    return df.dropna()

df = add_technical_indicators(df)

LSTM for Time Series Forecasting

import torch
import torch.nn as nn
from sklearn.preprocessing import MinMaxScaler  # scale features to [0, 1] before training

class StockLSTM(nn.Module):
    def __init__(self, input_size, hidden_size=128, num_layers=2, dropout=0.2):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=input_size,
            hidden_size=hidden_size,
            num_layers=num_layers,
            batch_first=True,
            dropout=dropout
        )
        self.fc = nn.Sequential(
            nn.Linear(hidden_size, 64),
            nn.ReLU(),
            nn.Linear(64, 1)
        )

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.fc(out[:, -1, :])

# Chronological split (prevents look-ahead bias). Full walk-forward validation
# goes further: retrain on an expanding or rolling window and evaluate on each
# subsequent out-of-sample segment; this single split is the simplest
# temporally sound baseline.
def walk_forward_split(df, train_size=0.7, val_size=0.15):
    n = len(df)
    train_end = int(n * train_size)
    val_end = int(n * (train_size + val_size))
    return df[:train_end], df[train_end:val_end], df[val_end:]

Critical Warning: Never use random train/test splits on time series data. This introduces look-ahead bias — future data leaks into the training set — and produces backtesting results that are far better than what you would achieve in live trading.


6. Algorithmic Trading

Moving Average Crossover Strategy with Backtrader

import backtrader as bt
# Reference: https://www.backtrader.com/docu/

class MACrossStrategy(bt.Strategy):
    params = (('fast', 10), ('slow', 30),)

    def __init__(self):
        self.sma_fast = bt.indicators.SMA(period=self.p.fast)
        self.sma_slow = bt.indicators.SMA(period=self.p.slow)
        self.crossover = bt.indicators.CrossOver(self.sma_fast, self.sma_slow)

    def next(self):
        if not self.position:
            if self.crossover > 0:
                self.buy(size=100)
        elif self.crossover < 0:
            self.sell(size=100)

# Run backtest — a strategy needs a data feed; here we assume `df` is the
# OHLCV DataFrame downloaded earlier with yfinance
cerebro = bt.Cerebro()
cerebro.adddata(bt.feeds.PandasData(dataname=df))
cerebro.addstrategy(MACrossStrategy)
cerebro.broker.setcash(100000.0)
cerebro.broker.setcommission(commission=0.001)
results = cerebro.run()
print(f"Final Portfolio Value: {cerebro.broker.getvalue():.2f}")

Markowitz Mean-Variance Portfolio Optimization

import numpy as np
from scipy.optimize import minimize

def portfolio_stats(weights, returns):
    port_return = np.sum(returns.mean() * weights) * 252
    port_vol = np.sqrt(weights @ returns.cov() @ weights * 252)
    sharpe = port_return / port_vol
    return port_return, port_vol, sharpe

def min_variance_portfolio(returns):
    n = returns.shape[1]
    constraints = {'type': 'eq', 'fun': lambda w: np.sum(w) - 1}
    bounds = [(0, 1)] * n
    result = minimize(
        lambda w: portfolio_stats(w, returns)[1],
        x0=np.ones(n) / n,
        bounds=bounds,
        constraints=constraints
    )
    return result.x
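
The same machinery extends to a maximum-Sharpe portfolio by flipping the objective. A self-contained sketch on synthetic daily returns (the data is random, purely for illustration):

```python
import numpy as np
import pandas as pd
from scipy.optimize import minimize

rng = np.random.default_rng(0)
# Synthetic daily returns: three assets with different volatilities
returns = pd.DataFrame(
    rng.normal(0.0005, [0.01, 0.02, 0.015], size=(1000, 3)),
    columns=['A', 'B', 'C']
)

def neg_sharpe(weights, returns):
    """Negative annualized Sharpe ratio (to minimize)."""
    port_return = np.sum(returns.mean() * weights) * 252
    port_vol = np.sqrt(weights @ returns.cov().values @ weights * 252)
    return -port_return / port_vol

n = returns.shape[1]
result = minimize(
    neg_sharpe,
    x0=np.ones(n) / n,
    args=(returns,),
    bounds=[(0, 1)] * n,                                    # long-only
    constraints={'type': 'eq', 'fun': lambda w: np.sum(w) - 1}  # fully invested
)
weights = result.x
print(np.round(weights, 3))  # weights sum to 1
```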

Reinforcement Learning for Trading (PPO)

import gym  # legacy Gym API (4-tuple step); stable-baselines3 >= 2.0 expects gymnasium
import numpy as np
from stable_baselines3 import PPO

class TradingEnv(gym.Env):
    """Simplified stock trading environment for RL"""
    def __init__(self, df, initial_balance=10000):
        super().__init__()
        self.df = df
        self.initial_balance = initial_balance
        # Actions: 0=Hold, 1=Buy, 2=Sell
        self.action_space = gym.spaces.Discrete(3)
        self.observation_space = gym.spaces.Box(
            low=-np.inf, high=np.inf, shape=(10,), dtype=np.float32
        )
        self.reset()

    def reset(self):
        self.balance = self.initial_balance
        self.shares = 0
        self.current_step = 0
        return self._get_obs()

    def _get_obs(self):
        row = self.df.iloc[self.current_step]
        return np.array([
            row['Close'], row['SMA20'], row['SMA50'],
            row['RSI'], row['MACD'], row['Volatility'],
            self.balance, self.shares,
            row['Volume'], row['BB_upper'] - row['BB_lower']
        ], dtype=np.float32)

    def step(self, action):
        price = self.df.iloc[self.current_step]['Close']
        reward = 0
        if action == 1 and self.balance >= price:
            self.shares += 1
            self.balance -= price
        elif action == 2 and self.shares > 0:
            self.shares -= 1
            self.balance += price
            reward = price - self.df.iloc[max(0, self.current_step - 1)]['Close']
        self.current_step += 1
        done = self.current_step >= len(self.df) - 1
        return self._get_obs(), reward, done, {}

model = PPO("MlpPolicy", TradingEnv(df), verbose=1)
model.learn(total_timesteps=100000)

7. Fraud Detection

Credit card fraud detection involves extreme class imbalance: fraudulent transactions typically represent less than 0.1% of all transactions.
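
Because of this imbalance, Precision-Recall AUC is a far more informative headline metric than accuracy. A quick demonstration on synthetic imbalanced data (all numbers illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

# ~0.5% positive class, mimicking fraud rarity
X, y = make_classification(
    n_samples=20000, n_features=20, weights=[0.995], random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

clf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]

# The trivial "predict all legitimate" baseline is already ~99.5% accurate;
# PR-AUC instead measures how well the model ranks the rare fraud cases
print(f"ROC-AUC: {roc_auc_score(y_te, scores):.3f}")
print(f"PR-AUC : {average_precision_score(y_te, scores):.3f}")
```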

SMOTE Oversampling and Isolation Forest

from sklearn.ensemble import IsolationForest
from imblearn.over_sampling import SMOTE
from sklearn.metrics import classification_report, roc_auc_score
import numpy as np

# Oversample minority class — apply ONLY to training data
smote = SMOTE(sampling_strategy=0.1, random_state=42)
X_resampled, y_resampled = smote.fit_resample(X_train, y_train)

# Unsupervised anomaly detection
iso_forest = IsolationForest(
    n_estimators=200,
    contamination=0.001,
    random_state=42
)
iso_forest.fit(X_train)
anomaly_scores = iso_forest.decision_function(X_test)

# Model explainability with SHAP — here `model` stands for a previously
# fitted tree-based classifier (e.g. a gradient-boosted model trained on
# X_resampled, y_resampled); TreeExplainer requires a tree-based estimator
import shap
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test[:100])
shap.summary_plot(shap_values, X_test[:100], feature_names=feature_names)

Autoencoder-Based Anomaly Detection

Train an autoencoder only on normal transactions. At inference time, fraudulent transactions have high reconstruction error because the autoencoder was never trained to reconstruct them.

import torch
import torch.nn as nn

class FraudAutoencoder(nn.Module):
    def __init__(self, input_dim, encoding_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Linear(32, encoding_dim)
        )
        self.decoder = nn.Sequential(
            nn.Linear(encoding_dim, 32),
            nn.ReLU(),
            nn.Linear(32, 64),
            nn.ReLU(),
            nn.Linear(64, input_dim)
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def detect_fraud(model, x, threshold=0.5):
    with torch.no_grad():
        reconstructed = model(x)
    reconstruction_error = torch.mean((x - reconstructed) ** 2, dim=1)
    return reconstruction_error > threshold

8. Credit Risk Modeling

Under Basel III, financial institutions must quantify credit risk. Expected Loss is computed as:

  • EL = PD × LGD × EAD
    • PD (Probability of Default): likelihood the borrower will default
    • LGD (Loss Given Default): fraction of exposure lost in default
    • EAD (Exposure at Default): outstanding balance at time of default

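
The formula itself is a one-liner per loan, and portfolio expected loss is the sum over exposures. The risk parameters below are made up for illustration:

```python
import numpy as np

# Per-loan risk parameters (illustrative values)
pd_ = np.array([0.02, 0.05, 0.10])           # probability of default
lgd = np.array([0.45, 0.40, 0.60])           # loss given default
ead = np.array([100_000, 250_000, 50_000])   # exposure at default

expected_loss = pd_ * lgd * ead  # EL = PD x LGD x EAD, per loan
portfolio_el = expected_loss.sum()
print(expected_loss)  # per-loan EL: 900, 5000, 3000
print(portfolio_el)   # ≈ 8900
```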
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV
import lightgbm as lgb

# PD model with probability calibration
pd_model = LogisticRegression(class_weight='balanced', max_iter=1000)
pd_calibrated = CalibratedClassifierCV(pd_model, cv=5)
pd_calibrated.fit(X_train, y_train)
pd_scores = pd_calibrated.predict_proba(X_test)[:, 1]

# Gradient Boosting scorecard
lgb_model = lgb.LGBMClassifier(
    n_estimators=500,
    learning_rate=0.05,
    num_leaves=31,
    class_weight='balanced'
)
lgb_model.fit(X_train, y_train, eval_set=[(X_val, y_val)])

# Survival Analysis for time-to-default modeling
from lifelines import CoxPHFitter
cph = CoxPHFitter()
cph.fit(df_survival, duration_col='time_to_default', event_col='defaulted')
cph.print_summary()

9. NLP for Finance

News Sentiment Analysis with FinBERT

from transformers import pipeline
import pandas as pd

# FinBERT: financial domain BERT
sentiment_pipeline = pipeline(
    "sentiment-analysis",
    model="ProsusAI/finbert"
)

def analyze_news_sentiment(headlines):
    results = sentiment_pipeline(headlines)
    return pd.DataFrame([
        {'headline': h, 'label': r['label'], 'score': r['score']}
        for h, r in zip(headlines, results)
    ])

headlines = [
    "Apple reports record quarterly earnings",
    "Federal Reserve signals rate hike ahead",
    "Tech sector faces regulatory headwinds"
]

df_sentiment = analyze_news_sentiment(headlines)
print(df_sentiment)

Parsing SEC EDGAR Filings

import requests
from bs4 import BeautifulSoup
import re

def fetch_10k_section(cik, accession_number):
    """Retrieve the Risk Factors section from an SEC 10-K filing.

    Note: the actual document filename varies per filing; in practice, look it
    up in the filing index rather than assuming a fixed name like '10k.htm'.
    """
    base_url = "https://www.sec.gov/Archives/edgar"
    url = f"{base_url}/{cik}/{accession_number}/10k.htm"
    # SEC requires a descriptive User-Agent with contact information
    response = requests.get(url, headers={'User-Agent': 'research@example.com'})
    soup = BeautifulSoup(response.text, 'html.parser')
    # `text=` is deprecated in modern BeautifulSoup; use `string=`
    risk_section = soup.find(string=re.compile("Risk Factors", re.IGNORECASE))
    return risk_section.parent.get_text() if risk_section else ""

ESG Scoring: LLMs are increasingly used to extract ESG (Environmental, Social, Governance) signals from sustainability reports, news articles, and earnings call transcripts. These signals feed directly into socially responsible investing (SRI) portfolios.


Key Considerations

Healthcare

  1. Regulatory approval: In the US, AI software used for clinical decision-making falls under FDA's Software as a Medical Device (SaMD) framework and requires pre-market approval or clearance.
  2. HIPAA compliance: Any system handling Protected Health Information (PHI) must implement de-identification, access control, and audit logging.
  3. Algorithmic bias: Models trained on data skewed toward specific demographics may perform poorly for underrepresented groups — a direct patient safety concern.
  4. Clinical validation: Strong in-silico performance does not guarantee real-world efficacy. Prospective clinical trials remain essential.

Finance

  1. Look-ahead bias: Strict temporal discipline in data splits is non-negotiable. Even small leakage can make a worthless strategy appear profitable.
  2. Overfitting: Financial data is noisy and non-stationary. Complex models often fail out-of-sample when market regimes shift.
  3. Regime change: Strategies that worked well historically can fail suddenly when market structure, volatility, or correlations change.
  4. Transaction costs: Backtests must model slippage, commission, and market impact. Ignoring these can make an unprofitable strategy look excellent.


Quiz

Q1. Why is look-ahead bias especially dangerous in financial backtesting?

Answer: It occurs when information unavailable at the time of the trading decision is used in model training or signal generation, producing artificially inflated backtest performance.

Explanation: For example, using end-of-day closing prices to generate same-day trading signals is look-ahead bias because you cannot trade at a price until it is known. Walk-forward validation and strict chronological splitting prevent this. Strategies built on leaked data may appear to generate significant alpha but will fail immediately in live trading, leading to substantial real financial losses.

Q2. Why is Dice Score preferred over accuracy for medical image segmentation?

Answer: Severe class imbalance between background and foreground pixels makes accuracy a misleading metric — a model that always predicts background achieves very high accuracy while being clinically useless.

Explanation: In a brain MRI where a tumor occupies 1% of voxels, always predicting "no tumor" achieves 99% accuracy. Dice Score is computed as 2 times TP divided by (2 times TP plus FP plus FN), which directly measures overlap between prediction and ground truth and correctly penalizes missing the small foreground class.

Q3. What is Lipinski's Rule of Five and why is it used in drug discovery?

Answer: An empirical guideline for filtering drug candidates likely to have acceptable oral bioavailability, based on four physicochemical properties: molecular weight under 500 Da, LogP under 5, hydrogen bond donors under 5, hydrogen bond acceptors under 10.

Explanation: Poor oral bioavailability is a major cause of late-stage clinical trial failure. Applying the Rule of Five early in the screening pipeline eliminates molecules that are unlikely to be absorbed, dramatically reducing the number of candidates that proceed to expensive in-vitro and in-vivo testing.

Q4. What is the key pitfall when applying SMOTE to fraud detection datasets?

Answer: SMOTE must be applied only to training data. Applying it to validation or test sets introduces synthetic samples that inflate performance metrics and do not reflect real-world distribution.

Explanation: Precision-Recall AUC and F1 Score are more appropriate evaluation metrics than overall accuracy for fraud detection because they explicitly measure performance on the minority (fraud) class. Even on the training set, SMOTE-generated samples may not capture the true distribution of fraud patterns, so the technique should be combined with robust out-of-sample evaluation.

Q5. Why was AlphaFold2 a breakthrough for drug discovery specifically?

Answer: AlphaFold2 predicts 3D protein structures from amino acid sequences at near-experimental accuracy, reducing structure determination from years of laboratory work to minutes of compute time.

Explanation: Structure-based drug design requires knowing the 3D shape of the target protein — especially binding pockets — to design molecules that fit precisely. Before AlphaFold2, structures had to be determined experimentally via X-ray crystallography or cryo-EM, which is slow, expensive, and not always successful. DeepMind has released over 200 million predicted structures in a public database, enabling virtual screening at unprecedented scale.