
Monday, March 10, 2025

Supercritical Fluid Extraction: Commercial Equipment Design

Supercritical CO₂ Extraction Vessel


Precision-engineered vessel for commercial-scale supercritical CO₂ extraction, rated for 100 kg/day throughput at 99.99% extract purity.




Technical Specifications

General

Purpose: supercritical CO₂ extraction

Type: cylindrical pressure vessel

Production Capacity: 100 kg/day throughput

Purity: 99.99%

Vessel Body

Outer Diameter: 400 mm

Height: 800 mm (incl. flange)

Wall Thickness: 15 mm

Material: Stainless Steel 316




Bottom Shape: hemispherical (dished)

Top Flange

Diameter: 400 mm

Thickness: 20 mm

Closure: 12 × M20 bolts

Seal: PTFE O-ring (5 mm thick)




Features: two lifting lugs

Internal Basket

Diameter: 350 mm

Height: 600 mm

Wall Thickness: 2 mm

Perforation: 1 mm holes, 5 mm spacing

Material: Stainless Steel 316
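The perforation figures fix how much of the basket wall is open to CO₂ flow. Assuming a square hole pitch (the spec gives the spacing but not the pattern), a 1 mm hole on a 5 mm pitch gives roughly 3% open area:

```python
import math

HOLE_D = 1.0  # hole diameter, mm (from spec)
PITCH = 5.0   # hole spacing, mm (square pitch assumed, not stated in spec)

# Open-area fraction = hole area / unit-cell area
open_fraction = (math.pi * (HOLE_D / 2) ** 2) / PITCH ** 2
print(f"Open-area fraction: {open_fraction:.1%}")  # ~3.1%
```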



Heating Jacket

Outer Diameter: 430 mm

Thickness: 5 mm

Material: Stainless Steel 316



Fittings: two 1/2" NPT ports

Coverage: 700 mm height

Ports & Fittings

Inlet Port: 1" NPT at bottom center

Outlet Port: 1" NPT on top flange

Relief Valve: 1/2" NPT, 450 bar rated

Nozzle ID: 10 mm



Performance Ratings

Pressure Rating: 450 bar

Operating Pressure: 200–400 bar

Temperature Rating: up to 100°C

Internal Volume: ~100 L
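As a sanity check, the quoted ~100 L is reproduced if the 800 mm is read as the cylindrical shell length (with the hemispherical bottom adding below it) and the inner diameter as 400 − 2 × 15 = 370 mm. That reading is an assumption, since the drawing itself isn't reproduced here:

```python
import math

OD = 0.400     # outer diameter, m (from spec)
WALL = 0.015   # wall thickness, m (from spec)
SHELL = 0.800  # cylindrical shell length, m (assumed reading of "height")

r_in = (OD - 2 * WALL) / 2              # inner radius = 0.185 m
v_cyl = math.pi * r_in ** 2 * SHELL     # cylindrical section
v_hemi = (2 / 3) * math.pi * r_in ** 3  # hemispherical (dished) bottom
v_litres = (v_cyl + v_hemi) * 1000

print(f"Estimated internal volume: {v_litres:.0f} L")  # ~99 L, matching the quoted ~100 L
```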




SC-CO₂ EXTRACTION VESSEL

    SPECIFICATIONS:

    - Outer Diameter: 400 mm

    - Height: 800 mm

    - Wall Thickness: 15 mm

    - Internal Volume: 100 L

    - Material: Stainless Steel 316

    - Pressure Rating: 450 bar

    - Temperature Rating: 100°C






Precision Engineering

Engineered to ASME standards with premium SS316 construction, ensuring reliable performance at pressures up to 450 bar and temperatures of 100°C.




Modular Design

Featuring a removable top flange and internal basket for easy access and cleaning. Designed for seamless integration with downstream filtration systems.




Commercial Scale

With 100L capacity and 100 kg/day throughput, this vessel is optimized for commercial production while maintaining 99.99% extract purity.
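To connect the 100 kg/day figure to the vessel geometry: assuming a hypothetical botanical feed bulk density of 0.3 kg/L (this depends entirely on the feedstock and is not part of the spec), the 350 mm × 600 mm basket holds roughly 17 kg per batch, so about six batch cycles per day would meet the target:

```python
import math

BASKET_D = 0.350    # basket diameter, m (from spec)
BASKET_H = 0.600    # basket height, m (from spec)
BULK_DENSITY = 0.3  # kg/L, hypothetical feed density (assumption)
TARGET = 100.0      # kg/day throughput (from spec)

basket_litres = math.pi * (BASKET_D / 2) ** 2 * BASKET_H * 1000
kg_per_batch = basket_litres * BULK_DENSITY
cycles_per_day = math.ceil(TARGET / kg_per_batch)

print(f"Basket volume: {basket_litres:.1f} L")   # ~57.7 L
print(f"Feed per batch: {kg_per_batch:.1f} kg")  # ~17.3 kg
print(f"Batch cycles/day: {cycles_per_day}")     # 6
```

The real cycle count also depends on extraction kinetics and turnaround time per batch, which the spec doesn't state.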




Sunday, March 9, 2025

MD ML GNN SVM RANDOM FOREST GRADIENT BOOSTING CODE OUTPUT AND PLOT OF CURCUMIN+TP53

 CODE:

import torch
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv
from torch_geometric.datasets import Planetoid
from torch.nn import Linear
import torch.nn.functional as F
from torch.optim import Adam
import optuna

# Example GNN model
class Net(torch.nn.Module):
    def __init__(self, num_node_features, num_classes, dropout=0.5):
        super(Net, self).__init__()
        self.conv1 = GCNConv(num_node_features, 16)
        self.conv2 = GCNConv(16, num_classes)
        self.dropout = dropout

    def forward(self, data):
        x, edge_index = data.x, data.edge_index

        x = self.conv1(x, edge_index)
        x = F.relu(x)
        x = F.dropout(x, p=self.dropout, training=self.training)
        x = self.conv2(x, edge_index)

        return F.log_softmax(x, dim=1)

# Function to train the model
def train(model, device, data, optimizer, criterion):
    model.train()
    optimizer.zero_grad()
    out = model(data)
    loss = criterion(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    optimizer.step()
    return loss.item()

# Function to evaluate the model
def evaluate(model, device, data, criterion):
    model.eval()
    _, pred = model(data).max(dim=1)
    correct = int(pred[data.test_mask].eq(data.y[data.test_mask]).sum().item())
    return correct / int(data.test_mask.sum())

# Optuna objective function
def objective(trial):
    # Hyperparameter tuning space (suggest_float replaces the deprecated
    # suggest_loguniform/suggest_uniform)
    learning_rate = trial.suggest_float('learning_rate', 1e-4, 1e-1, log=True)
    dropout = trial.suggest_float('dropout', 0.0, 0.5)

    # Load your dataset into a PyTorch Geometric Data object
    # For demonstration, we'll use a Planetoid dataset
    dataset = Planetoid(root='/tmp/Cora', name='Cora')
    data = dataset[0]

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    # Pass the sampled dropout into the model so the hyperparameter is actually used
    model = Net(dataset.num_features, dataset.num_classes, dropout=dropout).to(device)
    data = data.to(device)

    criterion = torch.nn.NLLLoss()
    optimizer = Adam(model.parameters(), lr=learning_rate)

    for epoch in range(100):
        loss = train(model, device, data, optimizer, criterion)

    accuracy = evaluate(model, device, data, criterion)

    return accuracy

# Perform Optuna tuning
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=50)

# Print the best parameters and the corresponding accuracy
print(f"Best parameters: {study.best_params}")
print(f"Best accuracy: {study.best_value}")

OUTPUT:

Best parameters: {'learning_rate': 0.0012625055929391317, 'dropout': 0.2924881312590514}
Best accuracy: 0.805

CODE:

# Assumes `objective_rfr`, `X`, and `y` are defined earlier in the session.
import optuna
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

# Create and run the study
study_rfr = optuna.create_study(direction='minimize')
study_rfr.optimize(objective_rfr, n_trials=50)

# Retrieve the best parameters
best_params_rfr = study_rfr.best_params

# Train the best model
rfr_best = RandomForestRegressor(**best_params_rfr, random_state=42)
rfr_best.fit(X, y)

# Make predictions and evaluate
y_pred_rfr = rfr_best.predict(X)
mse_rfr = mean_squared_error(y, y_pred_rfr)

# Output results
print(f"Optimized Random Forest MSE: {mse_rfr:.4f}")
print(f"Best Random Forest Parameters: {best_params_rfr}")

OUTPUT:

Optimized Random Forest MSE: 0.2499
Best Random Forest Parameters: {'n_estimators': 83, 'max_depth': 4, 'min_samples_split': 3, 'min_samples_leaf': 2, 'max_features': 'log2'}
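The study above calls an `objective_rfr` function that is not shown in the post. Based on the parameter names appearing in the output, a minimal sketch could look like the following, with a hypothetical synthetic dataset standing in for the real `X`, `y`:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Hypothetical stand-in data; the post's real X, y come from its own dataset.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))
y = X[:, 0] * 0.5 + rng.normal(scale=0.5, size=100)

def objective_rfr(trial):
    # Search space inferred from the parameter names in the output above;
    # the exact ranges used in the original run are not shown in the post.
    params = {
        'n_estimators': trial.suggest_int('n_estimators', 10, 200),
        'max_depth': trial.suggest_int('max_depth', 2, 16),
        'min_samples_split': trial.suggest_int('min_samples_split', 2, 10),
        'min_samples_leaf': trial.suggest_int('min_samples_leaf', 1, 5),
        'max_features': trial.suggest_categorical('max_features', ['sqrt', 'log2']),
    }
    model = RandomForestRegressor(**params, random_state=42)
    # cross_val_score returns negative MSE for this scorer, so negate it
    # to give Optuna a quantity to minimize.
    score = cross_val_score(model, X, y, cv=3,
                            scoring='neg_mean_squared_error').mean()
    return -score
```

Note that the original code evaluates the final model on the training data itself; cross-validation inside the objective, as sketched here, gives a less optimistic estimate.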

CODE:

import torch
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv
from torch_geometric.datasets import Planetoid
from torch.nn import Linear
import torch.nn.functional as F
from torch.optim import Adam
import optuna
import matplotlib.pyplot as plt

# Example GNN model
class Net(torch.nn.Module):
    def __init__(self, num_node_features, num_classes, dropout=0.5):
        super(Net, self).__init__()
        self.conv1 = GCNConv(num_node_features, 16)
        self.conv2 = GCNConv(16, num_classes)
        self.dropout = dropout

    def forward(self, data):
        x, edge_index = data.x, data.edge_index

        x = self.conv1(x, edge_index)
        x = F.relu(x)
        x = F.dropout(x, p=self.dropout, training=self.training)
        x = self.conv2(x, edge_index)

        return F.log_softmax(x, dim=1)

# Function to train the model
def train(model, device, data, optimizer, criterion):
    model.train()
    optimizer.zero_grad()
    out = model(data)
    loss = criterion(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    optimizer.step()
    return loss.item()

# Function to evaluate the model
def evaluate(model, device, data, criterion):
    model.eval()
    _, pred = model(data).max(dim=1)
    correct = int(pred[data.test_mask].eq(data.y[data.test_mask]).sum().item())
    return correct / int(data.test_mask.sum())

# Optuna objective function
def objective(trial):
    # Hyperparameter tuning space (suggest_float replaces the deprecated
    # suggest_loguniform/suggest_uniform)
    learning_rate = trial.suggest_float('learning_rate', 1e-4, 1e-1, log=True)
    dropout = trial.suggest_float('dropout', 0.0, 0.5)

    # Load your dataset into a PyTorch Geometric Data object
    # For demonstration, we'll use a Planetoid dataset
    dataset = Planetoid(root='/tmp/Cora', name='Cora')
    data = dataset[0]

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    # Pass the sampled dropout into the model so the hyperparameter is actually used
    model = Net(dataset.num_features, dataset.num_classes, dropout=dropout).to(device)
    data = data.to(device)

    criterion = torch.nn.NLLLoss()
    optimizer = Adam(model.parameters(), lr=learning_rate)

    for epoch in range(100):
        loss = train(model, device, data, optimizer, criterion)

    accuracy = evaluate(model, device, data, criterion)

    return accuracy

# Perform Optuna tuning
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=50)

# Print the best parameters and the corresponding accuracy
print(f"Best parameters: {study.best_params}")
print(f"Best accuracy: {study.best_value}")

# Plot accuracy distribution
accuracies = [trial.value for trial in study.trials]
plt.figure(figsize=(8, 6))
plt.hist(accuracies, bins=10, alpha=0.7, color='blue', edgecolor='black')
plt.title('Distribution of Accuracy Values')
plt.xlabel('Accuracy')
plt.ylabel('Frequency')
plt.show()

# Plot accuracy vs. learning rate and dropout
import numpy as np

learning_rates = np.array([trial.params['learning_rate'] for trial in study.trials])
dropouts = np.array([trial.params['dropout'] for trial in study.trials])

plt.figure(figsize=(10, 5))

plt.subplot(1, 2, 1)
plt.scatter(learning_rates, accuracies)
plt.title('Accuracy vs. Learning Rate')
plt.xlabel('Learning Rate')
plt.ylabel('Accuracy')

plt.subplot(1, 2, 2)
plt.scatter(dropouts, accuracies)
plt.title('Accuracy vs. Dropout')
plt.xlabel('Dropout')
plt.ylabel('Accuracy')

plt.tight_layout()
plt.show()

Thursday, March 6, 2025

GNN ML Model code with optuna tuning for 6 SMILES and plot code

 CODE:


import deepchem as dc
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from deepchem.feat import ConvMolFeaturizer
from deepchem.data import NumpyDataset
from deepchem.models import GraphConvModel
import tensorflow as tf
import logging
from typing import List, Optional, Union, Tuple
import optuna
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error, f1_score, accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Set up logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

# --- Monkey-Patching for Keras Compatibility ---
from deepchem.models.optimizers import Adam

def _create_tf_optimizer(self, global_step):
    try:
        if self.learning_rate is None:
            learning_rate = tf.keras.optimizers.schedules.ExponentialDecay(
                initial_learning_rate=self.initial_learning_rate,
                decay_steps=self.decay_steps,
                decay_rate=self.decay_rate,
                staircase=self.staircase
            )
        else:
            learning_rate = self.learning_rate
        return tf.keras.optimizers.Adam(
            learning_rate=learning_rate,
            beta_1=self.beta1,
            beta_2=self.beta2,
            epsilon=self.epsilon
        )
    except Exception as e:
        logger.error(f"Error creating optimizer: {e}")
        raise

Adam._create_tf_optimizer = _create_tf_optimizer

from deepchem.models.keras_model import KerasModel

def _create_inputs(self, example_inputs):
    try:
        self._ensure_built()
        keras_model = getattr(self.model, 'model', self.model)
        if hasattr(keras_model, 'inputs') and keras_model.inputs is not None:
            self._input_shapes = [t.shape for t in keras_model.inputs]
            self._input_dtypes = [t.dtype.name for t in keras_model.inputs]
        else:
            if isinstance(example_inputs, (list, tuple)):
                self._input_shapes = [np.shape(x) for x in example_inputs]
                self._input_dtypes = [x.dtype.name for x in example_inputs]
            else:
                self._input_shapes = [np.shape(example_inputs)]
                self._input_dtypes = [example_inputs.dtype.name]
        self._inputs_built = True
    except Exception as e:
        logger.error(f"Error in _create_inputs: {e}")
        raise

KerasModel._create_inputs = _create_inputs

# --- Enhanced Helper Function ---
def smiles_to_dataset(smiles_list: List[str], labels: Optional[Union[List, np.ndarray]] = None,
                     featurizer=ConvMolFeaturizer()) -> Tuple[NumpyDataset, Optional[np.ndarray]]:
    try:
        if not smiles_list or not all(isinstance(s, str) for s in smiles_list):
            raise ValueError("SMILES list must contain valid strings.")
        if labels is not None:
            if len(smiles_list) != len(labels):
                raise ValueError("SMILES and labels lists must have the same length.")
            labels = np.array(labels)

        mols = featurizer.featurize(smiles_list)
        valid_mols = []
        valid_labels = []

        for i, mol in enumerate(mols):
            if mol is not None and hasattr(mol, 'atom_features'):
                valid_mols.append(mol)
                if labels is not None:
                    valid_labels.append(labels[i])
            else:
                logger.warning(f"SMILES at index {i} failed to featurize: {smiles_list[i]}")

        if not valid_mols:
            raise ValueError("No valid SMILES strings were featurized.")

        if labels is not None:
            dataset = NumpyDataset(X=np.array(valid_mols, dtype=object), y=np.array(valid_labels))
            logger.info(f"Created dataset with {len(valid_mols)} valid molecules out of {len(smiles_list)}.")
            return dataset, np.array(valid_labels)
        else:
            dataset = NumpyDataset(X=np.array(valid_mols, dtype=object))
            logger.info(f"Created dataset with {len(valid_mols)} valid molecules out of {len(smiles_list)}.")
            return dataset, None
    except Exception as e:
        logger.error(f"Error in smiles_to_dataset: {e}")
        raise

# --- New SMILES List with 6 Molecules ---
smiles_list_6 = [
    "COc1cc(/C=C/C(=O)CC(=O)/C=C/c2ccc(c(c2)OC)O)ccc1O",
    "COC1=CC(\C=C\C(=O)CC(=O)\C=C\C2=CC=C(O)C(OC)=C2)=CC=C1O",
    "COC1=CC=C(\C=C\C(=O)CC(=O)\C=C\C2=CC=C(OC)C(OC)=C2)C=C1OC",
    "COC1=CC(CNC(=O)CCCC\C=C/C(C)C)=CC=C1O",
    "CCCCCCCCC(=O)NCC1=CC=C(O)C(OC)=C1",
    "CN(C)C1=CC2=C(C=C1)N=C3C=CC(=[N+](C)C)C=C3S2.[Cl-]"
]

train_smiles = smiles_list_6  # Use the 6 SMILES for training
train_class_labels = [1, 0, 1, 0, 1, 1]  # Example labels for 6 SMILES
train_reg_labels = [7.2, 6.9, 6.4, 6.3, 6.2, 6.1] # Example Regression labels

valid_smiles = smiles_list_6 # Use the same 6 SMILES for validation (for this example)
valid_class_labels = [1, 0, 1, 0, 1, 1] # Example validation labels
valid_reg_labels = [7.2, 6.9, 6.4, 6.3, 6.2, 6.1] # Example validation regression labels

test_smiles = smiles_list_6 # Use the same 6 SMILES for testing (for this example)
featurizer = ConvMolFeaturizer()

# Create Datasets
try:
    train_dataset_class, train_class_labels_filtered = smiles_to_dataset(train_smiles, train_class_labels, featurizer)
    train_dataset_reg, train_reg_labels_filtered = smiles_to_dataset(train_smiles, train_reg_labels, featurizer)
    valid_dataset_class, valid_class_labels_filtered = smiles_to_dataset(valid_smiles, valid_class_labels, featurizer)
    valid_dataset_reg, valid_reg_labels_filtered = smiles_to_dataset(valid_smiles, valid_reg_labels, featurizer)
    test_dataset, _ = smiles_to_dataset(test_smiles, None, featurizer)
except Exception as e:
    logger.error(f"Failed to create datasets: {e}")
    raise

# --- Classification Model (Unchanged) ---
def train_and_predict_class(train_dataset, valid_dataset, test_dataset):
    try:
        model = GraphConvModel(
            n_tasks=1,
            mode='classification',
            dropout=0.2,
            batch_normalize=False,
            model_dir='graphconv_model_classification_expanded',
            graph_conv_layers=[64, 64],
            dense_layer_size=128,
            batch_size=50
        )
        if hasattr(model.model, 'name'):
            model.model.name = 'graph_conv_classification_model_expanded'

        logger.info("Training classification model...")
        model.fit(train_dataset, nb_epoch=50)

        train_pred = model.predict(train_dataset)
        valid_pred = model.predict(valid_dataset)
        test_pred = model.predict(test_dataset)

        return model, train_pred, valid_pred, test_pred
    except Exception as e:
        logger.error(f"Error in classification training/prediction: {e}")
        raise

# --- Regression Model with Optuna Hyperparameter Tuning ---
def objective(trial):
    """Optuna objective function to maximize R^2 for regression."""
    try:
        # Define hyperparameter search space
        n_layers = trial.suggest_int('n_layers', 1, 3)  # Number of graph conv layers
        graph_conv_sizes = [trial.suggest_categorical('graph_conv_size_' + str(i), [32, 64, 128]) for i in range(n_layers)]
        dense_layer_size = trial.suggest_categorical('dense_layer_size', [64, 128, 256])
        dropout = trial.suggest_float('dropout', 0.0, 0.5)
        batch_size = trial.suggest_categorical('batch_size', [32, 50, 64])
        learning_rate = trial.suggest_float('learning_rate', 1e-5, 1e-3, log=True) # Added learning rate tuning

        # Create and train model
        model = GraphConvModel(
            n_tasks=1,
            mode='regression',
            dropout=dropout,
            batch_normalize=False,
            model_dir=f'graphconv_model_regression_trial_{trial.number}',
            graph_conv_layers=graph_conv_sizes,
            dense_layer_size=dense_layer_size,
            batch_size=batch_size,
            learning_rate=learning_rate # Set learning rate from trial
        )
        if hasattr(model.model, 'name'):
            model.model.name = f'graph_conv_regression_model_trial_{trial.number}'

        logger.info(f"Training regression model with trial {trial.number}...")
        model.fit(train_dataset_reg, nb_epoch=100, deterministic=False) # Increased epochs
       
        # Evaluate on validation set
        valid_pred = model.predict(valid_dataset_reg)
        r2 = r2_score(valid_reg_labels_filtered, valid_pred.flatten())

        return r2  # Maximize R^2
    except Exception as e:
        logger.error(f"Error in Optuna trial {trial.number}: {e}")
        return float('-inf')  # Return negative infinity for failed trials

def train_and_predict_reg_with_best_params(train_dataset, valid_dataset, test_dataset, best_params):
    """Train final regression model with best hyperparameters."""
    try:
        model = GraphConvModel(
            n_tasks=1,
            mode='regression',
            dropout=best_params['dropout'],
            batch_normalize=False,
            model_dir='graphconv_model_regression_expanded',
            graph_conv_layers=[best_params[f'graph_conv_size_{i}'] for i in range(best_params['n_layers'])],
            dense_layer_size=best_params['dense_layer_size'],
            batch_size=best_params['batch_size'],
            learning_rate=best_params['learning_rate'] # Use best learning rate
        )
        if hasattr(model.model, 'name'):
            model.model.name = 'graph_conv_regression_model_expanded'

        logger.info("Training final regression model with best parameters...")
        model.fit(train_dataset_reg, nb_epoch=100) # Increased epochs

        train_pred = model.predict(train_dataset)
        valid_pred = model.predict(valid_dataset)
        test_pred = model.predict(test_dataset)

        return model, train_pred, valid_pred, test_pred
    except Exception as e:
        logger.error(f"Error in regression training/prediction with best params: {e}")
        raise

# --- Evaluation Functions ---
def evaluate_classification(true_labels, pred_probs):
    try:
        pred_labels = np.argmax(pred_probs, axis=2).flatten()
        accuracy = accuracy_score(true_labels, pred_labels)
        precision = precision_score(true_labels, pred_labels, zero_division=0)
        recall = recall_score(true_labels, pred_labels, zero_division=0)
        f1 = f1_score(true_labels, pred_labels, zero_division=0)
        return accuracy, precision, recall, f1
    except Exception as e:
        logger.error(f"Error in classification evaluation: {e}")
        raise

def evaluate_regression(true_labels, pred_values):
    try:
        mae = mean_absolute_error(true_labels, pred_values.flatten())
        mse = mean_squared_error(true_labels, pred_values.flatten())
        r2 = r2_score(true_labels, pred_values.flatten())
        return mae, mse, r2
    except Exception as e:
        logger.error(f"Error in regression evaluation: {e}")
        raise

# --- Main Execution ---
def main():
    # Classification (unchanged)
    class_model, train_class_pred, valid_class_pred, test_class_pred = train_and_predict_class(
        train_dataset_class, valid_dataset_class, test_dataset
    )

    # Regression with Optuna tuning
    study = optuna.create_study(direction='maximize')
    logger.info("Starting Optuna hyperparameter optimization for regression...")
    study.optimize(objective, n_trials=50)  # Increased trials to 50

    logger.info(f"Best trial: {study.best_trial.number}")
    logger.info(f"Best R^2: {study.best_value}")
    logger.info(f"Best parameters: {study.best_params}")

    # Train final regression model with best parameters
    reg_model, train_reg_pred, valid_reg_pred, test_reg_pred = train_and_predict_reg_with_best_params(
        train_dataset_reg, valid_dataset_reg, test_dataset, study.best_params
    )

    # Print Predictions
    print("Training Classification Predictions (Probabilities):", train_class_pred)
    print("Validation Classification Predictions (Probabilities):", valid_class_pred)
    print("Test Classification Predictions (Probabilities):", test_class_pred)
    print("Training Regression Predictions:", train_reg_pred)
    print("Validation Regression Predictions:", valid_reg_pred)
    print("Test Regression Predictions:", test_reg_pred)

    # Evaluate Performance
    train_class_acc, train_class_prec, train_class_rec, train_class_f1 = evaluate_classification(train_class_labels_filtered, train_class_pred)
    valid_class_acc, valid_class_prec, valid_class_rec, valid_class_f1 = evaluate_classification(valid_class_labels_filtered, valid_class_pred)
    train_reg_mae, train_reg_mse, train_reg_r2 = evaluate_regression(train_reg_labels_filtered, train_reg_pred)
    valid_reg_mae, valid_reg_mse, valid_reg_r2 = evaluate_regression(valid_reg_labels_filtered, valid_reg_pred)

    print(f"--- Classification Metrics ---")
    print(f"Training Accuracy: {train_class_acc:.4f}, Precision: {train_class_prec:.4f}, Recall: {train_class_rec:.4f}, F1 Score: {train_class_f1:.4f}")
    print(f"Validation Accuracy: {valid_class_acc:.4f}, Precision: {valid_class_prec:.4f}, Recall: {valid_class_rec:.4f}, F1 Score: {valid_class_f1:.4f}")
    print(f"--- Regression Metrics ---")
    print(f"Training MAE: {train_reg_mae:.4f}, MSE: {train_reg_mse:.4f}, R^2: {train_reg_r2:.4f}")
    print(f"Validation MAE: {valid_reg_mae:.4f}, MSE: {valid_reg_mse:.4f}, R^2: {valid_reg_r2:.4f}")

if __name__ == "__main__":
    try:
        main()
    except Exception as e:
        logger.error(f"Main execution failed: {e}")
        raise


OUTPUT:

Training Classification Predictions (Probabilities): [[[0.28529257 0.71470743]] [[0.28529257 0.71470743]] [[0.26084998 0.73915005]] [[0.8086615 0.19133848]] [[0.12937962 0.8706204 ]] [[0.08971384 0.9102861 ]]]
Validation Classification Predictions (Probabilities): [[[0.28529257 0.71470743]] [[0.28529257 0.71470743]] [[0.26084998 0.73915005]] [[0.8086615 0.19133848]] [[0.12937962 0.8706204 ]] [[0.08971384 0.9102861 ]]]
Test Classification Predictions (Probabilities): [[[0.28529257 0.71470743]] [[0.28529257 0.71470743]] [[0.26084998 0.73915005]] [[0.8086615 0.19133848]] [[0.12937962 0.8706204 ]] [[0.08971384 0.9102861 ]]]
Training Regression Predictions: [[6.8237925] [6.823792 ] [6.7207255] [6.3307014] [6.3301277] [6.082355 ]]
Validation Regression Predictions: [[6.8237925] [6.823792 ] [6.7207255] [6.3307014] [6.3301277] [6.082355 ]]
Test Regression Predictions: [[6.8237925] [6.823792 ] [6.7207255] [6.3307014] [6.3301277] [6.082355 ]]
--- Classification Metrics ---
Training Accuracy: 0.8333, Precision: 0.8000, Recall: 1.0000, F1 Score: 0.8889
Validation Accuracy: 0.8333, Precision: 0.8000, Recall: 1.0000, F1 Score: 0.8889
--- Regression Metrics ---
Training MAE: 0.1586, MSE: 0.0447, R^2: 0.7170
Validation MAE: 0.1586, MSE: 0.0447, R^2: 0.7170

PLOT CODE:

import matplotlib.pyplot as plt

# --- Modified Training Functions to Track Metrics per Epoch ---
def train_and_predict_class_with_tracking(train_dataset, valid_dataset, test_dataset):
    try:
        model = GraphConvModel(
            n_tasks=1,
            mode='classification',
            dropout=0.2,
            batch_normalize=False,
            model_dir='graphconv_model_classification_expanded',
            graph_conv_layers=[64, 64],
            dense_layer_size=128,
            batch_size=50
        )
        if hasattr(model.model, 'name'):
            model.model.name = 'graph_conv_classification_model_expanded'

        train_accs = []
        valid_accs = []
        train_precs = []
        valid_precs = []
        train_recs = []
        valid_recs = []
        train_f1s = []
        valid_f1s = []

        for epoch in range(50):
            model.fit(train_dataset, nb_epoch=1)
            train_pred = model.predict(train_dataset)
            valid_pred = model.predict(valid_dataset)
            train_labels = train_class_labels_filtered
            valid_labels = valid_class_labels_filtered
            train_acc, train_prec, train_rec, train_f1 = evaluate_classification(train_labels, train_pred)
            valid_acc, valid_prec, valid_rec, valid_f1 = evaluate_classification(valid_labels, valid_pred)
            train_accs.append(train_acc)
            valid_accs.append(valid_acc)
            train_precs.append(train_prec)
            valid_precs.append(valid_prec)
            train_recs.append(train_rec)
            valid_recs.append(valid_rec)
            train_f1s.append(train_f1)
            valid_f1s.append(valid_f1)
            logger.info(f"Epoch {epoch+1}, Training Accuracy: {train_acc:.4f}, Validation Accuracy: {valid_acc:.4f}")

        test_pred = model.predict(test_dataset)
        return (model, train_pred, valid_pred, test_pred, train_accs, valid_accs,
                train_precs, valid_precs, train_recs, valid_recs, train_f1s, valid_f1s)
    except Exception as e:
        logger.error(f"Error in classification training/prediction with tracking: {e}")
        raise

def train_and_predict_reg_with_tracking(train_dataset, valid_dataset, test_dataset, best_params):
    try:
        model = GraphConvModel(
            n_tasks=1,
            mode='regression',
            dropout=best_params['dropout'],
            batch_normalize=False,
            model_dir='graphconv_model_regression_expanded',
            graph_conv_layers=[best_params[f'graph_conv_size_{i}'] for i in range(best_params['n_layers'])],
            dense_layer_size=best_params['dense_layer_size'],
            batch_size=best_params['batch_size'],
            learning_rate=best_params['learning_rate']
        )
        if hasattr(model.model, 'name'):
            model.model.name = 'graph_conv_regression_model_expanded'

        train_maes = []
        valid_maes = []
        train_mses = []
        valid_mses = []
        train_r2s = []
        valid_r2s = []

        for epoch in range(100):
            model.fit(train_dataset_reg, nb_epoch=1)
            train_pred = model.predict(train_dataset_reg)
            valid_pred = model.predict(valid_dataset_reg)
            train_labels = train_reg_labels_filtered
            valid_labels = valid_reg_labels_filtered
            train_mae, train_mse, train_r2 = evaluate_regression(train_labels, train_pred)
            valid_mae, valid_mse, valid_r2 = evaluate_regression(valid_labels, valid_pred)
            train_maes.append(train_mae)
            valid_maes.append(valid_mae)
            train_mses.append(train_mse)
            valid_mses.append(valid_mse)
            train_r2s.append(train_r2)
            valid_r2s.append(valid_r2)
            logger.info(f"Epoch {epoch+1}, Training MAE: {train_mae:.4f}, Validation MAE: {valid_mae:.4f}")

        test_pred = model.predict(test_dataset)
        return (model, train_pred, valid_pred, test_pred, train_maes, valid_maes,
                train_mses, valid_mses, train_r2s, valid_r2s)
    except Exception as e:
        logger.error(f"Error in regression training/prediction with tracking: {e}")
        raise

# --- Plotting Functions ---
def plot_classification_metrics(train_accs, valid_accs, train_precs, valid_precs,
                                train_recs, valid_recs, train_f1s, valid_f1s):
    epochs = range(len(train_accs))
    plt.figure(figsize=(10, 6))

    plt.subplot(2, 2, 1)
    plt.plot(epochs, train_accs, label='Training')
    plt.plot(epochs, valid_accs, label='Validation')
    plt.title('Accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend()

    plt.subplot(2, 2, 2)
    plt.plot(epochs, train_precs, label='Training')
    plt.plot(epochs, valid_precs, label='Validation')
    plt.title('Precision')
    plt.xlabel('Epoch')
    plt.ylabel('Precision')
    plt.legend()

    plt.subplot(2, 2, 3)
    plt.plot(epochs, train_recs, label='Training')
    plt.plot(epochs, valid_recs, label='Validation')
    plt.title('Recall')
    plt.xlabel('Epoch')
    plt.ylabel('Recall')
    plt.legend()

    plt.subplot(2, 2, 4)
    plt.plot(epochs, train_f1s, label='Training')
    plt.plot(epochs, valid_f1s, label='Validation')
    plt.title('F1 Score')
    plt.xlabel('Epoch')
    plt.ylabel('F1 Score')
    plt.legend()

    plt.tight_layout()
    plt.show()

def plot_regression_metrics(train_maes, valid_maes, train_mses, valid_mses, train_r2s, valid_r2s):
    epochs = range(len(train_maes))
    plt.figure(figsize=(10, 6))

    plt.subplot(1, 3, 1)
    plt.plot(epochs, train_maes, label='Training')
    plt.plot(epochs, valid_maes, label='Validation')
    plt.title('MAE')
    plt.xlabel('Epoch')
    plt.ylabel('MAE')
    plt.legend()

    plt.subplot(1, 3, 2)
    plt.plot(epochs, train_mses, label='Training')
    plt.plot(epochs, valid_mses, label='Validation')
    plt.title('MSE')
    plt.xlabel('Epoch')
    plt.ylabel('MSE')
    plt.legend()

    plt.subplot(1, 3, 3)
    plt.plot(epochs, train_r2s, label='Training')
    plt.plot(epochs, valid_r2s, label='Validation')
    plt.title('R^2')
    plt.xlabel('Epoch')
    plt.ylabel('R^2')
    plt.legend()

    plt.tight_layout()
    plt.show()

# --- Main Execution with Plotting ---
def main():
    # Classification with tracking
    (class_model, train_class_pred, valid_class_pred, test_class_pred,
     train_accs, valid_accs, train_precs, valid_precs,
     train_recs, valid_recs, train_f1s, valid_f1s) = train_and_predict_class_with_tracking(
        train_dataset_class, valid_dataset_class, test_dataset
    )
    plot_classification_metrics(train_accs, valid_accs, train_precs, valid_precs,
                                train_recs, valid_recs, train_f1s, valid_f1s)

    # Regression with Optuna tuning
    study = optuna.create_study(direction='maximize')
    logger.info("Starting Optuna hyperparameter optimization for regression...")
    study.optimize(objective, n_trials=50)

    logger.info(f"Best trial: {study.best_trial.number}")
    logger.info(f"Best R^2: {study.best_value}")
    logger.info(f"Best parameters: {study.best_params}")

    # Train final regression model with best parameters and tracking
    # (fixed: this must call the tracking variant, which returns the
    # per-epoch metric lists unpacked below)
    (reg_model, train_reg_pred, valid_reg_pred, test_reg_pred,
     train_maes, valid_maes, train_mses, valid_mses,
     train_r2s, valid_r2s) = train_and_predict_reg_with_tracking(
        train_dataset_reg, valid_dataset_reg, test_dataset, study.best_params
    )
    plot_regression_metrics(train_maes, valid_maes, train_mses, valid_mses, train_r2s, valid_r2s)

if __name__ == "__main__":
    try:
        main()
    except Exception as e:
        logger.error(f"Main execution failed: {e}")
        raise


OUTPUT:

[I 2025-03-07 06:47:04,654] A new study created in memory with name: no-name-73cbdaae-39b4-45f2-b753-e042409d3b2d
[I 2025-03-07 06:47:19,086] Trial 0 finished with value: -153.81992792458863 and parameters: {'n_layers': 1, 'graph_conv_size_0': 64, 'dense_layer_size': 256, 'dropout': 0.11513263961316111, 'batch_size': 50, 'learning_rate': 1.8455220002784957e-05}. Best is trial 0 with value: -153.81992792458863.
[I 2025-03-07 06:47:45,907] Trial 1 finished with value: -182.77093596848542 and parameters: {'n_layers': 2, 'graph_conv_size_0': 128, 'graph_conv_size_1': 32, 'dense_layer_size': 64, 'dropout': 0.08154584358859285, 'batch_size': 50, 'learning_rate': 2.4385469647123442e-05}. Best is trial 0 with value: -153.81992792458863.
[I 2025-03-07 06:48:23,072] Trial 2 finished with value: -2.469782597768003 and parameters: {'n_layers': 3, 'graph_conv_size_0': 128, 'graph_conv_size_1': 32, 'graph_conv_size_2': 64, 'dense_layer_size': 256, 'dropout': 0.1926180783149567, 'batch_size': 32, 'learning_rate': 0.00017760931747335488}. Best is trial 2 with value: -2.469782597768003.
[I 2025-03-07 06:48:53,067] Trial 3 finished with value: -194.5507483952223 and parameters: {'n_layers': 2, 'graph_conv_size_0': 32, 'graph_conv_size_1': 32, 'dense_layer_size': 64, 'dropout': 0.3924108370327444, 'batch_size': 64, 'learning_rate': 9.81321901415855e-05}. Best is trial 2 with value: -2.469782597768003.
[I 2025-03-07 06:49:05,999] Trial 4 finished with value: -8.912298833058061 and parameters: {'n_layers': 1, 'graph_conv_size_0': 64, 'dense_layer_size': 64, 'dropout': 0.43354894424973206, 'batch_size': 50, 'learning_rate': 0.0003411539510545556}. Best is trial 2 with value: -2.469782597768003.
[I 2025-03-07 06:49:31,402] Trial 5 finished with value: -233.68661251813694 and parameters: {'n_layers': 2, 'graph_conv_size_0': 128, 'graph_conv_size_1': 128, 'dense_layer_size': 64, 'dropout': 0.4042039121904243, 'batch_size': 32, 'learning_rate': 1.0169653442456986e-05}. Best is trial 2 with value: -2.469782597768003.
[I 2025-03-07 06:50:50,623] Trial 6 finished with value: -70.68896339829045 and parameters: {'n_layers': 3, 'graph_conv_size_0': 128, 'graph_conv_size_1': 32, 'graph_conv_size_2': 64, 'dense_layer_size': 128, 'dropout': 0.12638973600493092, 'batch_size': 32, 'learning_rate': 4.026687778956459e-05}. Best is trial 2 with value: -2.469782597768003.
[I 2025-03-07 06:51:34,138] Trial 7 finished with value: 0.5274826783102737 and parameters: {'n_layers': 3, 'graph_conv_size_0': 64, 'graph_conv_size_1': 128, 'graph_conv_size_2': 128, 'dense_layer_size': 64, 'dropout': 0.03744853181909019, 'batch_size': 32, 'learning_rate': 0.0006664021079484685}. Best is trial 7 with value: 0.5274826783102737.
[I 2025-03-07 06:52:07,778] Trial 8 finished with value: 0.5306667168085206 and parameters: {'n_layers': 2, 'graph_conv_size_0': 32, 'graph_conv_size_1': 64, 'dense_layer_size': 256, 'dropout': 0.2315928108944607, 'batch_size': 50, 'learning_rate': 0.0002344115161862476}. Best is trial 8 with value: 0.5306667168085206.
[I 2025-03-07 06:52:21,515] Trial 9 finished with value: -3.4516334688646193 and parameters: {'n_layers': 1, 'graph_conv_size_0': 64, 'dense_layer_size': 128, 'dropout': 0.36012057062341474, 'batch_size': 64, 'learning_rate': 0.0005291324813507126}. Best is trial 8 with value: 0.5306667168085206.
[I 2025-03-07 06:52:52,140] Trial 10 finished with value: -10.34967438003083 and parameters: {'n_layers': 2, 'graph_conv_size_0': 32, 'graph_conv_size_1': 64, 'dense_layer_size': 256, 'dropout': 0.2775789183282005, 'batch_size': 50, 'learning_rate': 8.924539843417526e-05}. Best is trial 8 with value: 0.5306667168085206.
[I 2025-03-07 06:54:29,434] Trial 11 finished with value: 0.7997784873626563 and parameters: {'n_layers': 3, 'graph_conv_size_0': 32, 'graph_conv_size_1': 128, 'graph_conv_size_2': 128, 'dense_layer_size': 256, 'dropout': 0.011976190232631589, 'batch_size': 32, 'learning_rate': 0.0008513922579837786}. Best is trial 11 with value: 0.7997784873626563.
[I 2025-03-07 06:55:05,255] Trial 12 finished with value: -5.4915967449115906 and parameters: {'n_layers': 3, 'graph_conv_size_0': 32, 'graph_conv_size_1': 64, 'graph_conv_size_2': 32, 'dense_layer_size': 256, 'dropout': 0.2645556350228477, 'batch_size': 32, 'learning_rate': 0.0002420573272309627}. Best is trial 11 with value: 0.7997784873626563.
[I 2025-03-07 06:55:31,157] Trial 13 finished with value: -0.6783974166354652 and parameters: {'n_layers': 2, 'graph_conv_size_0': 32, 'graph_conv_size_1': 64, 'dense_layer_size': 256, 'dropout': 0.19429802403708668, 'batch_size': 50, 'learning_rate': 0.000970734497980461}. Best is trial 11 with value: 0.7997784873626563.



