Skip to content

Configuration

This page documents the configuration schema consumed by SkyulfPipeline and FeatureEngineer.

Pipeline config

SkyulfPipeline expects:

{
  "preprocessing": [ ... ],
  "modeling": { ... }
}

Preprocessing config

The preprocessing list is executed in order.

Each step is:

{
  "name": "step_name",
  "transformer": "TransformerType",
  "params": { ... }
}

TransformerType is a string key resolved via the NodeRegistry. For the full list and per-node parameters, see:

  • Reference → Preprocessing Nodes
  • Reference → API → Preprocessing → pipeline

Minimal examples

# Split to avoid leakage
{"name": "split", "transformer": "TrainTestSplitter", "params": {"test_size": 0.2, "random_state": 42, "target_column": "target"}}
# Impute missing numeric values
{"name": "impute", "transformer": "SimpleImputer", "params": {"strategy": "mean", "columns": ["age"]}}
# Encode categoricals
{"name": "encode", "transformer": "OneHotEncoder", "params": {"columns": ["city"], "drop_original": True, "handle_unknown": "ignore"}}
# Scale numeric columns
{"name": "scale", "transformer": "StandardScaler", "params": {"auto_detect": True}}

Modeling config

SkyulfPipeline supports the following model types via the NodeRegistry.

Classification (9 models)

Key Algorithm
logistic_regression Logistic Regression
random_forest_classifier Random Forest Classifier
svc Support Vector Classifier
k_neighbors_classifier K-Nearest Neighbors Classifier
decision_tree_classifier Decision Tree Classifier
gradient_boosting_classifier Gradient Boosting Classifier
adaboost_classifier AdaBoost Classifier
xgboost_classifier XGBoost Classifier (requires skyulf-core[modeling-xgboost])
gaussian_nb Gaussian Naive Bayes

Regression (11 models)

Key Algorithm
linear_regression Linear Regression
ridge_regression Ridge Regression
lasso_regression Lasso Regression
elasticnet_regression ElasticNet Regression
random_forest_regressor Random Forest Regressor
svr Support Vector Regressor
k_neighbors_regressor K-Nearest Neighbors Regressor
decision_tree_regressor Decision Tree Regressor
gradient_boosting_regressor Gradient Boosting Regressor
adaboost_regressor AdaBoost Regressor
xgboost_regressor XGBoost Regressor (requires skyulf-core[modeling-xgboost])

Meta

Key Purpose
hyperparameter_tuner Wraps any model above with grid, random, Optuna, or halving search

Example:

{
  "type": "random_forest_classifier",
  "node_id": "model_node",
  "params": {
    "n_estimators": 200,
    "max_depth": 10
  }
}

Tuner example:

{
  "type": "hyperparameter_tuner",
  "base_model": {"type": "logistic_regression"},
  "strategy": "optuna",
  "search_space": {"C": [0.1, 1.0, 10.0]},
  "n_trials": 25,
  "metric": "accuracy"
}

See "Modeling Nodes" in Reference for details.