Getting Started
The fastest path from zero to a working Skyulf pipeline.
1. Install
pip install skyulf-core
Or install with all optional extras:
pip install "skyulf-core[viz,eda,tuning,modeling-xgboost,preprocessing-imbalanced]"
For editable installs and Docker, see Installation.
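To confirm the install landed in the environment you expect, a quick check with the standard library is usually enough. This is a minimal sketch: `installed_version` is a hypothetical helper, not part of Skyulf, and it only assumes the distribution name used in the `pip install` command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if skyulf-core is installed, otherwise None.
print(installed_version("skyulf-core"))
```

If this prints None, the package was most likely installed into a different interpreter or virtual environment than the one running the script.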
2. Minimal example
import pandas as pd
from skyulf.pipeline import SkyulfPipeline

df = pd.DataFrame({
    "age": [10, 20, None, 40, 50, 60, None, 80],
    "city": ["A", "B", "A", "C", "B", "A", "C", "B"],
    "target": [0, 1, 0, 1, 1, 0, 1, 0],
})

config = {
    "preprocessing": [
        {"name": "split", "transformer": "TrainTestSplitter",
         "params": {"test_size": 0.25, "random_state": 42,
                    "stratify": True, "target_column": "target"}},
        {"name": "impute", "transformer": "SimpleImputer",
         "params": {"columns": ["age"], "strategy": "mean"}},
        {"name": "encode", "transformer": "OneHotEncoder",
         "params": {"columns": ["city"], "drop_original": True}},
    ],
    "modeling": {
        "type": "random_forest_classifier",
        "params": {"n_estimators": 50, "random_state": 42},
    },
}

pipeline = SkyulfPipeline(config)
metrics = pipeline.fit(df, target_column="target")
print(metrics)

preds = pipeline.predict(df.drop(columns=["target"]))
print(preds.head())
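Because the config is a plain dictionary, a typo in a key is easy to miss. The sketch below checks a config against the shape used in the example above. `check_config` and `REQUIRED_STEP_KEYS` are hypothetical names, and the required keys are inferred from this page's example only, not from Skyulf's actual schema, so treat it as a pre-flight sanity check rather than real validation.

```python
# Keys every preprocessing step carries in the example on this page (an assumption).
REQUIRED_STEP_KEYS = {"name", "transformer", "params"}


def check_config(config):
    """Return a list of problems found; an empty list means the shape looks right."""
    problems = []
    for i, step in enumerate(config.get("preprocessing", [])):
        missing = REQUIRED_STEP_KEYS - step.keys()
        if missing:
            problems.append(f"step {i}: missing keys {sorted(missing)}")
    if "type" not in config.get("modeling", {}):
        problems.append("modeling: missing 'type'")
    return problems


good = {
    "preprocessing": [
        {"name": "impute", "transformer": "SimpleImputer",
         "params": {"columns": ["age"], "strategy": "mean"}},
    ],
    "modeling": {"type": "random_forest_classifier", "params": {}},
}
bad = {
    "preprocessing": [{"name": "impute", "transformer": "SimpleImputer"}],
    "modeling": {},
}

print(check_config(good))
print(check_config(bad))
```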
3. What just happened?
- TrainTestSplitter separated data into train/test sets (no leakage).
- SimpleImputer learned the mean of age from training data only.
- OneHotEncoder created dummy columns for city.
- RandomForestClassifier trained on the processed training split.
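To make "from training data only" concrete, the preprocessing steps above can be sketched in plain pandas. This is an illustration of the idea, not Skyulf's implementation: the hold-out rows are hand-picked (one per class) instead of sampled so the numbers are reproducible, and `transform` is a hypothetical helper.

```python
import pandas as pd

df = pd.DataFrame({
    "age": [10, 20, None, 40, 50, 60, None, 80],
    "city": ["A", "B", "A", "C", "B", "A", "C", "B"],
    "target": [0, 1, 0, 1, 1, 0, 1, 0],
})

# Hand-picked hold-out rows, one per class; a real splitter would sample these.
test_idx = [6, 7]
train = df.drop(index=test_idx)
test = df.loc[test_idx]

# "Fit": learn statistics from the training rows only.
age_mean = train["age"].mean()
city_levels = sorted(train["city"].unique())


def transform(frame):
    """Apply the statistics learned on the training rows to any frame."""
    out = frame.copy()
    out["age"] = out["age"].fillna(age_mean)
    dummies = pd.get_dummies(out["city"], prefix="city", dtype=int)
    # Align to the training categories so train and test share the same columns.
    expected = [f"city_{level}" for level in city_levels]
    dummies = dummies.reindex(columns=expected, fill_value=0)
    return pd.concat([out.drop(columns=["city"]), dummies], axis=1)


train_X = transform(train)
test_X = transform(test)

print(age_mean)          # mean of the training ages only
print(df["age"].mean())  # mean over all rows, which would leak test information
```

The two printed means differ because the held-out row with age 80 never influences the value used for imputation. The same value then fills the missing age in the test split.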
4. Next steps
| Goal | Page |
|---|---|
| Full train / evaluate / save / load workflow | Pipeline Quickstart |
| All 20 supported models and config keys | Configuration |
| Hyperparameter tuning (grid, random, Optuna) | Hyperparameter Tuning |
| Add your own custom nodes | Extending Skyulf-Core |
| Full platform setup (backend + UI) | Architecture |