Extending Skyulf-Core
Skyulf-core is intentionally simple: calculators learn parameters, appliers apply them.
Add a new preprocessing node
- Create a new module in
skyulf.preprocessing. - Implement a
Calculatorand anApplier. - Register the node type string in the
FeatureEngineerdispatcher.
Example skeleton
from typing import Any, Dict, Tuple, Union
import pandas as pd
from skyulf.preprocessing.base import BaseApplier, BaseCalculator
from skyulf.utils import pack_pipeline_output, unpack_pipeline_input
class MyNodeCalculator(BaseCalculator):
def fit(
self,
df: Union[pd.DataFrame, Tuple[pd.DataFrame, pd.Series]],
config: Dict[str, Any],
) -> Dict[str, Any]:
X, _, _ = unpack_pipeline_input(df)
# Learn something from X...
return {"type": "my_node", "columns": config.get("columns", [])}
class MyNodeApplier(BaseApplier):
def apply(
self,
df: Union[pd.DataFrame, Tuple[pd.DataFrame, pd.Series]],
params: Dict[str, Any],
) -> Union[pd.DataFrame, Tuple[pd.DataFrame, pd.Series]]:
X, y, is_tuple = unpack_pipeline_input(df)
# Apply transformation...
return pack_pipeline_output(X, y, is_tuple)
Add a new modeling estimator
- Implement a new
BaseModelCalculatorandBaseModelApplier(or subclassSklearnCalculator/SklearnApplier). - Add a mapping entry in
SkyulfPipeline._init_model_estimator().
Testing guidance
Prefer integration tests that run:
Calculator.fit → Applier.apply on a small synthetic dataframe.