cross_validation

expanding_window_split(test_size, n_splits=5, step_size=1, eager=False)

Return train/test splits using expanding window splitter.

Split time series repeatedly into a growing training set and a fixed-size test set. For example, given test_size = 3, n_splits = 5 and step_size = 1, the training (o) and test (x) folds can be visualized as:

| o o o x x x - - - - |
| o o o o x x x - - - |
| o o o o o x x x - - |
| o o o o o o x x x - |
| o o o o o o o x x x |

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| test_size | int | Number of test samples for each split. | required |
| n_splits | int | Number of splits. | 5 |
| step_size | int | Step size between windows. | 1 |
| eager | bool | If True return DataFrames. Otherwise, return LazyFrames. | False |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| splitter | Callable[[LazyFrame], Mapping[int, Tuple[LazyFrame, LazyFrame]]] | Function that takes a panel LazyFrame and returns a Dict of (train, test) splits, where the key represents the split number (1,2,...,n_splits) and the value is a tuple of LazyFrames. |
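
The expanding-window fold layout described for this function can be reproduced with plain integer positions. The following is a minimal sketch, not the library's implementation: the helper names `expanding_window_indices` and `render_fold` are hypothetical, the code works on positions of a single series of length `n_samples` rather than on a panel LazyFrame, and it assumes the last fold ends at the final observation, as the diagram suggests.

```python
from typing import Dict, List, Tuple


def expanding_window_indices(
    n_samples: int, test_size: int, n_splits: int = 5, step_size: int = 1
) -> Dict[int, Tuple[List[int], List[int]]]:
    """Hypothetical helper: positional (train, test) indices per split."""
    splits = {}
    for i in range(n_splits):
        # Assumption: the last split's test window ends at the final sample,
        # and each earlier split is shifted back by step_size.
        test_end = n_samples - (n_splits - 1 - i) * step_size
        test_start = test_end - test_size
        if test_start <= 0:
            raise ValueError("Series too short for the requested splits.")
        train = list(range(0, test_start))  # grows with every split
        test = list(range(test_start, test_end))  # always test_size long
        splits[i + 1] = (train, test)  # keys run from 1 to n_splits
    return splits


def render_fold(n_samples: int, train: List[int], test: List[int]) -> str:
    """Draw one fold in the o / x / - notation used in the diagram."""
    train_set, test_set = set(train), set(test)
    marks = [
        "o" if i in train_set else "x" if i in test_set else "-"
        for i in range(n_samples)
    ]
    return "| " + " ".join(marks) + " |"


splits = expanding_window_indices(n_samples=10, test_size=3)
for train, test in splits.values():
    print(render_fold(10, train, test))
```

With `n_samples=10` the printed rows match the five folds drawn in the diagram: the training block grows by `step_size` each time while the test block keeps its length.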

sliding_window_split(test_size, n_splits=5, step_size=1, window_size=10, eager=False)

Return train/test splits using sliding window splitter.

Split time series repeatedly into a fixed-length training set and a fixed-size test set. For example, given test_size = 3, n_splits = 5, step_size = 1 and window_size = 5, the training (o) and test (x) folds can be visualized as:

| o o o o o x x x - - - - |
| - o o o o o x x x - - - |
| - - o o o o o x x x - - |
| - - - o o o o o x x x - |
| - - - - o o o o o x x x |

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| test_size | int | Number of test samples for each split. | required |
| n_splits | int | Number of splits. | 5 |
| step_size | int | Step size between windows. | 1 |
| window_size | int | Window size for training. | 10 |
| eager | bool | If True return DataFrames. Otherwise, return LazyFrames. | False |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| splitter | Callable[[LazyFrame], Mapping[int, Tuple[LazyFrame, LazyFrame]]] | Function that takes a panel LazyFrame and returns a Dict of (train, test) splits, where the key represents the split number (1,2,...,n_splits) and the value is a tuple of LazyFrames. |
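
The sliding-window layout differs from the expanding one only in that the training block has a fixed length of `window_size` and moves forward with the test block. A minimal sketch of that index arithmetic follows; `sliding_window_indices` is a hypothetical helper, not the library's implementation, and it operates on positions of a single series under the assumption that the last fold ends at the final observation.

```python
from typing import Dict, List, Tuple


def sliding_window_indices(
    n_samples: int,
    test_size: int,
    n_splits: int = 5,
    step_size: int = 1,
    window_size: int = 10,
) -> Dict[int, Tuple[List[int], List[int]]]:
    """Hypothetical helper: positional (train, test) indices per split."""
    splits = {}
    for i in range(n_splits):
        # Assumption: the last split's test window ends at the final sample.
        test_end = n_samples - (n_splits - 1 - i) * step_size
        test_start = test_end - test_size
        train_start = test_start - window_size  # fixed-length training window
        if train_start < 0:
            raise ValueError("Series too short for the requested splits.")
        train = list(range(train_start, test_start))
        test = list(range(test_start, test_end))
        splits[i + 1] = (train, test)  # keys run from 1 to n_splits
    return splits


# Same settings as the diagram: 12 samples, window_size=5, test_size=3.
splits = sliding_window_indices(n_samples=12, test_size=3, window_size=5)
for key, (train, test) in splits.items():
    print(key, train, test)
```

Every training list has exactly `window_size` entries, and both blocks shift by `step_size` from one split to the next.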

train_test_split(test_size=0.25, eager=False)

train_test_split(test_size: Union[int, float] = ..., eager: Literal[True] = ...) -> Callable[[PolarsFrame], Tuple[pl.DataFrame, pl.DataFrame]]
train_test_split(test_size: Union[int, float] = ..., eager: Literal[False] = ...) -> Callable[[PolarsFrame], Tuple[pl.LazyFrame, pl.LazyFrame]]
train_test_split(test_size: Union[int, float] = ..., eager: bool = ...) -> Callable[[PolarsFrame], Union[Tuple[pl.DataFrame, pl.DataFrame], Tuple[pl.LazyFrame, pl.LazyFrame]]]

Return a time-ordered train set and test set given test_size.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| test_size | Union[int, float] | Number or fraction of test samples. | 0.25 |
| eager | bool | If True, evaluate immediately and return a tuple of train and test DataFrames. | False |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| splitter | Union[EagerSplitter, LazySplitter] | Function that takes a panel DataFrame or LazyFrame and returns a tuple of train / test LazyFrames if eager=False, or a tuple of train / test DataFrames if eager=True. |
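
To illustrate the pattern of a function that returns a splitter, here is a minimal sketch on a plain Python sequence. It is not the library's implementation: `make_train_test_splitter` is a hypothetical name, the input is a single time-ordered sequence rather than a panel frame, and rounding a fractional `test_size` down to a whole number of samples is an assumption.

```python
from typing import Callable, List, Sequence, Tuple, Union


def make_train_test_splitter(
    test_size: Union[int, float] = 0.25,
) -> Callable[[Sequence], Tuple[List, List]]:
    """Hypothetical helper: build a time-ordered train/test splitter."""

    def split(values: Sequence) -> Tuple[List, List]:
        n = len(values)
        if isinstance(test_size, float):
            # Assumption: a fraction is rounded down to whole samples.
            n_test = int(n * test_size)
        else:
            n_test = test_size
        if not 0 < n_test < n:
            raise ValueError("test_size leaves an empty train or test set.")
        cutoff = n - n_test
        # The most recent n_test observations form the test set.
        return list(values[:cutoff]), list(values[cutoff:])

    return split


splitter = make_train_test_splitter(test_size=0.25)
train, test = splitter(list(range(8)))
print(train, test)
```

Passing an integer, for example `make_train_test_splitter(3)`, holds out exactly that many trailing observations instead of a fraction.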