Feast
Feast is an open-source feature store designed to simplify and accelerate the management and serving of ML features.
It aims to provide a scalable and reliable platform for feature storage, retrieval, and serving, enabling efficient development and deployment of ML models.
Info!
Feast enables organizations to consistently define, store, and serve ML features and decouple ML from data infrastructure.
Install Feast
Question! 1
Question! 2
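After installing, it can help to confirm that the package is visible from the Python environment you will run the tutorial in. The sketch below uses only the standard library; the helper name `installed_feast_version` is hypothetical and not part of Feast.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_feast_version():
    """Return the installed version of the `feast` package, or None if it is missing."""
    try:
        return version("feast")
    except PackageNotFoundError:
        return None


found = installed_feast_version()
print(found if found is not None else "feast is not installed in this environment")
```

If this reports that the package is missing, the most likely cause is that the install ran in a different virtual environment than the one running the script.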
Create a folder and initialize Feast
Let's create a Feast project with:
Question! 3
Question! 4
Question! 5
Question! 6
Configure feature_store.yaml
This file configures where the project keeps its data: the registry, the provider, and the online and offline stores. For now we will use data stored locally.
Question! 7
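For orientation, a local configuration typically contains a project name, a registry path, the `local` provider, and a SQLite online store. The sketch below renders such a file as text so the keys can be compared with your own `feature_store.yaml`. The helper `render_local_config`, the project name, and both `data/...` paths are assumptions for illustration; keep whatever your project already uses.

```python
def render_local_config(project):
    """Render a minimal feature_store.yaml for a purely local setup (illustrative values)."""
    return (
        f"project: {project}\n"
        "registry: data/registry.db\n"        # assumed location of the registry file
        "provider: local\n"
        "online_store:\n"
        "  type: sqlite\n"
        "  path: data/online_store.db\n"      # assumed location of the online store
    )


# "feature_repo" is a placeholder project name.
print(render_local_config("feature_repo"))
```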
Define feature repository
Question! 8
Important!
Change the path="/full/path/to/experiment/feature_repo/data/channels.parquet"
Enter the full path to the parquet file!
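One way to obtain the full path without typing it by hand is to resolve it with `pathlib`. This is a small sketch; the helper name `absolute_source_path` and the relative location `data/channels.parquet` inside the repository folder are assumptions based on the path shown above.

```python
from pathlib import Path


def absolute_source_path(repo_dir, relative_file="data/channels.parquet"):
    """Return the absolute path of a data file inside the feature repository."""
    return str((Path(repo_dir) / relative_file).resolve())


# Run this from the folder that contains feature_repo, then paste the result into path=...
print(absolute_source_path("feature_repo"))
```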
define_repo.py source code:
from datetime import timedelta

from feast import Entity, FeatureView, Field, FileSource
from feast.types import Int64, String

# The entity is the key used to look up features: one row per channel.
channel = Entity(name="channel", join_keys=["channel_id"])

# Local parquet file that holds the raw channel statistics.
channel_stats_source = FileSource(
    name="channel_daily_stats_source",
    path="/path/to/experiment/feature_repo/data/channels.parquet",
    timestamp_field="date",
    created_timestamp_column="created",
)

# Here we define a Feature View that will allow us to serve the
# channel data to our model online.
channel_stats_fv = FeatureView(
    name="channel_daily_stats",
    entities=[channel],
    ttl=timedelta(days=1),
    # The fields listed below act as a schema: they define which features are
    # materialized into a store, and they are used as references during
    # retrieval when building a training dataset or serving features.
    schema=[
        Field(name="channel_name", dtype=String),
        Field(name="k_subscribers", dtype=Int64),
        Field(name="30_days_k_views", dtype=Int64, description="Average daily channel stats"),
    ],
    online=True,
    source=channel_stats_source,
    # Tags are user-defined key/value pairs that are attached to each
    # feature view.
    tags={"team": "youtube_analytics"},
)
Apply changes
Question! 9
Use features
Let's see how to use the features. To do this, we will query historical data.
Question! 10
get_features.py source code:
from datetime import datetime
from pprint import pprint

import pandas as pd
from feast import FeatureStore

store = FeatureStore(repo_path="feature_repo")

# The keys and filters for the information we want to obtain.
entity_df = pd.DataFrame.from_dict(
    {
        "channel_id": [1, 1, 5],
        "date": [
            datetime(2023, 11, 7),
            datetime(2023, 11, 6),
            datetime(2023, 11, 7),
        ],
    }
)

# The features we want to obtain.
feature_vector = store.get_historical_features(
    entity_df=entity_df,
    features=[
        "channel_daily_stats:channel_name",
        "channel_daily_stats:k_subscribers",
    ],
).to_df()

pprint(feature_vector)
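The historical query above is a point-in-time lookup: for each row of entity_df, Feast returns the most recent feature values recorded at or before that row's timestamp, and ignores values older than the feature view's ttl (one day in define_repo.py). The pure-Python sketch below illustrates that rule only. It is a simplification, not Feast's implementation, and the function name `point_in_time_lookup` and the sample rows are hypothetical.

```python
from datetime import datetime, timedelta


def point_in_time_lookup(feature_rows, channel_id, as_of, ttl):
    """Return the newest row for channel_id with as_of - ttl <= date <= as_of, else None."""
    candidates = [
        row
        for row in feature_rows
        if row["channel_id"] == channel_id and as_of - ttl <= row["date"] <= as_of
    ]
    if not candidates:
        return None  # nothing recent enough: the features would come back empty
    return max(candidates, key=lambda row: row["date"])


# Made-up feature rows, shaped like the channel statistics source.
sample_rows = [
    {"channel_id": 1, "date": datetime(2023, 11, 6), "k_subscribers": 100},
    {"channel_id": 1, "date": datetime(2023, 11, 7), "k_subscribers": 110},
    {"channel_id": 5, "date": datetime(2023, 11, 1), "k_subscribers": 7},
]

for channel_id, as_of in [(1, datetime(2023, 11, 7)), (5, datetime(2023, 11, 7))]:
    print(channel_id, as_of, point_in_time_lookup(sample_rows, channel_id, as_of, timedelta(days=1)))
```

The second lookup finds nothing because the only row for channel 5 is older than the ttl window. This is a common reason for empty feature columns in a historical query.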
Question! 11
Question! 12
Question! 13
User interface
We can also access the Feast web interface. To do this, from the exp/feature_repo directory, run feast ui.
This was an introduction to Feast. The tool has many additional features. To learn more, visit https://docs.feast.dev/.