Quick Start
Use pip to install the required dependencies
pip install firstbatch
Add the FirstBatch API key to your environment. If you don't have one, go to the FirstBatch Dashboard.
export FIRSTBATCH_API_KEY="..."
Set up the vector database credentials as environment variables. We're using Pinecone for this example.
export PINECONE_API_KEY="..."
export PINECONE_ENV="..."
Later, use the os package to access the environment variables.
import os
api_key = os.environ["PINECONE_API_KEY"]
env = os.environ["PINECONE_ENV"]
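Indexing os.environ directly raises a bare KeyError for the first variable that is missing. A minimal sketch of a fail-fast check follows; the helper name missing_env_vars is hypothetical and not part of the FirstBatch SDK, and the variable names are the ones exported above.

```python
import os

# Variable names taken from the export commands earlier on this page.
REQUIRED_VARS = ("FIRSTBATCH_API_KEY", "PINECONE_API_KEY", "PINECONE_ENV")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty (hypothetical helper)."""
    # Accept an explicit mapping so the check can be exercised without
    # touching the real process environment.
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]
```

Calling missing_env_vars() at startup and stopping when the returned list is non-empty reports every missing variable at once, rather than one KeyError at a time.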
FirstBatch operates on top of a vector database, so we first initialize a vector database client.
import pinecone
pinecone.init(api_key=api_key, environment=env)
index = pinecone.Index("your_index_name")
Then, to initialize the FirstBatch SDK, we instantiate the FirstBatch class with our configuration.
from firstbatch import FirstBatch, Pinecone, Config, UserAction, Signal, AlgorithmLabel
config = Config(batch_size=20)
personalized = FirstBatch(api_key=os.environ["FIRSTBATCH_API_KEY"], config=config)
FirstBatch can also operate asynchronously through AsyncFirstBatch:
from firstbatch import AsyncFirstBatch
FirstBatch creates a sketch of each vector database from the data distribution of a small sample, which provides exploration methods for the vector space. Here we call the add_vdb method, which needs to run only once per index. It may take up to ~2 minutes.
personalized.add_vdb("my_db", Pinecone(index, embedding_size=1536))
We may add multiple vector databases with different indices and embedding sizes by calling the same method.
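As a sketch of registering several databases in one pass, the helper below calls add_vdb once per entry. The name register_vdbs is hypothetical; it relies only on the add_vdb(vdbid, wrapped_index) call shown above and takes the client as a parameter.

```python
def register_vdbs(client, databases):
    """Call client.add_vdb once per (vdbid, wrapped index) pair.

    client    -- an object exposing add_vdb(vdbid, wrapped_index),
                 such as the FirstBatch instance created above
    databases -- dict mapping a vdbid string to a wrapped index,
                 e.g. Pinecone(index, embedding_size=1536)
    Returns the list of vdbids in registration order.
    """
    registered = []
    for vdbid, wrapped_index in databases.items():
        client.add_vdb(vdbid, wrapped_index)
        registered.append(vdbid)
    return registered
```

For example, register_vdbs(personalized, {"my_db": Pinecone(index, embedding_size=1536), "other_db": Pinecone(other_index, embedding_size=768)}) would register two indices. The names other_db and other_index and the size 768 are placeholders for illustration only.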
Sessions are the starting point for personalized experiences. All sessions and state changes are managed by FirstBatch. There are three fundamental operations to consider:
1. Session: A session object with an id, algorithm type, and vector database id. User embeddings are created per session.
2. Batch: Get personalized data based on the user embeddings of a specific session.
3. Signal: Send data to update the user embeddings of a specific session.
We create our session by selecting the algorithm and the id of the vector database.
session = personalized.session(algorithm=AlgorithmLabel.SIMPLE, vdbid="my_db")
Get personalized results based on your algorithm
ids, batch = personalized.batch(session)
Update User Embeddings
personalized.add_signal(session, UserAction(Signal.LIKE), content_id)
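The batch and signal calls are typically alternated: fetch a batch, observe what the user engages with, send a signal, and fetch again. The sketch below shows that loop using only the batch(session) and add_signal(session, action, content_id) calls from this page. The function name, the choose_liked callback, and the idea that content_id is one of the ids returned by batch are assumptions made for illustration, not documented SDK behavior.

```python
def run_feedback_rounds(client, session, like_action, choose_liked, rounds=3):
    """Fetch `rounds` batches and send `like_action` for each liked item.

    client       -- object exposing batch(session) and
                    add_signal(session, action, content_id)
    like_action  -- the action to send, e.g. UserAction(Signal.LIKE)
    choose_liked -- hypothetical callback: (ids, batch) -> iterable of the
                    content ids the user liked in that batch
    Returns the number of signals sent.
    """
    signals_sent = 0
    for _ in range(rounds):
        # Each call is expected to reflect the signals sent so far.
        ids, batch = client.batch(session)
        for content_id in choose_liked(ids, batch):
            client.add_signal(session, like_action, content_id)
            signals_sent += 1
    return signals_sent
```

With the objects created above, a call might look like run_feedback_rounds(personalized, session, UserAction(Signal.LIKE), lambda ids, batch: ids[:1]), which likes the first item of every batch.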
Here is the complete code for creating a session with the "SIMPLE" type algorithm
from firstbatch import FirstBatch, Pinecone, Config, UserAction, Signal, AlgorithmLabel
import pinecone
import os
api_key = os.environ["PINECONE_API_KEY"]
env = os.environ["PINECONE_ENV"]
index_name = "rss"
pinecone.init(api_key=api_key, environment=env)
pinecone.describe_index(index_name)
index = pinecone.Index(index_name)
cfg = Config(batch_size=20, quantizer_train_size=100, quantizer_type="scalar",
             enable_history=True, verbose=True)
personalized = FirstBatch(api_key=os.environ["FIRSTBATCH_API_KEY"], config=cfg)
personalized.add_vdb("pinecone_db_rss", Pinecone(index, embedding_size=1536))
session = personalized.session(algorithm=AlgorithmLabel.SIMPLE, vdbid="pinecone_db_rss")
Last modified 17d ago