When working with AI-generated content, streaming the response as it’s generated can significantly improve the user experience. Honcho provides streaming functionality in its SDKs that allows your application to display content as it’s being generated, rather than waiting for the complete response.
One of the primary use cases for streaming in Honcho is with the Dialectic endpoint. This allows you to stream the AI’s reasoning about a user in real-time.
```python
from honcho import Honcho

honcho = Honcho()

# Create or get an existing App
app = honcho.apps.get_or_create(name="demo-app")

# Create or get user
user = honcho.apps.users.get_or_create(app_id=app.id, name="demo-user")

# Create a new session
session = honcho.apps.users.sessions.create(app_id=app.id, user_id=user.id)

# Store some messages for context (optional)
honcho.apps.users.sessions.messages.create(
    app_id=app.id,
    user_id=user.id,
    session_id=session.id,
    content="Hello, I'm testing the streaming functionality",
    is_user=True,
)
```
```python
import time

# Basic streaming example
with honcho.apps.users.sessions.with_streaming_response.stream(
    app_id=app.id,
    user_id=user.id,
    session_id=session.id,
    queries="What can you tell me about this user?",
) as response:
    for chunk in response.iter_text():
        print(chunk, end="", flush=True)  # Print each chunk as it arrives
        time.sleep(0.01)  # Optional delay for demonstration
```
```python
import asyncio

from honcho import AsyncHoncho


async def restaurant_recommendation_chat():
    # Use the async client so the awaited calls below are valid
    honcho = AsyncHoncho()
    app = await honcho.apps.get_or_create(name="food-app")
    user = await honcho.apps.users.get_or_create(app_id=app.id, name="food-lover")
    session = await honcho.apps.users.sessions.create(app_id=app.id, user_id=user.id)

    # Store multiple user messages about food preferences
    user_messages = [
        "I absolutely love spicy Thai food, especially curries with coconut milk.",
        "Italian cuisine is another favorite - fresh pasta and wood-fired pizza are my weakness!",
        "I try to eat vegetarian most of the time, but occasionally enjoy seafood.",
        "I can't handle overly sweet desserts, but love something with dark chocolate.",
    ]

    # Store the user's messages in the session
    for message in user_messages:
        await honcho.apps.users.sessions.messages.create(
            app_id=app.id,
            user_id=user.id,
            session_id=session.id,
            content=message,
            is_user=True,
        )
        print(f"User: {message}")

    # Ask for restaurant recommendations based on preferences
    print("\nRequesting restaurant recommendations...")
    print("Assistant: ", end="", flush=True)

    full_response = ""

    # Stream the response
    async with honcho.apps.users.sessions.with_streaming_response.stream(
        app_id=app.id,
        user_id=user.id,
        session_id=session.id,
        queries="Based on this user's food preferences, recommend 3 restaurants they might enjoy in the Lower East Side.",
    ) as response:
        async for chunk in response.iter_text():
            print(chunk, end="", flush=True)
            full_response += chunk
            await asyncio.sleep(0.01)

    # Store the assistant's complete response
    await honcho.apps.users.sessions.messages.create(
        app_id=app.id,
        user_id=user.id,
        session_id=session.id,
        content=full_response,
        is_user=False,
    )


# Run the async function
if __name__ == "__main__":
    asyncio.run(restaurant_recommendation_chat())
```
- Consider connection stability for mobile or unreliable networks
- Implement appropriate timeouts for stream operations
- Be mindful of memory usage when accumulating large responses
- Use appropriate error handling for network interruptions
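The timeout and error-handling points above can be sketched as a small helper that consumes any stream of text chunks (such as the iterator returned by `response.iter_text()`), stops after a wall-clock deadline, and keeps the partial text if the connection drops. The `consume_stream` function and its parameters are illustrative, not part of the Honcho SDK:

```python
import time


def consume_stream(chunks, timeout_seconds=30.0, on_chunk=None):
    """Accumulate text chunks from a stream with a wall-clock timeout.

    `chunks` is any iterable of strings. If the stream stalls past the
    deadline or raises a network error, the partial text gathered so far
    is still returned rather than lost.
    """
    start = time.monotonic()
    parts = []
    try:
        for chunk in chunks:
            parts.append(chunk)
            if on_chunk is not None:
                on_chunk(chunk)  # e.g. print or forward to the client
            if time.monotonic() - start > timeout_seconds:
                break  # give up on a stalled or runaway stream
    except (ConnectionError, TimeoutError):
        pass  # keep the partial response on network interruption
    return "".join(parts)
```

In the earlier example you could then write `full_response = consume_stream(response.iter_text(), timeout_seconds=30)` and store whatever arrived, even if the network failed mid-stream.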
Streaming responses provide a more interactive and engaging user experience. By implementing streaming in your Honcho applications, you can create more responsive AI-powered features that feel natural and immediate to your users.