1. Overview
LLM Vision is a Home Assistant integration that uses AI to analyze images, videos, and live camera streams. It can detect objects, summarize activity, and even store past events in an event calendar. Here’s what it can do for you:
- Image Analysis (e.g., analyzing a snapshot or a camera entity).
- Video Analysis (e.g., analyzing video files or Frigate events).
- Stream Analysis (e.g., recording a live camera feed for a few seconds).
- Data Analysis (e.g., extracting values from images/charts to update sensors).
- Remembering Events (saving analysis results into a calendar entity for later review).
Combine LLM Vision with your security cameras, motion sensors, or Frigate events to receive detailed AI-generated summaries and notifications.
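For instance, a minimal automation might analyze a camera whenever motion is detected. The sketch below assumes hypothetical entity names (binary_sensor.driveway_motion, camera.driveway) and a configured provider:

```yaml
# Minimal sketch: describe the driveway camera when motion is detected.
# Entity names and YOUR_PROVIDER_ID are placeholders for your own setup.
automation:
  - alias: "Describe driveway motion"
    trigger:
      - platform: state
        entity_id: binary_sensor.driveway_motion
        to: "on"
    action:
      - action: llmvision.image_analyzer
        data:
          provider: YOUR_PROVIDER_ID
          message: "Is there a person or vehicle in the driveway?"
          image_entity: camera.driveway
          max_tokens: 50
          temperature: 0.2
```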
2. Installation & Configuration
- Install the LLM Vision Integration
  - LLM Vision is a custom integration, so it is installed through HACS rather than as an add-on.
  - Open HACS in Home Assistant, search for LLM Vision, download it, and restart Home Assistant.
- Set Up LLM Vision in Integrations
- Go to Settings > Devices & Services (Integrations tab).
- Click Add Integration and find LLM Vision.
- If prompted, choose Event Calendar to store events.
- Select how long events should be retained. (Set 0 to never auto-delete.)
- Configure an AI Provider
- In Settings > Devices & Services, find LLM Vision and open its configuration.
- Add or configure at least one AI provider (e.g., OpenAI, local large language model, etc.).
- Note the provider ID (or simply select it from the UI dropdown when calling LLM Vision actions).
- Optional: LLM Vision Events
  - If you enable this, a new calendar entity (usually `calendar.llm_vision_events`) is created.
  - Any analysis called with `remember: true` will store an entry here.
3. Remembering Events (Optional)
- Enabling: If you have configured LLM Vision Events, simply pass `remember: true` in any `image_analyzer`, `video_analyzer`, or `stream_analyzer` call.
- Viewing Past Events:
  - Look in the `calendar.llm_vision_events` entity for timestamps, summary text, etc.
  - Ask Home Assistant Assist or your conversation integration about past events (e.g., “Show me the LLM Vision events from last night”).
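To inspect remembered events outside of Assist, you can query the calendar directly with Home Assistant's built-in calendar.get_events action, e.g., from Developer Tools (a sketch, assuming the default calendar.llm_vision_events entity):

```yaml
# Sketch: fetch the last 24 hours of remembered events.
action: calendar.get_events
target:
  entity_id: calendar.llm_vision_events
data:
  start_date_time: "{{ (now() - timedelta(hours=24)).isoformat() }}"
  end_date_time: "{{ now().isoformat() }}"
```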
4. Core Actions (Services)
LLM Vision provides four main actions (previously called services). You can call them via:
- Developer Tools > Services in Home Assistant
- Automations (YAML or UI)
- Blueprints
Each action takes parameters to tailor the analysis. Some parameters are unique to certain actions.
4.1 Image Analyzer
Service: `llmvision.image_analyzer`
- Analyzes local image files (`image_file`) or camera entities (`image_entity`).
- E.g., you can ask “What is in this image?” or “Is there a dog in the driveway?”
Minimal YAML Example:
```yaml
action: llmvision.image_analyzer
data:
  provider: YOUR_PROVIDER_ID
  message: "Describe the image"
  image_entity: camera.front_door
  include_filename: true
  max_tokens: 50
  temperature: 0.2
```
You can mix multiple files and entities:
```yaml
action: llmvision.image_analyzer
data:
  model: gpt-4o-mini
  message: "What's going on in these images?"
  image_file: |-
    /config/www/tmp/front_door.jpg
    /config/www/tmp/garage.jpg
  image_entity:
    - camera.front_door
    - camera.garage
  detail: low
  max_tokens: 100
  provider: YOUR_PROVIDER_ID
  temperature: 0.2
```
4.2 Video Analyzer
Service: `llmvision.video_analyzer`
- Analyzes local video files or Frigate event IDs.
- Useful for short clips or security recordings.
Example:
```yaml
action: llmvision.video_analyzer
data:
  provider: YOUR_PROVIDER_ID
  message: "What is happening in the video?"
  model: gpt-4o-mini
  max_tokens: 100
  video_file: |-
    /config/www/tmp/front_door.mp4
    /config/www/tmp/garage.mp4
  event_id: 1712108310.968815-r28cdt # A Frigate event
  max_frames: 5
  detail: low
  target_width: 1280
  temperature: 0.3
  include_filename: true
```
4.3 Stream Analyzer
Service: `llmvision.stream_analyzer`
- Captures a live camera stream for a specified `duration` (in seconds), picks the most relevant frames, and analyzes them.
Example:
```yaml
service: llmvision.stream_analyzer
data:
  provider: YOUR_PROVIDER_ID
  model: gpt-4o-mini
  message: "What is happening around the house?"
  image_entity: |-
    camera.front_door
    camera.garage
  duration: 10
  max_frames: 5
  target_width: 1280
  detail: low
  max_tokens: 100
  temperature: 0.5
  include_filename: true
```
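The analyzer actions also return their result, so an automation can capture it with `response_variable` and reuse it, e.g., in a notification. A sketch (the `response_text` field follows LLM Vision's documented response format; `notify.mobile_app_your_phone` is a placeholder):

```yaml
# Sketch: steps inside an automation's action sequence.
- action: llmvision.stream_analyzer
  data:
    provider: YOUR_PROVIDER_ID
    message: "Summarize what is happening at the front door."
    image_entity: camera.front_door
    duration: 10
    max_frames: 5
  response_variable: analysis
- action: notify.mobile_app_your_phone
  data:
    title: "Front door activity"
    message: "{{ analysis.response_text }}"
```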
4.4 Data Analyzer
Service: `llmvision.data_analyzer`
- Uses AI to extract data (e.g., a number or text) from an image/chart, then updates a sensor or helper entity (`sensor_entity`).
- Great for reading license plates, counting items, or extracting numeric data from charts.
Example:
```yaml
service: llmvision.data_analyzer
data:
  provider: YOUR_PROVIDER_ID
  model: gpt-4o-mini
  message: "What is the car's license plate?"
  sensor_entity: input_text.last_license_plate
  image_entity:
    - camera.garage_car
  image_file: "/config/www/weather_chart.jpg"
  max_tokens: 5
  target_width: 1280
  detail: high
  temperature: 0.1
  include_filename: true
```
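Because the result lands in a sensor/helper, `data_analyzer` pairs naturally with a time-based trigger. A sketch, reusing the chart path from the example above (the helper name and schedule are placeholders):

```yaml
# Sketch: read a numeric value from a chart image once an hour.
automation:
  - alias: "Hourly chart reading"
    trigger:
      - platform: time_pattern
        minutes: 0
    action:
      - action: llmvision.data_analyzer
        data:
          provider: YOUR_PROVIDER_ID
          message: "What temperature does the chart show? Reply with the number only."
          image_file: "/config/www/weather_chart.jpg"
          sensor_entity: input_text.chart_temperature
          max_tokens: 5
          temperature: 0.1
```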
5. Common Action Parameters
Below is a quick reference of the action parameters. Some apply to all four actions; others only to certain ones.

| Parameter | Used By | Required | Description | Default | Valid Values |
|---|---|---|---|---|---|
| `provider` | All | Yes | The AI provider configuration (set up in the UI). | – | – |
| `model` | All | No | Model for AI processing. | `gpt-4o-mini` | Varies by provider |
| `message` | All | Yes | Prompt or question sent to the AI. | – | String |
| `remember` | Image, Video, Stream | No | Whether to store the event in the LLM Vision calendar. | `false` | `true`, `false` |
| `sensor_entity` | Data | Yes (Data) | The sensor/helper to update with the result (e.g., `input_text.my_value`). | – | `sensor`, `input_*`, etc. |
| `image_file` | Image | No* | Path(s) to local image(s), one per line. | – | Valid file path(s) |
| `image_entity` | Image, Stream | No* | One or more camera entities; alternative to `image_file`. | – | e.g., `camera.front_door` |
| `video_file` | Video | No* | Path(s) to local video file(s), one per line. | – | Valid file path(s) |
| `event_id` | Video | No* | One or more Frigate event IDs, each on its own line. | – | e.g., `16981733.3429-abc` |
| `max_frames` | Video, Stream | No | How many frames to analyze from the video/stream. | 3 | 1–10 (up to 60 in some setups) |
| `duration` | Video (if streaming), Stream | Yes (Stream) | How many seconds of live stream to capture before analyzing. | 10 | 1–300 |
| `include_filename` | All | Yes | If `true`, includes the file/camera name in the AI prompt. | `false` | `true`, `false` |
| `target_width` | All | No | Downscale image frames to this width before encoding. | 1280 | 512–3840 |
| `detail` | All (OpenAI only) | No | Level of detail for image analysis. | `auto` or `low` | `auto`, `low`, `high` |
| `max_tokens` | All | Yes | Max tokens in the AI response. | 100 | 10–1000 |
| `temperature` | All | Yes | “Creativity” or randomness of the AI response. | 0.5 | 0.0–1.0 |
| `expose_images` | Image, Video, Stream | No | If `true`, saves frames to `/www/llmvision` for external access. | `false` | `true`, `false` |
Note: Parameters marked “No*” may be required depending on your use case. For instance, either `image_file` or `image_entity` is needed for `image_analyzer`.
6. Asking About Events
Once you have `remember: true` in your calls and have set up LLM Vision Events, you can:
- View them in `calendar.llm_vision_events`.
- Query them using Home Assistant Assist or another conversation integration.
- Example conversation integration snippet:
```yaml
spec:
  name: get_security_events
  description: Use this function to get list of security events captured by cameras around the house.
  parameters:
    type: object
    properties:
      start_date_time:
        type: string
        description: The start datetime in '%Y-%m-%dT%H:%M:%S%z' format
      end_date_time:
        type: string
        description: The end datetime in '%Y-%m-%dT%H:%M:%S%z' format
    required:
      - start_date_time
      - end_date_time
function:
  type: script
  sequence:
    - service: calendar.get_events
      data:
        start_date_time: "{{start_date_time}}"
        end_date_time: "{{end_date_time}}"
      target:
        entity_id: calendar.llm_vision_events
      response_variable: _function_result
```
7. Blueprint Example: AI Event Summary (v1.3.1)
An example blueprint that demonstrates how LLM Vision can be integrated for security event notifications is AI Event Summary (LLM Vision v1.3.1) by valentinfrlch. It uses LLM Vision to:
- Trigger on either a Frigate event or a camera’s motion detection.
- Classify the event importance (optional).
- Analyze with `video_analyzer` or `stream_analyzer`.
- Send push notifications to your phone with updated text when the AI summary is ready.
- Store events (if `remember: true`).
7.1 Installation Steps for the Blueprint
- Go to Settings > Automations & Scenes in Home Assistant.
- Click Blueprints at the top.
- Click Import Blueprint and use the URL:
https://github.com/valentinfrlch/ha-llmvision/blob/main/blueprints/event_summary.yaml
- Create a new automation from this imported blueprint and configure the input fields:
- Mode: “Frigate” or “Camera.”
- Provider: Your LLM Vision provider.
- Camera Entities: The camera(s) you want to monitor.
- Prompt: A short phrase telling the AI how to summarize the event.
- Other parameters (cooldown, detail level, max tokens, etc.).
7.2 How It Works
- Trigger: The blueprint listens for either Frigate end-of-event MQTT messages or camera state changes.
- (Optional) AI Classification: If you enable “Important (Beta),” it first classifies the event as passive, time-sensitive, or critical.
- Initial Notification: Sends a push notification to your mobile devices with either a snapshot or live preview.
- LLM Vision Analysis:
  - Frigate: Calls `video_analyzer` on the Frigate clip.
  - Camera: Calls `stream_analyzer` to capture a few seconds of live feed.
- Final Notification: Updates (or re-sends) the push notification with the AI-generated summary.
- Remember: If set, stores the event summary in LLM Vision’s event calendar for future queries.
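If you prefer a plain automation over the blueprint, the Frigate path can be approximated like this (a sketch: frigate/events is Frigate's standard MQTT topic and the payload fields follow Frigate's event schema, but the notify target and alias are placeholders):

```yaml
# Sketch: when a Frigate event ends, summarize the clip and notify.
automation:
  - alias: "Summarize Frigate events"
    trigger:
      - platform: mqtt
        topic: frigate/events
    condition:
      - condition: template
        value_template: "{{ trigger.payload_json['type'] == 'end' }}"
    action:
      - action: llmvision.video_analyzer
        data:
          provider: YOUR_PROVIDER_ID
          message: "Briefly describe this security event."
          event_id: "{{ trigger.payload_json['after']['id'] }}"
          max_frames: 5
          remember: true
        response_variable: summary
      - action: notify.mobile_app_your_phone
        data:
          message: "{{ summary.response_text }}"
```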
8. Tips & Best Practices
- Provider Setup: Ensure you have at least one functioning AI provider (e.g., OpenAI).
- Model Selection: For higher accuracy, choose advanced models if available. Otherwise, `gpt-4o-mini` is the default.
- Security:
  - If `expose_images` is set to `true`, frames are stored at `/www/llmvision`. Make sure your Home Assistant instance is secured.
- Automation:
- Integrate with motion sensors, Frigate object detections, or other triggers to automate calls to LLM Vision.
- E.g., automatically analyze a camera feed whenever Frigate detects a person.
- max_tokens & temperature: Fine-tune these to control the length and creativity of the AI’s response. Lower temperature yields more straightforward answers.
- Event Overload: If you get too many analyses, consider using a higher `cooldown` period or refining your triggers (e.g., only run on specific objects in Frigate); see the sketch below.
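A simple cooldown can also be enforced without the blueprint by gating the automation on its own `last_triggered` attribute (a sketch; the automation entity ID is a placeholder):

```yaml
# Sketch: skip analysis if this automation ran in the last 5 minutes.
condition:
  - condition: template
    value_template: >
      {{ state_attr('automation.summarize_frigate_events', 'last_triggered') is none
         or now() - state_attr('automation.summarize_frigate_events', 'last_triggered') > timedelta(minutes=5) }}
```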
9. Troubleshooting
- No Providers?
- Go to LLM Vision’s integration settings and add a provider. You can’t proceed without one.
- No AI Results?
- Check the Home Assistant logs for errors from LLM Vision.
- Verify your provider API key or local model is functioning.
- No Notifications?
- Ensure the mobile app is set up correctly (see “Notify Device” in your blueprint or automation).
- Some Android/iOS settings or DND modes can block notifications.
- High Token Usage?
  - If you’re seeing large usage in logs or AI usage stats, reduce `target_width` or `max_tokens`.
- Events Not Remembered?
  - Confirm you set `remember: true` and have the LLM Vision Events integration configured.
10. Additional Resources
- LLM Vision GitBook: the LLM Vision Documentation, with examples, best practices, and advanced configurations.
- Choosing the Right Model: guidance on selecting a model for your provider.
- Community Blueprints: LLM Vision Examples, for more advanced or specialized automations.
Final Thoughts
LLM Vision transforms Home Assistant into a powerful AI-driven automation hub, providing visual intelligence for everything from security monitoring to data extraction. By following the steps above, you can configure robust automations that notify you of critical events, store them for future reference, and keep your home safer and smarter.