Saturday, December 28, 2024

LLM Vision: A Comprehensive Guide

1. Overview

LLM Vision is a Home Assistant add-on that uses AI to analyze images, videos, and live camera streams. It can detect objects, summarize activities, and even store past events in an event calendar. Here’s what it can do for you:

  • Image Analysis (e.g., analyzing a snapshot or a camera entity).
  • Video Analysis (e.g., analyzing video files or Frigate events).
  • Stream Analysis (e.g., recording a live camera feed for a few seconds).
  • Data Analysis (e.g., extracting values from images/charts to update sensors).
  • Remembering Events (saving analysis results into a calendar entity for later review).

Combine LLM Vision with your security cameras, motion sensors, or Frigate events to receive detailed AI-generated summaries and notifications.
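As a quick illustration of this kind of combination, here is a minimal automation sketch that analyzes a camera snapshot when a motion sensor trips and pushes the summary to a phone. The entity IDs (binary_sensor.driveway_motion, camera.front_door, notify.mobile_app_my_phone) and the response_text field are placeholders and assumptions; substitute your own entities and verify the response shape in Developer Tools:

```yaml
alias: "AI summary on driveway motion"
trigger:
  - platform: state
    entity_id: binary_sensor.driveway_motion
    to: "on"
action:
  # Ask LLM Vision to describe the current camera frame
  - action: llmvision.image_analyzer
    data:
      provider: YOUR_PROVIDER_ID
      message: "Briefly describe what triggered the motion sensor."
      image_entity: camera.front_door
      max_tokens: 50
      temperature: 0.2
    response_variable: analysis
  # Forward the AI summary as a push notification
  - action: notify.mobile_app_my_phone
    data:
      title: "Driveway motion"
      message: "{{ analysis.response_text }}"
```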


2. Installation & Configuration

  1. Install the LLM Vision Add-on
    • Go to Settings > Add-ons in Home Assistant.
    • Search for LLM Vision and install it.
  2. Set Up LLM Vision in Integrations
    • Go to Settings > Devices & Services (Integrations tab).
    • Click Add Integration and find LLM Vision.
    • If prompted, choose Event Calendar to store events.
    • Select how long events should be retained. (Set 0 to never auto-delete.)
  3. Configure an AI Provider
    • In Settings > Devices & Services, find LLM Vision and open its configuration.
    • Add or configure at least one AI provider (e.g., OpenAI, local large language model, etc.).
    • Note the provider ID (or simply select it from the UI dropdown when calling LLM Vision actions).
  4. Optional: LLM Vision Events
    • If you enable this, a new calendar entity (usually calendar.llm_vision_events) is created.
    • Any analysis called with remember: true will store an entry here.

3. Remembering Events (Optional)

  • Enabling: If you have configured LLM Vision Events, simply pass remember: true in any image_analyzer, video_analyzer, or stream_analyzer call.
  • Viewing Past Events:
    • Look in the calendar.llm_vision_events entity for timestamps, summary text, etc.
    • Ask Home Assistant Assist or your conversation integration about past events (e.g., “Show me the LLM Vision events from last night”).
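For example, a minimal sketch of a stream_analyzer call that also stores its result in the event calendar (the provider ID and camera entity are placeholders for your own):

```yaml
action: llmvision.stream_analyzer
data:
  provider: YOUR_PROVIDER_ID
  message: "Summarize any activity at the front door."
  image_entity: |-
    camera.front_door
  duration: 10
  max_frames: 5
  remember: true
```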

4. Core Actions (Services)

LLM Vision provides four main actions (previously called services). You can call them via:

  • Developer Tools > Services in Home Assistant
  • Automations (YAML or UI)
  • Blueprints

Each action takes parameters to tailor the analysis. Some parameters are unique to certain actions.

4.1 Image Analyzer

Service: llmvision.image_analyzer

  • Analyzes local image files (image_file) or camera entities (image_entity).
  • E.g., you can ask “What is in this image?” or “Is there a dog in the driveway?”

Minimal YAML Example:

action: llmvision.image_analyzer
data:
  provider: YOUR_PROVIDER_ID
  message: "Describe the image"
  image_entity: camera.front_door
  include_filename: true
  max_tokens: 50
  temperature: 0.2

You can mix multiple files and entities:

action: llmvision.image_analyzer
data:
  provider: YOUR_PROVIDER_ID
  model: gpt-4o-mini
  message: "What's going on in these images?"
  image_file: |-
    /config/www/tmp/front_door.jpg
    /config/www/tmp/garage.jpg
  image_entity:
    - camera.front_door
    - camera.garage
  detail: low
  max_tokens: 100
  temperature: 0.2

4.2 Video Analyzer

Service: llmvision.video_analyzer

  • Analyzes local video files or Frigate event IDs.
  • Useful for short clips or security recordings.

Example:

action: llmvision.video_analyzer
data:
  provider: YOUR_PROVIDER_ID
  model: gpt-4o-mini
  message: "What is happening in the video?"
  video_file: |-
    /config/www/tmp/front_door.mp4
    /config/www/tmp/garage.mp4
  event_id: 1712108310.968815-r28cdt # A Frigate event
  max_frames: 5
  detail: low
  target_width: 1280
  max_tokens: 100
  temperature: 0.3
  include_filename: true

4.3 Stream Analyzer

Service: llmvision.stream_analyzer

  • Captures a live camera stream for a specified duration (seconds), picks the most relevant frames, and analyzes them.

Example:

action: llmvision.stream_analyzer
data:
  provider: YOUR_PROVIDER_ID
  model: gpt-4o-mini
  message: "What is happening around the house?"
  image_entity: |-
    camera.front_door
    camera.garage
  duration: 10
  max_frames: 5
  target_width: 1280
  detail: low
  max_tokens: 100
  temperature: 0.5
  include_filename: true

4.4 Data Analyzer

Service: llmvision.data_analyzer

  • Uses AI to extract data (e.g., a number or text) from an image or chart, then updates a sensor or helper entity (sensor_entity).
  • Great for reading license plates, counting items, or extracting numeric data from charts.

Example:

action: llmvision.data_analyzer
data:
  provider: YOUR_PROVIDER_ID
  model: gpt-4o-mini
  message: "What is the car's license plate?"
  sensor_entity: input_text.last_license_plate
  image_entity:
    - camera.garage_car
  image_file: "/config/www/weather_chart.jpg"
  max_tokens: 5
  target_width: 1280
  detail: high
  temperature: 0.1
  include_filename: true
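Once data_analyzer has written its result, the sensor can drive ordinary automations. A small illustrative condition block, where the input_text entity and the plate value are placeholders for your own:

```yaml
# Proceed only when the recognized plate matches a known one
condition:
  - condition: state
    entity_id: input_text.last_license_plate
    state: "ABC 123"
```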

5. Common Action Parameters

Below is a quick reference of parameters that can apply to any of the actions. Some parameters only apply to certain actions.

| Parameter | Used By | Required | Description | Default | Valid Values |
|---|---|---|---|---|---|
| provider | All | Yes | The AI provider configuration (set up in the UI). | | |
| model | All | No | Model used for AI processing. | gpt-4o-mini | Varies by provider |
| message | All | Yes | Prompt or question sent to the AI. | | String |
| remember | Image, Video, Stream | No | Whether to store the event in the LLM Vision calendar. | false | true, false |
| sensor_entity | Data | Yes (Data) | The sensor/helper to update with the result (e.g. input_text.my_value). | | sensor, input_*, etc. |
| image_file | Image | No* | Path(s) to local image(s), one per line. | | Valid file path(s) |
| image_entity | Image, Stream | No* | One or more camera entities. Alternative to image_file. | | e.g. camera.front_door |
| video_file | Video | No* | Path(s) to local video file(s), one per line. | | Valid file path(s) |
| event_id | Video | No* | One or more Frigate event IDs, each on its own line. | | e.g. 16981733.3429-abc |
| max_frames | Video, Stream | No | How many frames to analyze from the video/stream. | 3 | 1–10 (up to 60 in some setups) |
| duration | Video (if streaming), Stream | Yes (Stream) | How many seconds to capture the live stream before analyzing. | 10 | 1–300 |
| include_filename | All | Yes | If true, includes the file/camera name in the AI prompt. | false | true, false |
| target_width | All | No | Downscale image frames to this width before encoding. | 1280 | 512–3840 |
| detail | All (OpenAI only) | No | "Level of detail" for image analysis. | auto or low | auto, low, high |
| max_tokens | All | Yes | Max tokens in the AI response. | 100 | 10–1000 |
| temperature | All | Yes | "Creativity" or randomness of the AI response. | 0.5 | 0.0–1.0 |

Note:

  • Parameters marked “No*” may be required depending on your use case. For instance, image_file or image_entity is needed for image_analyzer.

6. Asking About Events

Once you have remember: true in your calls and have set up LLM Vision Events, you can:

  • View them in calendar.llm_vision_events.
  • Query them using Home Assistant Assist or another conversation integration.
  • Example conversation integration snippet:

spec:
  name: get_security_events
  description: Use this function to get a list of security events captured by cameras around the house.
  parameters:
    type: object
    properties:
      start_date_time:
        type: string
        description: The start datetime in '%Y-%m-%dT%H:%M:%S%z' format
      end_date_time:
        type: string
        description: The end datetime in '%Y-%m-%dT%H:%M:%S%z' format
    required:
      - start_date_time
      - end_date_time
  function:
    type: script
    sequence:
      - service: calendar.get_events
        data:
          start_date_time: "{{ start_date_time }}"
          end_date_time: "{{ end_date_time }}"
        target:
          entity_id: calendar.llm_vision_events
        response_variable: _function_result

7. Blueprint Example: AI Event Summary (v1.3.1)

An example blueprint that demonstrates how LLM Vision can be integrated for security event notifications is the AI Event Summary (LLM Vision v1.3.1) by valentinfrlch. It uses LLM Vision to:

  1. Trigger on either a Frigate event or a camera’s motion detection.
  2. Classify the event importance (optional).
  3. Analyze with video_analyzer or stream_analyzer.
  4. Send push notifications to your phone with updated text when the AI summary is ready.
  5. Store events (if remember: true).

7.1 Installation Steps for the Blueprint

  1. Go to Settings > Automations & Scenes in Home Assistant.
  2. Click Blueprints at the top.
  3. Click Import Blueprint and use the URL: https://github.com/valentinfrlch/ha-llmvision/blob/main/blueprints/event_summary.yaml
  4. Create a new automation from this imported blueprint and configure the input fields:
    • Mode: “Frigate” or “Camera.”
    • Provider: Your LLM Vision provider.
    • Camera Entities: The camera(s) you want to monitor.
    • Prompt: A short phrase telling the AI how to summarize the event.
    • Other parameters (cooldown, detail level, max tokens, etc.).

7.2 How It Works

  • Trigger: The blueprint listens for either Frigate end-of-event MQTT messages or camera state changes.
  • (Optional) AI Classification: If you enable “Important (Beta),” it first classifies the event as passive, time-sensitive, or critical.
  • Initial Notification: Sends a push notification to your mobile devices with either a snapshot or live preview.
  • LLM Vision Analysis:
    • Frigate: Calls video_analyzer on the Frigate clip.
    • Camera: Calls stream_analyzer to capture a few seconds of live feed.
  • Final Notification: Updates (or re-sends) the push notification with the AI-generated summary.
  • Remember: If set, stores the event summary in LLM Vision’s event calendar for future queries.

8. Tips & Best Practices

  1. Provider Setup: Ensure you have at least one functioning AI provider (e.g., OpenAI).
  2. Model Selection: For higher accuracy, choose advanced models if available. Otherwise, gpt-4o-mini is the default.
  3. Security:
    • If expose_images is set to true, frames are stored at /www/llmvision. Make sure your Home Assistant instance is secured.
  4. Automation:
    • Integrate with motion sensors, Frigate object detections, or other triggers to automate calls to LLM Vision.
    • E.g., automatically analyze a camera feed whenever Frigate detects a person.
  5. max_tokens & temperature: Fine-tune these to control the length and creativity of the AI’s response. Lower temperature yields more straightforward answers.
  6. Event Overload: If you get too many analyses, consider using a higher cooldown period or refining your triggers (e.g., only run on specific objects in Frigate).
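As an illustration of the automation tip above, here is a hedged sketch of an automation that runs video_analyzer whenever Frigate reports a person. It assumes Frigate's default frigate/events MQTT topic and its usual payload fields (type, after.label, after.id); the provider ID is a placeholder:

```yaml
alias: "Analyze Frigate person events"
trigger:
  - platform: mqtt
    topic: frigate/events
condition:
  # Only react when a completed event involves a person
  - condition: template
    value_template: >
      {{ trigger.payload_json['type'] == 'end'
         and trigger.payload_json['after']['label'] == 'person' }}
action:
  - action: llmvision.video_analyzer
    data:
      provider: YOUR_PROVIDER_ID
      message: "Describe what the person is doing."
      event_id: "{{ trigger.payload_json['after']['id'] }}"
      max_frames: 5
      remember: true
```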

9. Troubleshooting

  • No Providers?
    • Go to LLM Vision’s integration settings and add a provider. You can’t proceed without one.
  • No AI Results?
    • Check the Home Assistant logs for errors from LLM Vision.
    • Verify your provider API key or local model is functioning.
  • No Notifications?
    • Ensure the mobile app is set up correctly (see “Notify Device” in your blueprint or automation).
    • Some Android/iOS settings or DND modes can block notifications.
  • High Token Usage
    • If you’re seeing large usage in logs or AI usage stats, reduce target_width or max_tokens.
  • Events Not Remembered
    • Confirm you set remember: true and have the LLM Vision Events integration configured.

Final Thoughts

LLM Vision transforms Home Assistant into a powerful AI-driven automation hub, providing visual intelligence for everything from security monitoring to data extraction. By following the steps above, you can configure robust automations that notify you of critical events, store them for future reference, and keep your home safer and smarter.
