weaviate 31 Q&As

Weaviate FAQ & Answers

31 expert Weaviate answers researched from official documentation. Every answer cites authoritative sources you can verify.


31 questions
A

Use ContainsAll operator with valueText array: { Get { YourClassName(limit: 10, where: { path: ["tags"], operator: ContainsAll, valueText: ["react", "typescript"] }) { tags title } } }. The ContainsAll operator works on text properties and matches objects where the property contains all values in the array. For tokenized text fields, ContainsAll treats text as token array based on tokenization scheme. Alternative: use multiple where clauses with AND operator for separate conditions. ContainsAny matches any value, ContainsAll matches all values. Weaviate 1.25 introduced Raft consensus - schema filtering behavior unchanged from previous versions. Python client example: client.query.get('YourClass', ['tags']).with_where({'path': ['tags'], 'operator': 'ContainsAll', 'valueText': ['react', 'typescript']}).do()
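The ContainsAll/ContainsAny semantics can be sketched in plain Python — an illustrative model of how the filter matches array values, not the client or server implementation:

```python
# Illustrative model of ContainsAll vs ContainsAny on an array property.

def contains_all(property_values, filter_values):
    """Object matches only if every filter value is present."""
    return all(v in property_values for v in filter_values)

def contains_any(property_values, filter_values):
    """Object matches if at least one filter value is present."""
    return any(v in property_values for v in filter_values)

tags = ["react", "typescript", "testing"]
print(contains_all(tags, ["react", "typescript"]))  # True
print(contains_all(tags, ["react", "vue"]))         # False
print(contains_any(tags, ["react", "vue"]))         # True
```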

99% confidence
A

Increase alpha parameter toward 1.0 to prioritize vector search: query = client.query.get('Article', ['title', 'content']).with_hybrid(query='search text', alpha=0.75).with_limit(10).do(). Alpha values: 1.0=pure vector search, 0.0=pure keyword (BM25), 0.5=equal weighting (default). For better semantic results, use alpha=0.7-0.9. Example: alpha=0.75 gives 75% weight to vector search, 25% to BM25. Weaviate v1.24+ uses relativeScoreFusion (default), v1.20-1.23 used rankedFusion. Choose fusion algorithm with fusionType parameter. Test different alpha values on validation set to find optimal balance. GraphQL: hybrid(query: "text", alpha: 0.75). Adjust based on use case: higher alpha for semantic similarity, lower for exact keyword matching. Monitor result relevance to tune alpha per query type.
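The alpha weighting can be sketched in the style of relativeScoreFusion: normalize each result set's scores to [0, 1], then blend. This is a simplified model for intuition, not Weaviate's exact fusion code:

```python
# Simplified alpha-weighted hybrid scoring sketch (relativeScoreFusion style).

def min_max(scores):
    # Normalize a score list to [0, 1]; a constant list maps to 1.0.
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 1.0 for s in scores]

def hybrid_scores(vector_scores, bm25_scores, alpha=0.75):
    # alpha weights the vector side, (1 - alpha) the BM25 side.
    v, k = min_max(vector_scores), min_max(bm25_scores)
    return [alpha * vs + (1 - alpha) * ks for vs, ks in zip(v, k)]

print(hybrid_scores([0.9, 0.5, 0.1], [2.0, 8.0, 4.0], alpha=0.75))
```

With alpha=1.0 the BM25 term vanishes entirely; with alpha=0.0 only keyword scores remain.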

99% confidence
A

Common causes: (1) Low HNSW parameters - under-configured M/efConstruction drops NDCG@10 by up to 18%. Default maxConnections=32, efConstruction=128. (2) Restrictive post-filtering - dramatic recall drops with strict filters. Use pre-filtering instead. (3) Vector compression - aggressive PQ compression causes high recall degradation. (4) Reduced maxConnections - adversely affects recall. Solutions: Increase efConstruction (affects import time only) and/or ef (affects query time) parameters. For filtering: enable pre-filtering to maintain recall. For compression: use RQ (default in v1.33+) with 98-99% native recall, or 1-bit RQ for near-32x compression. Check PQ segments parameter - more segments = higher memory use but better recall. Increase maxConnections parameter if recall remains low (each connection uses 8-10 bytes of memory). Monitor NDCG@10 metric to measure recall quality. Default parameters sufficient for most workloads - only tune if experiencing issues.
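When tuning ef/efConstruction, recall is measured against brute-force ground truth. A small helper for that comparison might look like this (a hypothetical utility, not part of any Weaviate client):

```python
# recall@k: fraction of the true top-k that the ANN search also returned.

def recall_at_k(retrieved_ids, true_ids, k=10):
    retrieved, truth = set(retrieved_ids[:k]), set(true_ids[:k])
    return len(retrieved & truth) / k

ann = ["a", "b", "c", "d", "x", "y", "g", "h", "i", "j"]      # ANN results
exact = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]    # brute force
print(recall_at_k(ann, exact, k=10))  # 0.8
```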

99% confidence
A

Update class schema vectorizer config with correct dimensions before migration: client.schema.update_config(class_name='Article', config={'vectorizer': 'text2vec-openai', 'moduleConfig': {'text2vec-openai': {'model': 'text-embedding-3-large', 'dimensions': 1024}}}). OpenAI's text-embedding-3-large supports dimension reduction (default 3072, configurable to 256-3072). Common error: 'vector lengths don't match: 1024 vs 3072' during hybrid/vector searches. Migration steps: (1) Create new class with updated schema, (2) Re-vectorize all objects with new dimension setting, (3) Delete old class after verification. Cannot change dimensions on existing class - must recreate. For zero-downtime: use collection aliasing during migration. Verify schema: client.schema.get('Article')['vectorizer'] shows current config. Test with sample object before bulk migration. Dimension mismatch causes runtime errors not schema validation errors.
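Because the mismatch surfaces at query time rather than at schema validation, a pre-flight length check before inserting custom vectors can fail fast. This is a hypothetical guard, not a client feature:

```python
# Pre-flight check that a vector matches the collection's expected dimension.

EXPECTED_DIM = 1024  # assumed: must match the dimensions set in moduleConfig

def check_vector(vec, expected_dim=EXPECTED_DIM):
    if len(vec) != expected_dim:
        raise ValueError(
            f"vector lengths don't match: {len(vec)} vs {expected_dim}"
        )
    return vec

check_vector([0.0] * 1024)  # passes silently
try:
    check_vector([0.0] * 3072)  # old full-size embedding slips through
except ValueError as e:
    print(e)
```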

99% confidence
A

Enable text2vec-openai module in Weaviate instance config and set vectorizeClassName: true in schema. Common errors: 'vectorizer: no module with name text2vec-openai present' - verify ENABLE_MODULES=text2vec-openai in docker-compose or Kubernetes config. For null property values: Weaviate automatically skips null/missing properties during vectorization (since v1.16) - both null values and missing properties are skipped. To explicitly control: set skip: true in moduleConfig. Example schema: 'properties': [{'name': 'description', 'dataType': ['text'], 'moduleConfig': {'text2vec-openai': {'skip': true}}}]. Restart Weaviate after enabling modules. For Kubernetes: add text2vec-openai to modules list in Helm values.yaml. Third-party OpenAI APIs: check endpoint compatibility - module hardcoded for /v1/embeddings path. Verify API key set: X-OpenAI-Api-Key header or OPENAI_APIKEY env var. For nearText queries: requires vectorizer enabled on class. Check module status: GET /v1/.well-known/ready endpoint.

99% confidence
A

Liveness probe: GET http://localhost:8080/v1/.well-known/live - returns HTTP 200 if application can respond to requests. Readiness probe: GET http://localhost:8080/v1/.well-known/ready - returns HTTP 200 if ready to serve traffic, HTTP 503 if unable. Kubernetes probe config: livenessProbe: {httpGet: {path: /v1/.well-known/live, port: 8080}, initialDelaySeconds: 120, periodSeconds: 3, timeoutSeconds: 3}. Readiness uses same structure with /ready endpoint. Weaviate Helm chart defaults: initialDelaySeconds=120s, periodSeconds=3s, timeoutSeconds=3s. Alternative exec probe: wget --no-verbose --tries=1 --spider http://localhost:8080/v1/.well-known/ready. Test manually: curl http://weaviate-service:8080/v1/.well-known/live. Both probes required for production - liveness restarts unhealthy pods, readiness controls traffic routing. Adjust initialDelaySeconds based on startup time with large datasets.

99% confidence
A

Enable multi-tenancy at class creation with multiTenancyConfig: client.schema.create_class({'class': 'Article', 'multiTenancyConfig': {'enabled': True}, 'properties': [{'name': 'title', 'dataType': ['text']}]}). Each tenant gets isolated shard - data never mixes. Create tenant: client.schema.add_class_tenants(class_name='Article', tenants=[Tenant(name='tenant_a')]). Query tenant data: client.query.get('Article', ['title']).with_tenant('tenant_a').do(). Supports 50,000 active tenants per node (Weaviate v1.25+). Tenant states: ACTIVE (loaded in memory), INACTIVE (on local disk, not in memory), OFFLOADED (moved to cold cloud storage). Change state: client.schema.update_class_tenants(class_name='Article', tenants=[Tenant(name='tenant_a', activityStatus='INACTIVE')]). INACTIVE tenants can auto-activate on access when autoTenantActivation is enabled (slower first query). Use for SaaS apps with tenant isolation requirements. Cannot disable multi-tenancy after creation - must recreate class.
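The shard-per-tenant isolation model can be pictured with an in-memory toy: each tenant's data lives in its own store, and every operation is scoped to exactly one tenant. This is a conceptual sketch, not the server implementation:

```python
# Toy model of shard-per-tenant isolation.

class TenantStore:
    def __init__(self):
        self.shards = {}  # tenant name -> that tenant's objects

    def add_tenant(self, name):
        self.shards.setdefault(name, [])

    def insert(self, tenant, obj):
        self.shards[tenant].append(obj)  # KeyError if tenant unknown

    def query(self, tenant):
        return list(self.shards[tenant])  # only this tenant's shard is read

store = TenantStore()
store.add_tenant("tenant_a")
store.add_tenant("tenant_b")
store.insert("tenant_a", {"title": "A-only"})
print(store.query("tenant_a"))  # [{'title': 'A-only'}]
print(store.query("tenant_b"))  # []
```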

99% confidence
A

Use backup API with S3 backend: POST /v1/backups/s3 with body {'id': 'backup-2025-01', 'include': ['Article', 'Product']}. Full example: curl -X POST http://localhost:8080/v1/backups/s3 -H 'Content-Type: application/json' -d '{"id": "backup-2025-01", "include": ["Article", "Product"]}'. Configure S3 in docker-compose: BACKUP_S3_BUCKET=my-backups, BACKUP_S3_ENDPOINT=s3.amazonaws.com, AWS_ACCESS_KEY_ID=key, AWS_SECRET_ACCESS_KEY=secret. Restore: POST /v1/backups/s3/backup-2025-01/restore with {'include': ['Article']}. Python client: client.backup.create(backup_id='backup-2025-01', backend='s3', include_classes=['Article']). Backup includes: schema, data, inverted index. Excludes: queue state, in-flight operations. Backups are full snapshots (no incremental). For automation: use Kubernetes CronJob with Weaviate API. Alternative backends: filesystem, GCS, Azure. Verify backup: GET /v1/backups/s3/backup-2025-01.
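The curl call above can be mirrored by building the request URL and JSON body by hand. Only the payload is constructed here, since sending it assumes a reachable instance:

```python
# Build the backup request URL and body used by the REST API example above.
import json

def backup_request(backend, backup_id, include):
    url = f"http://localhost:8080/v1/backups/{backend}"
    body = json.dumps({"id": backup_id, "include": include})
    return url, body

url, body = backup_request("s3", "backup-2025-01", ["Article", "Product"])
print(url)   # http://localhost:8080/v1/backups/s3
print(body)  # {"id": "backup-2025-01", "include": ["Article", "Product"]}
```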

99% confidence
A

Set replicationFactor at class creation: client.schema.create_class({'class': 'Article', 'replicationConfig': {'factor': 3}, 'properties': [{'name': 'title', 'dataType': ['text']}]}). Requires minimum 3 Weaviate nodes in cluster. Each object replicates to 3 nodes - survives 2 node failures. Query with consistency level: client.query.get('Article', ['title']).with_consistency_level('QUORUM').do(). Consistency levels: ONE (fastest, eventual consistency), QUORUM (majority, default), ALL (slowest, strongest consistency). Writes succeed when QUORUM nodes acknowledge (2 of 3). Schema and cluster metadata use Raft consensus (v1.25+); object replication itself is leaderless with tunable consistency. Repair reads: ONE queries auto-repair stale replicas in background. For write-heavy: use ONE with async repair. For read-heavy: use QUORUM or ALL. Monitor lag: /v1/schema/{class}/shards endpoint shows replication status. Cannot change factor after creation - requires class recreation with migration.
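The quorum arithmetic is simple majority math — a write at QUORUM succeeds once more than half the replicas acknowledge (illustrative only):

```python
# Quorum arithmetic for a replicated write.

def quorum(factor):
    # Majority of replicas: e.g. factor=3 -> 2, factor=5 -> 3.
    return factor // 2 + 1

def write_succeeds(factor, acks):
    return acks >= quorum(factor)

print(quorum(3))             # 2
print(write_succeeds(3, 2))  # True  - majority acknowledged
print(write_succeeds(3, 1))  # False - one node is not a quorum of three
```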

99% confidence
A

Enable generative-cohere module and add to class schema: {'class': 'Article', 'moduleConfig': {'generative-cohere': {'model': 'command'}}, 'properties': [{'name': 'content', 'dataType': ['text']}]}. Query with single prompt: client.query.get('Article', ['title', 'content']).with_near_text({'concepts': ['AI']}).with_generate(single_prompt='Summarize this in 50 words: {content}').with_limit(3).do(). Response includes _additional.generate.singleResult with generated text. For grouped prompt (all results): with_generate(grouped_task='Create a report from these articles'). Set API key: X-Cohere-Api-Key header or COHERE_APIKEY env var. Models: command (default), command-light (faster). Limit which properties are sent to the model in grouped tasks: with_generate(grouped_task='...', grouped_properties=['content']). Cost: per-token Cohere pricing + vector search. Alternative modules: generative-openai, generative-aws (Bedrock). Use cases: summarization, Q&A, content generation from search results.
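The {content} placeholder in single_prompt is filled per result from the object's properties before the prompt reaches the model. The substitution can be sketched like this (an illustrative model, not the module's code):

```python
# Per-result prompt templating sketch: replace {property} placeholders
# with the matching values from a retrieved object.

def render_single_prompt(template, obj):
    prompt = template
    for key, value in obj.items():
        prompt = prompt.replace("{" + key + "}", str(value))
    return prompt

template = "Summarize this in 50 words: {content}"
obj = {"title": "Intro to AI", "content": "AI is transforming search."}
print(render_single_prompt(template, obj))
# Summarize this in 50 words: AI is transforming search.
```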

99% confidence
A

Use server-side batching (v1.33+, recommended): Server automatically manages optimal batch size via dynamic backpressure - monitors queue size, calculates EMA of workload, adjusts chunk size dynamically. Python v4 client: with client.batch.dynamic() as batch: for obj in objects: batch.add_object(properties=obj, collection='Article'). Server applies backpressure when busy to prevent timeouts. For v3 client/older: use client.batch.configure(batch_size=100, dynamic=True, timeout_retries=3). Optimization: (1) Disable indexing during import: vectorIndexConfig={'skip': True}, re-enable after with client.schema.update_config(). (2) Batch size 50-200 for manual batching. (3) Multi-threading: use num_workers parameter (v3.9.0+). (4) Pre-generate vectors: set vector manually to skip vectorizer. Expected throughput: 10,000-50,000 objects/sec (hardware/vectorizer dependent). Monitor: /v1/schema/{class}/shards for progress. For large imports: increase QUERY_MAXIMUM_RESULTS env var. Post-import: run consistency check, rebuild index if skip=True used. Server-side batching provides good performance for thousands to millions of objects.
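For the manual-batching path (batch size 50-200), splitting the object stream into fixed-size chunks can be sketched as a generator. This is a simple helper under stated assumptions, not the client's internal logic:

```python
# Chunk an object list into fixed-size batches for manual import loops.

def chunked(objects, batch_size=100):
    for i in range(0, len(objects), batch_size):
        yield objects[i:i + batch_size]

objects = [{"title": f"doc-{n}"} for n in range(250)]
sizes = [len(batch) for batch in chunked(objects, batch_size=100)]
print(sizes)  # [100, 100, 50]
```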

99% confidence
A

Define cross-reference property in schema: {'class': 'Article', 'properties': [{'name': 'author', 'dataType': ['Author']}]}. DataType is target class name (capitalized). Add reference during object creation by passing a beacon in the property: client.data_object.create(data_object={'title': 'My Article', 'author': [{'beacon': 'weaviate://localhost/Author/author-uuid'}]}, class_name='Article', uuid='article-uuid'). Add reference after creation: client.data_object.reference.add(from_uuid='article-uuid', from_property_name='author', to_uuid='author-uuid', from_class_name='Article', to_class_name='Author'). Query with cross-references: query = client.query.get('Article', ['title', 'author {...on Author { name }}']). GraphQL nested syntax retrieves referenced objects. Delete reference: client.data_object.reference.delete(from_uuid='article-uuid', from_property_name='author', to_uuid='author-uuid'). Update reference: delete old + add new. Circular references allowed (Article → Author → Article). Cardinality: one-to-one (single dataType), one-to-many (array of refs). Cross-references stored as beacon format: weaviate://localhost/Author/uuid.
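Since cross-references are stored as beacons, building one is plain string formatting (an illustrative helper, not a client function):

```python
# Build a cross-reference beacon in the format Weaviate stores.

def beacon(target_class, target_uuid, host="localhost"):
    return f"weaviate://{host}/{target_class}/{target_uuid}"

print(beacon("Author", "author-uuid"))
# weaviate://localhost/Author/author-uuid
```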

99% confidence
A

Define named vectors in class schema (v1.24+): {'class': 'Article', 'vectorConfig': {'title_vector': {'vectorizer': {'text2vec-openai': {'model': 'text-embedding-3-small'}}, 'vectorIndexType': 'hnsw'}, 'content_vector': {'vectorizer': {'text2vec-cohere': {'model': 'embed-multilingual-v3.0'}}, 'vectorIndexType': 'hnsw'}}}. Each named vector has independent vectorizer and index. Query specific vector: client.query.get('Article', ['title']).with_near_text({'concepts': ['AI']}, target_vector='title_vector').do(). Insert with custom vectors: client.data_object.create(data_object={'title': 'Article'}, class_name='Article', vector={'title_vector': [0.1, 0.2, ...], 'content_vector': [0.3, 0.4, ...]}). Use cases: (1) Multi-language embeddings, (2) Title vs content search, (3) Different embedding models per field. Hybrid search: specify target vector in hybrid query. Performance: each vector adds storage + compute cost. Default vector name: '_default' if no named vectors defined.
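Since each named vector has its own model and dimensionality, a pre-flight check that an insert payload supplies every configured vector name at the right size can catch mistakes early. The names and dimensions below are assumed for illustration; this is a hypothetical guard, not a client feature:

```python
# Validate a named-vectors payload against an assumed vector config.

VECTOR_CONFIG = {"title_vector": 384, "content_vector": 1024}  # name -> dims (assumed)

def validate_named_vectors(vectors, config=VECTOR_CONFIG):
    for name, dim in config.items():
        if name not in vectors:
            raise KeyError(f"missing named vector: {name}")
        if len(vectors[name]) != dim:
            raise ValueError(f"{name}: expected {dim} dims, got {len(vectors[name])}")
    return True

payload = {"title_vector": [0.0] * 384, "content_vector": [0.0] * 1024}
print(validate_named_vectors(payload))  # True
```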

99% confidence
A

Configure vectorIndexConfig for fast queries: {'vectorIndexConfig': {'efConstruction': 128, 'maxConnections': 64, 'ef': 64, 'dynamicEfMin': 64, 'dynamicEfMax': 512, 'dynamicEfFactor': 8}}. efConstruction=128: slower imports (2x time) but better recall. maxConnections=64: more memory (2x default) but faster queries. ef=64: higher query accuracy. dynamicEf: auto-adjusts ef based on limit (limit=10 → ef=80). Trade-offs: efConstruction only affects import (safe to increase), maxConnections increases memory 32MB per 100K vectors (M=64 vs M=32), ef increases query latency but improves recall. For production with frequent queries: use shown config. For write-heavy: reduce efConstruction=64, maxConnections=32. Monitor query performance: enable PROMETHEUS_MONITORING_ENABLED=true, check weaviate_vector_index_size and weaviate_queries_duration metrics. Default parameters are sufficient for most workloads.
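The dynamic ef behavior described above (ef scales with the query limit, clamped between dynamicEfMin and dynamicEfMax) can be sketched as a simplified model of the documented parameters:

```python
# Simplified model of dynamic ef: limit * dynamicEfFactor,
# clamped to [dynamicEfMin, dynamicEfMax].

def dynamic_ef(limit, ef_min=64, ef_max=512, factor=8):
    return min(max(ef_min, limit * factor), ef_max)

print(dynamic_ef(10))   # 80  (limit=10 -> ef=80, as noted above)
print(dynamic_ef(4))    # 64  (clamped up to dynamicEfMin)
print(dynamic_ef(100))  # 512 (clamped down to dynamicEfMax)
```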

99% confidence
A

Enable OIDC or API key auth in docker-compose: AUTHENTICATION_APIKEY_ENABLED=true, AUTHENTICATION_APIKEY_ALLOWED_KEYS=admin-key,readonly-key, AUTHENTICATION_APIKEY_USERS=admin,user. Configure authorization: AUTHORIZATION_ADMINLIST_ENABLED=true, AUTHORIZATION_ADMINLIST_USERS=admin. Client authentication: client = weaviate.Client(url='http://localhost:8080', auth_client_secret=weaviate.AuthApiKey('admin-key')). Request header: Authorization: Bearer admin-key. For OIDC (production): AUTHENTICATION_OIDC_ENABLED=true, AUTHENTICATION_OIDC_ISSUER=https://auth.example.com, AUTHENTICATION_OIDC_CLIENT_ID=weaviate. OIDC tokens: use JWT from identity provider. Both API key and OIDC can be enabled simultaneously. Anonymous access: AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true (default, disable in production). For fine-grained control: use RBAC (v1.29+ GA, released Feb 2025) - define roles with specific permissions (collections, data ops, backups, users, tenants). Must specify at least one root user if RBAC enabled. Test auth: curl -H 'Authorization: Bearer admin-key' http://localhost:8080/v1/schema returns 200, invalid key returns 401.
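The header both curl and the client send for API-key auth is a plain bearer header; the Python client builds it for you via AuthApiKey, so this is only an illustration of the wire format:

```python
# The Authorization header carried by API-key authenticated requests.

def auth_headers(api_key):
    return {"Authorization": f"Bearer {api_key}"}

print(auth_headers("admin-key"))  # {'Authorization': 'Bearer admin-key'}
```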

99% confidence
A

Supported dataTypes and operators: (1) text/string: Equal, NotEqual, Like (wildcard), ContainsAll, ContainsAny. (2) int/number: Equal, NotEqual, GreaterThan, GreaterThanEqual, LessThan, LessThanEqual. (3) boolean: Equal, NotEqual. (4) date: Equal, NotEqual, GreaterThan, LessThan (ISO 8601 format). (5) geoCoordinates: WithinGeoRange (latitude, longitude, distance). (6) phoneNumber: Equal. (7) uuid: Equal, NotEqual. (8) blob: not filterable. Example: client.query.get('Article', ['title']).with_where({'path': ['views'], 'operator': 'GreaterThan', 'valueInt': 1000}).do(). Array properties: use ContainsAny (match any value) or ContainsAll (match all values). Combine operators with AND/OR: {'operator': 'And', 'operands': [{...}, {...}]}. Case sensitivity: text is case-insensitive by default, configure with 'tokenization': 'field' for case-sensitive. For cross-reference filtering: nest path like ['author', 'Author', 'name'].
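A tiny evaluator makes the operator semantics above concrete — how Equal, GreaterThan, and And/Or compose against a single object. This is an illustrative model, not the real filter engine:

```python
# Minimal where-filter evaluator for a subset of operators.

def matches(obj, where):
    op = where["operator"]
    if op == "And":
        return all(matches(obj, w) for w in where["operands"])
    if op == "Or":
        return any(matches(obj, w) for w in where["operands"])
    value = obj[where["path"][-1]]
    # Pick up valueInt / valueText / valueBoolean, whichever is present.
    target = next(v for k, v in where.items() if k.startswith("value"))
    if op == "Equal":
        return value == target
    if op == "NotEqual":
        return value != target
    if op == "GreaterThan":
        return value > target
    if op == "LessThan":
        return value < target
    raise ValueError(f"unsupported operator: {op}")

article = {"views": 1500, "title": "Intro"}
where = {"operator": "And", "operands": [
    {"path": ["views"], "operator": "GreaterThan", "valueInt": 1000},
    {"path": ["title"], "operator": "Equal", "valueText": "Intro"},
]}
print(matches(article, where))  # True
```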

99% confidence
A

Enable Prometheus metrics: PROMETHEUS_MONITORING_ENABLED=true (default port 2112). Key metrics: (1) weaviate_batch_durations_ms: import performance. (2) weaviate_queries_duration_ms: query latency. (3) weaviate_objects_total: object count per class. (4) weaviate_vector_index_size: memory usage. (5) weaviate_startup_durations_ms: startup time. Scrape endpoint: http://localhost:2112/metrics. Grafana dashboard: official template at https://github.com/weaviate/weaviate/tree/main/tools/grafana. Health endpoints: GET /v1/.well-known/live (liveness), GET /v1/.well-known/ready (readiness), GET /v1/nodes (cluster status). Log levels: LOG_LEVEL=debug for troubleshooting, info for production. Monitor disk usage: check /v1/schema/{class}/shards for shard statistics. Alert on: query latency >500ms, object import failures, node unavailability, disk >80% full. For distributed tracing: integrate with OpenTelemetry (TRACE_ENABLED=true).

99% confidence
A

Set tokenization property config to 'field' for case-sensitive: {'class': 'Article', 'properties': [{'name': 'code', 'dataType': ['text'], 'tokenization': 'field'}]}. Tokenization options: (1) word (default): keep alphanumeric characters, lowercase, split on whitespace/punctuation (no stemming). (2) field: case-sensitive, no splitting, exact match on the whole value. (3) lowercase: lowercase, split on whitespace only. (4) whitespace: split on whitespace, case preserved. (5) trigram: 3-character ngrams for fuzzy search. Use 'field' for: code snippets, IDs, case-sensitive tags. Use 'word' for: natural language, articles, descriptions. Query behavior: with tokenization='field', filter with Equal operator matches exact case. Example: {'path': ['code'], 'operator': 'Equal', 'valueText': 'MyFunction'} matches 'MyFunction' but not 'myfunction'. Cannot change tokenization after creation - requires class recreation. For languages without whitespace word boundaries: 'trigram' can help with fuzzy matching. Check current config: client.schema.get('Article')['properties'][0]['tokenization'].
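The difference between 'word' and 'field' on the same value can be sketched in a few lines; real tokenizers handle more edge cases, so this is a simplified model:

```python
# Simplified sketch of 'word' vs 'field' tokenization.
import re

def tokenize_word(value):
    # Lowercase, split on any run of non-alphanumeric characters.
    return [t for t in re.split(r"[^0-9A-Za-z]+", value.lower()) if t]

def tokenize_field(value):
    # Whole trimmed value as a single token, case preserved.
    return [value.strip()]

print(tokenize_word("MyFunction v2"))   # ['myfunction', 'v2']
print(tokenize_field("MyFunction v2"))  # ['MyFunction v2']
```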

99% confidence
A

Error occurs when querying with quantization search params but quantization not enabled in schema. Fix: remove quantization search params or enable quantization. Weaviate v1.33+ defaults to 8-bit RQ for new collections. Enable explicitly: Binary quantization: {'vectorIndexConfig': {'quantizer': {'type': 'bq'}}}. Scalar quantization: {'quantizer': {'type': 'sq'}}. Rotational quantization (recommended): {'quantizer': {'type': 'rq', 'training_limit': 100000}} for 98-99% recall. 1-bit RQ (v1.33+): more robust alternative to BQ with near-32x compression. Query without quantization params: remove .with_additional(['distance quantization']) or search_params with quantization config. Check if enabled: GET /v1/schema/Article returns vectorIndexConfig.quantizer. Cannot add quantization to existing class - requires recreation with migration. Change default via DEFAULT_QUANTIZATION env var. Query with quantization: client.query.get('Article', ['title']).with_near_vector({'vector': [...]}).do(). Quantization happens automatically if configured in schema. Each named vector can have independent compression config.
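The compression figures quoted above are back-of-envelope arithmetic: a float32 dimension is 32 bits, so 1-bit quantization approaches 32x compression on the vector payload:

```python
# Compression ratio of quantized vectors relative to float32 storage.

def compression_ratio(bits_per_dim, original_bits=32):
    return original_bits / bits_per_dim

print(compression_ratio(1))  # 32.0 - 1-bit RQ, "near-32x compression"
print(compression_ratio(8))  # 4.0  - 8-bit RQ
```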

99% confidence