Data Source Configuration

Learn how to connect ElasticView to its supported data sources, including Elasticsearch, databases, APIs, and files.

Overview

ElasticView supports the following data source types:

  • Elasticsearch: Primary search and analytics engine
  • Relational Databases: MySQL, PostgreSQL, SQL Server
  • NoSQL Databases: MongoDB, Redis
  • Time Series Databases: InfluxDB, Prometheus
  • APIs and Web Services: REST APIs, GraphQL
  • File Sources: CSV, JSON, XML files

Elasticsearch Configuration

Basic Connection

  1. Access Data Source Settings

    • Navigate to "System Settings" → "Data Sources"
    • Click "Add Data Source"
    • Select "Elasticsearch"
  2. Connection Parameters

    yaml
    name: "Primary Elasticsearch"
    type: "elasticsearch"
    hosts:
      - "http://localhost:9200"
      - "http://elasticsearch-2:9200"
    timeout: "30s"
  3. Authentication

    yaml
    # Basic Authentication
    username: "elastic"
    password: "your-password"
    
    # API Key Authentication (alternative to basic authentication)
    api_key: "your-api-key"
    
    # SSL/TLS Configuration
    ssl:
      enabled: true
      verify_certificates: true
      ca_cert_path: "/path/to/ca.pem"
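For reference, Elasticsearch itself accepts API keys in an `Authorization: ApiKey ...` header whose value is the base64 encoding of `id:api_key`. The sketch below shows that encoding. Whether ElasticView's `api_key` field expects the encoded value or the raw id and key is not stated here, so treat the helper names and the mapping to the field as assumptions.

```python
import base64


def encode_api_key(key_id: str, api_key: str) -> str:
    """Base64-encode "id:api_key", the form Elasticsearch expects in the header."""
    return base64.b64encode(f"{key_id}:{api_key}".encode("utf-8")).decode("ascii")


def api_key_header(encoded_key: str) -> dict:
    """Build the Authorization header for Elasticsearch API key authentication."""
    return {"Authorization": f"ApiKey {encoded_key}"}


# Hypothetical id and key, for illustration only.
print(api_key_header(encode_api_key("id", "key")))
```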

Advanced Configuration

Index Patterns

yaml
index_patterns:
  - "logs-*"
  - "metrics-*"
  - "events-2023-*"
default_index: "logs-*"
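To check which index names a set of patterns would cover, a quick approximation with shell-style wildcard matching can help. This is only a sketch: Python's `fnmatchcase` also gives special meaning to `?` and `[...]`, which Elasticsearch index patterns do not, and the function name is illustrative.

```python
from fnmatch import fnmatchcase

INDEX_PATTERNS = ("logs-*", "metrics-*", "events-2023-*")


def matching_patterns(index_name: str, patterns=INDEX_PATTERNS) -> list:
    """Return the configured patterns that match a concrete index name."""
    return [pattern for pattern in patterns if fnmatchcase(index_name, pattern)]


print(matching_patterns("events-2023-07"))  # matches "events-2023-*"
print(matching_patterns("events-2022-12"))  # matches nothing
```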

Query Settings

yaml
query:
  max_concurrent_searches: 10
  timeout: "30s"
  scroll_size: 1000
  max_result_window: 10000
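In Elasticsearch, `index.max_result_window` caps `from + size` for ordinary paginated searches; larger result sets need scroll or a similar mechanism. Assuming the `max_result_window` and `scroll_size` settings above mirror those concepts (an assumption, not confirmed by this page), the arithmetic looks like this:

```python
def fits_result_window(offset: int, size: int, max_result_window: int = 10000) -> bool:
    """Return True when a from/size page stays within the result window."""
    return offset + size <= max_result_window


def scroll_batches(total_hits: int, scroll_size: int = 1000) -> int:
    """Number of scroll requests needed to read total_hits documents."""
    return -(-total_hits // scroll_size)  # ceiling division


print(fits_result_window(9990, 10))  # last page that still fits
print(scroll_batches(2500))          # 1000 + 1000 + 500 documents
```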

Performance Optimization

yaml
connection_pool:
  max_idle_connections: 10
  max_connections_per_host: 20
  idle_timeout: "90s"
  
cache:
  enabled: true
  ttl: "5m"
  max_size: "100MB"
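The `ttl` setting implies that cached results expire after a fixed time. A minimal sketch of that behavior follows; it illustrates the general technique, not ElasticView's actual cache, and it deliberately ignores `max_size`. The class name and the injectable clock are assumptions made for testability.

```python
import time


class TTLCache:
    """Tiny time-to-live cache: entries expire ttl_seconds after being stored."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    def put(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop the stale entry
            return None
        return value


cache = TTLCache(ttl_seconds=300)  # corresponds to ttl: "5m"
cache.put("logs-* | match_all", {"hits": 42})
print(cache.get("logs-* | match_all"))
```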

Database Connections

MySQL Configuration

  1. Connection Setup

    yaml
    name: "MySQL Database"
    type: "mysql"
    host: "localhost"
    port: 3306
    database: "your_database"
    username: "db_user"
    password: "db_password"
  2. Connection Pool

    yaml
    pool:
      max_open_connections: 25
      max_idle_connections: 10
      connection_max_lifetime: "1h"
      connection_max_idle_time: "10m"
  3. SSL Configuration

    yaml
    ssl:
      mode: "require"  # disable, require, verify-ca, verify-full
      ca_cert: "/path/to/ca.pem"
      client_cert: "/path/to/client.pem"
      client_key: "/path/to/client-key.pem"

PostgreSQL Configuration

yaml
name: "PostgreSQL Database"
type: "postgresql"
host: "localhost"
port: 5432
database: "your_database"
username: "db_user"
password: "db_password"
ssl_mode: "require"
timezone: "UTC"

MongoDB Configuration

yaml
name: "MongoDB Database"
type: "mongodb"
uri: "mongodb://localhost:27017/your_database"
# or, instead of uri, set the fields individually:
host: "localhost"
port: 27017
database: "your_database"
username: "mongo_user"
password: "mongo_password"
auth_database: "admin"
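The two forms above carry the same information. As a sketch, the individual fields can be combined into a standard MongoDB connection string, with `auth_database` assumed to correspond to the `authSource` URI option. Credentials are percent-encoded because characters such as `@` and `/` are reserved in the URI. The function name is illustrative.

```python
from urllib.parse import quote_plus


def build_mongodb_uri(host, port, database, username=None, password=None, auth_database=None):
    """Combine discrete connection fields into a mongodb:// URI."""
    credentials = ""
    if username is not None:
        credentials = quote_plus(username)
        if password is not None:
            credentials += ":" + quote_plus(password)
        credentials += "@"
    uri = f"mongodb://{credentials}{host}:{port}/{database}"
    if auth_database:
        uri += f"?authSource={quote_plus(auth_database)}"
    return uri


print(build_mongodb_uri("localhost", 27017, "your_database",
                        "mongo_user", "mongo_password", "admin"))
```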

API Data Sources

REST API Configuration

  1. Basic Setup

    yaml
    name: "External API"
    type: "rest_api"
    base_url: "https://api.example.com"
    timeout: "30s"
  2. Authentication

    yaml
    # Configure exactly one auth block; the alternatives are shown commented out.
    # API Key
    auth:
      type: "api_key"
      key: "your-api-key"
      header: "X-API-Key"
    
    # Bearer Token
    # auth:
    #   type: "bearer"
    #   token: "your-bearer-token"
    
    # Basic Auth
    # auth:
    #   type: "basic"
    #   username: "api_user"
    #   password: "api_password"
  3. Headers and Parameters

    yaml
    headers:
      Content-Type: "application/json"
      Accept: "application/json"
      User-Agent: "ElasticView/1.0"
    
    default_params:
      format: "json"
      limit: 100
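The settings above can be read as instructions for building each outgoing request. The following sketch shows one plausible interpretation: an `auth` block becomes a header, and `default_params` are merged with per-request parameters, with the latter taking precedence. The function names, the override rule, and the `/v1/items` path are assumptions for illustration.

```python
import base64
from urllib.parse import urlencode


def build_auth_headers(auth: dict) -> dict:
    """Translate one auth block into HTTP request headers."""
    kind = auth["type"]
    if kind == "api_key":
        return {auth["header"]: auth["key"]}
    if kind == "bearer":
        return {"Authorization": f"Bearer {auth['token']}"}
    if kind == "basic":
        raw = f"{auth['username']}:{auth['password']}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    raise ValueError(f"unsupported auth type: {kind!r}")


def build_url(base_url: str, path: str, default_params: dict, params: dict = None) -> str:
    """Join base_url and path; per-request params override the defaults."""
    merged = {**default_params, **(params or {})}
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    query = urlencode(merged)
    return f"{url}?{query}" if query else url


print(build_url("https://api.example.com", "/v1/items",
                {"format": "json", "limit": 100}, {"limit": 10}))
```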

GraphQL Configuration

yaml
name: "GraphQL API"
type: "graphql"
endpoint: "https://api.example.com/graphql"
headers:
  Authorization: "Bearer your-token"
  Content-Type: "application/json"
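GraphQL requests are conventionally sent as an HTTP POST whose JSON body has a `query` field and an optional `variables` field. The sketch below builds such a request without sending it; the function name and the example query against a `node` field are hypothetical.

```python
import json
import urllib.request


def build_graphql_request(endpoint, query, variables=None, headers=None):
    """Build (but do not send) a POST request carrying a GraphQL query."""
    payload = {"query": query}
    if variables:
        payload["variables"] = variables
    body = json.dumps(payload).encode("utf-8")
    request_headers = {"Content-Type": "application/json", **(headers or {})}
    return urllib.request.Request(endpoint, data=body, headers=request_headers, method="POST")


request = build_graphql_request(
    "https://api.example.com/graphql",
    "query($id: ID!) { node(id: $id) { id } }",  # hypothetical schema
    variables={"id": "42"},
    headers={"Authorization": "Bearer your-token"},
)
print(request.get_method(), request.full_url)
```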

File Data Sources

CSV Files

yaml
name: "CSV Data"
type: "csv"
file_path: "/data/sample.csv"
# or load from a URL instead of file_path:
url: "https://example.com/data.csv"

options:
  delimiter: ","
  header: true
  encoding: "utf-8"
  skip_rows: 0
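To see what these options mean in practice, here is a rough equivalent using Python's `csv` module. It assumes `skip_rows` counts lines to discard before the header row and that `header: true` means the first remaining row holds column names; neither is confirmed by this page. The `encoding` option would apply when opening the file, for example `open(path, encoding="utf-8", newline="")`.

```python
import csv
import io


def read_csv_rows(stream, delimiter=",", header=True, skip_rows=0):
    """Read rows from a text stream, honoring delimiter, header, and skip_rows."""
    for _ in range(skip_rows):
        next(stream, None)  # discard leading lines such as export banners
    if header:
        return list(csv.DictReader(stream, delimiter=delimiter))
    return list(csv.reader(stream, delimiter=delimiter))


sample = io.StringIO("# exported 2023\nid;name\n1;alice\n2;bob\n")
print(read_csv_rows(sample, delimiter=";", header=True, skip_rows=1))
```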

JSON Files

yaml
name: "JSON Data"
type: "json"
file_path: "/data/sample.json"
options:
  root_path: "$.data"
  flatten: true
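The `root_path` and `flatten` options suggest selecting a sub-document and turning nested objects into flat columns. The sketch below shows one common interpretation: `root_path` is treated as a simple dotted path from the document root, and nested keys are joined with dots. The exact JSONPath subset and the naming of flattened keys in ElasticView are assumptions.

```python
def select_root(document, root_path):
    """Resolve a dotted path such as "$.data" (plain object keys only)."""
    node = document
    for key in root_path.lstrip("$").strip(".").split("."):
        if key:
            node = node[key]
    return node


def flatten(record, prefix=""):
    """Flatten nested dicts into a single level with dotted key names."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


document = {"data": [{"id": 1, "user": {"name": "a", "geo": {"city": "x"}}}]}
print([flatten(row) for row in select_root(document, "$.data")])
```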

Time Series Databases

InfluxDB Configuration

yaml
name: "InfluxDB"
type: "influxdb"
url: "http://localhost:8086"
database: "your_database"
username: "influx_user"
password: "influx_password"
retention_policy: "autogen"

Prometheus Configuration

yaml
name: "Prometheus"
type: "prometheus"
url: "http://localhost:9090"
timeout: "30s"
headers:
  Authorization: "Bearer your-token"

Data Source Management

Testing Connections

  1. Connection Test

    • Use the "Test Connection" button
    • Verify connectivity and authentication
    • Check data accessibility
  2. Query Testing

    sql
    -- Test SQL query
    SELECT COUNT(*) FROM your_table;

    Test Elasticsearch query:

    json
    {
      "query": {
        "match_all": {}
      },
      "size": 1
    }

Health Monitoring

Monitor data source health:

yaml
health_check:
  enabled: true
  interval: "5m"
  timeout: "10s"
  retry_count: 3
  
alerts:
  - type: "connection_failed"
    notification: "email"
  - type: "slow_response"
    threshold: "30s"
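The `timeout` and `retry_count` settings describe a probe that is retried before a source is reported as unhealthy. A minimal sketch of that loop follows, assuming `retry_count` is the total number of attempts (it could equally mean retries after the first attempt) and using a plain TCP connect as the probe. Both function names are illustrative.

```python
import socket
import time


def tcp_probe(host: str, port: int, timeout: float) -> bool:
    """Return True if a TCP connection can be opened within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_health(probe, retry_count: int = 3, delay_seconds: float = 0.0) -> bool:
    """Return True on the first successful attempt, False after retry_count failures."""
    for attempt in range(retry_count):
        if probe():
            return True
        if attempt < retry_count - 1:
            time.sleep(delay_seconds)  # pause between attempts
    return False
```

A call such as `check_health(lambda: tcp_probe("localhost", 9200, timeout=10.0))` would then correspond to one scheduled check.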

Performance Monitoring

Track data source performance:

  • Response Time: Query execution time
  • Throughput: Queries per second
  • Error Rate: Failed query percentage
  • Connection Pool: Active/idle connections
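The first three metrics can be derived from a log of individual query executions. The sketch below computes them over a time window; the record shape and function name are assumptions, not ElasticView's internal model.

```python
from dataclasses import dataclass


@dataclass
class QueryRecord:
    duration_seconds: float
    succeeded: bool


def summarize(records, window_seconds: float) -> dict:
    """Compute average response time, throughput, and error rate for a window."""
    total = len(records)
    if total == 0:
        return {"avg_response_time": 0.0, "throughput": 0.0, "error_rate": 0.0}
    failures = sum(1 for record in records if not record.succeeded)
    return {
        "avg_response_time": sum(r.duration_seconds for r in records) / total,  # seconds
        "throughput": total / window_seconds,                                   # queries per second
        "error_rate": 100.0 * failures / total,                                 # percent
    }


records = [QueryRecord(0.5, True), QueryRecord(1.5, True),
           QueryRecord(1.0, False), QueryRecord(1.0, True)]
print(summarize(records, window_seconds=2.0))
```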

Security Best Practices

Credential Management

  1. Environment Variables

    bash
    export DB_PASSWORD="secure-password"
    export API_KEY="your-api-key"
  2. Encrypted Storage

    • Use built-in credential encryption
    • Rotate credentials regularly
    • Limit credential access
  3. Connection Security

    • Always use SSL/TLS in production
    • Verify certificates
    • Use strong authentication methods
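When credentials are supplied through environment variables, it helps to fail early if one is missing rather than attempt a connection with an empty password. A small sketch follows; the helper name is illustrative, and how ElasticView itself reads these variables is not covered here.

```python
import os


def require_env(name: str, environ=os.environ) -> str:
    """Return the value of an environment variable or fail with a clear error."""
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"environment variable {name} is not set")
    return value


# Example with an explicit mapping instead of the real process environment.
print(require_env("DB_PASSWORD", {"DB_PASSWORD": "secure-password"}))
```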

Network Security

yaml
security:
  allowed_hosts:
    - "10.0.0.0/8"
    - "192.168.0.0/16"
  blocked_hosts:
    - "169.254.0.0/16"
  
  firewall:
    enabled: true
    rules:
      - allow: "443/tcp"
      - allow: "9200/tcp"
      - deny: "all"
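The `allowed_hosts` and `blocked_hosts` ranges are CIDR networks. The sketch below shows how an address could be checked against them, assuming that a block entry wins over an allow entry and that addresses in neither list are rejected. Those precedence rules are assumptions.

```python
import ipaddress

ALLOWED = [ipaddress.ip_network(n) for n in ("10.0.0.0/8", "192.168.0.0/16")]
BLOCKED = [ipaddress.ip_network("169.254.0.0/16")]


def is_host_allowed(address: str) -> bool:
    """Return True if address is in an allowed network and not in a blocked one."""
    ip = ipaddress.ip_address(address)
    if any(ip in network for network in BLOCKED):
        return False
    return any(ip in network for network in ALLOWED)


for candidate in ("10.1.2.3", "169.254.169.254", "8.8.8.8"):
    print(candidate, is_host_allowed(candidate))
```

Blocking `169.254.0.0/16` covers the link-local range, which includes the metadata endpoint used by several cloud providers.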

Access Control

yaml
permissions:
  read_only: true
  allowed_operations:
    - "select"
    - "search"
  blocked_operations:
    - "insert"
    - "update"
    - "delete"
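One way to read these lists is as a gate applied to each operation name before a query runs. The sketch below assumes blocked entries take precedence and that operations in neither list are denied. It is an illustration only; database-side read-only privileges remain the more reliable control.

```python
ALLOWED_OPERATIONS = frozenset({"select", "search"})
BLOCKED_OPERATIONS = frozenset({"insert", "update", "delete"})


def is_operation_allowed(operation: str) -> bool:
    """Return True only for operations that are allowed and not blocked."""
    op = operation.strip().lower()
    if op in BLOCKED_OPERATIONS:
        return False
    return op in ALLOWED_OPERATIONS


for name in ("select", "SEARCH", "delete", "drop"):
    print(name, is_operation_allowed(name))
```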

Troubleshooting

Common Connection Issues

Connection Timeout

yaml
# Increase timeout values
timeout: "60s"
connection_timeout: "30s"
read_timeout: "45s"
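Timeouts on this page are written as a number followed by a unit suffix (`s`, `m`, or `h`). When comparing or validating such values in scripts, a small parser is handy. This sketch supports only the suffixes that appear in this document; the full set accepted by ElasticView is not specified here.

```python
import re

_UNIT_SECONDS = {"s": 1.0, "m": 60.0, "h": 3600.0}
_DURATION = re.compile(r"(\d+)(s|m|h)")


def parse_duration(text: str) -> float:
    """Convert a duration such as "30s", "5m", or "1h" to seconds."""
    match = _DURATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]


# A per-phase timeout should not exceed the overall timeout.
print(parse_duration("45s") <= parse_duration("60s"))
```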

SSL Certificate Issues

yaml
ssl:
  verify_certificates: false  # For testing only; keep verification enabled in production
  ca_cert_path: "/path/to/correct/ca.pem"

Authentication Failures

  • Verify credentials
  • Check user permissions
  • Review authentication method
  • Test with minimal permissions

Performance Issues

Slow Queries

  • Add appropriate indexes
  • Optimize query structure
  • Use query caching
  • Limit result size

Connection Pool Exhaustion

yaml
pool:
  max_open_connections: 50
  max_idle_connections: 20
  connection_max_lifetime: "30m"

Monitoring and Logs

  1. Enable Debug Logging

    yaml
    logging:
      level: "debug"
      include_queries: true
  2. Query Monitoring

    • Track slow queries
    • Monitor error rates
    • Set up alerts
  3. Health Checks

    • Regular connectivity tests
    • Performance benchmarks
    • Automated failover

Best Practices

Configuration Management

  • Use configuration templates
  • Version control configurations
  • Test in development first
  • Document connection details

Security

  • Use least privilege access
  • Rotate credentials regularly
  • Monitor access logs
  • Encrypt sensitive data

Performance

  • Use connection pooling
  • Optimize queries
  • Cache where appropriate
  • Perform regular maintenance

Reliability

  • Monitor data source health
  • Configure failover
  • Keep backup connections available
  • Handle errors gracefully

Next Steps