Writing data to and querying from InfluxDB using Python

Overview:

  • InfluxDB is a high-performance time series database, capable of storing and retrieving data points at high volume, on the order of a few million points per second. Time series data can be anything such as CPU measurements, log entries from various servers, sensor data, stock market data and so on.
  • InfluxDB is developed in the Go language and ships as a single binary.
  • InfluxDB provides an SQL-like interface for querying data.
  • Automatic compression of data, along with downsampling, helps minimize storage space.
  • Old data in the time series database is expired through data retention policies, and continuous queries can downsample it into coarser series before it expires.

Schema of InfluxDB:

  • InfluxDB stores data as Points.
  • Every point is written or composed for storage in the database using the Line Protocol.
  • In the Line Protocol of InfluxDB, every line (record) is a point in the time series. A point is the equivalent of a row in an RDBMS.
  • Every line starts with the name of the measurement, which is the equivalent of a table name in RDBMS parlance.
  • The measurement name is followed by the tags, separated by commas, and then, after a single space, by the fields. Tags are indexed and fields are not.
  • Every measurement is stored in a database. Databases can be created from the InfluxDB console.
  • The schema is very flexible. Fields and tags are defined in the data itself, each given by its name and value. This means some points can have many fields/tags and others very few. As the data grows, new fields can be added at any point/record/row in the time series.
  • An example showing two data points of time series data in the line protocol. Spaces inside tag values are escaped with a backslash, and a single space separates the tags from the fields:

UserLogins,Area=North\ America,Location=New\ York\ City,ClientIP=192.168.0.256 SessionDuration=1.2

UserLogins,Area=South\ America,Location=Lima,ClientIP=192.168.1.256 SessionDuration=2.0
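
The line layout described above can be sketched in plain Python. This is only an illustration, assuming the InfluxDB 1.x escaping rules; the helper names escape_key, format_field_value and to_line are hypothetical and are not part of any InfluxDB library. Tags are sorted by key here simply to make the output deterministic.

```python
def escape_key(value):
    """Escape commas, equals signs and spaces (tag keys, tag values, field keys)."""
    return str(value).replace(",", r"\,").replace("=", r"\=").replace(" ", r"\ ")


def format_field_value(value):
    """Render a field value: booleans, integers (i suffix), floats, quoted strings."""
    if isinstance(value, bool):  # test bool first, since bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line(measurement, tags, fields, timestamp=None):
    """Compose one point as: measurement,tag_set field_set [timestamp]."""
    name = measurement.replace(",", r"\,").replace(" ", r"\ ")
    tag_part = "".join(
        f",{escape_key(k)}={escape_key(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f"{escape_key(k)}={format_field_value(v)}" for k, v in fields.items()
    )
    line = f"{name}{tag_part} {field_part}"
    if timestamp is not None:
        line += f" {timestamp}"
    return line


line = to_line(
    "UserLogins",
    {"Area": "North America", "Location": "New York City", "ClientIP": "192.168.0.256"},
    {"SessionDuration": 1.2},
)
print(line)
```

In practice the Python client composes these lines itself, so a helper like this is useful mainly for understanding or debugging what is sent to the server.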

 

 

Accessing InfluxDB using Python:

  • InfluxDB can be accessed from a Python program through the Python client library influxdb-python, which targets InfluxDB 1.x: data can be written and retrieved.
  • The Python client library influxdb-python can be installed using the command

        $ pip install influxdb

  • The example Python program below creates a connection to the InfluxDB server using the following information:
    • IP address/host name of the Time series database server
    • Port number
    • Access credentials of the user
    • Name of the database
  • The example uses the write_points and query methods of the InfluxDBClient object for writing and reading time series data.
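
The points handed to write_points are plain dictionaries with "measurement", "tags" and "fields" keys. A minimal sketch of building such a dictionary follows; the helper name make_login_point is hypothetical, and the optional "time" key with an ISO 8601 string is an assumption about how an explicit timestamp would be supplied instead of letting the server assign one.

```python
from datetime import datetime, timezone


def make_login_point(area, location, client_ip, session_duration, when=None):
    """Hypothetical helper: build one point dictionary for the UserLogins measurement."""
    when = when or datetime.now(timezone.utc)
    return {
        "measurement": "UserLogins",
        # Tags: indexed metadata describing where the login came from
        "tags": {"Area": area, "Location": location, "ClientIP": client_ip},
        # Fields: the measured value, stored as a float
        "fields": {"SessionDuration": float(session_duration)},
        # Explicit timestamp as an ISO 8601 string (assumption, see lead-in)
        "time": when.isoformat(),
    }


point = make_login_point(
    "North America", "New York City", "192.168.0.256", 1.2,
    when=datetime(2018, 11, 12, 20, 36, 39, tzinfo=timezone.utc),
)
print(point)
```

A list of such dictionaries has the same shape as the loginEvents list used in the example.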

      

Example:

from influxdb import InfluxDBClient

loginEvents = [
    {
        "measurement": "UserLogins",
        "tags": {
            "Area": "North America",
            "Location": "New York City",
            "ClientIP": "192.168.0.256"
        },
        "fields": {
            "SessionDuration": 1.2
        }
    },
    {
        "measurement": "UserLogins",
        "tags": {
            "Area": "South America",
            "Location": "Lima",
            "ClientIP": "192.168.1.256"
        },
        "fields": {
            "SessionDuration": 2.0
        }
    }
]

# Connect to the InfluxDB server: host, port, user name, password, database name
dbClient = InfluxDBClient('localhost', 8086, 'root', 'root', 'AccessHistory')

# Create the database that will hold the user login details
dbClient.create_database('AccessHistory')

# Write the time series data points into the database
dbClient.write_points(loginEvents)

# Query the login records, including the IPs from which logins have been made
loginRecords = dbClient.query('select * from UserLogins;')

# Print the time series query results
print(loginRecords)

 

Output:

ResultSet({'('UserLogins', None)': [{'time': '2018-11-12T20:36:39.378904Z', 'Area': 'North America', 'ClientIP': '192.168.0.256', 'Location': 'New York City', 'Serial': None, 'SessionDuration': 1.2}, {'time': '2018-11-12T20:36:39.378904Z', 'Area': 'South America', 'ClientIP': '192.168.1.256', 'Location': 'Lima', 'Serial': None, 'SessionDuration': 2}]})
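
Once the rows are available as dictionaries, they can be processed like any other Python data. The sketch below works on rows copied from the output above so that it runs without a server; with the real client the rows would typically be obtained by iterating over loginRecords.get_points(). The function name total_duration_by_area is hypothetical.

```python
from collections import defaultdict

# Rows shaped like the ones in the query output; copied here so the
# sketch runs without an InfluxDB server.
rows = [
    {"time": "2018-11-12T20:36:39.378904Z", "Area": "North America",
     "ClientIP": "192.168.0.256", "Location": "New York City",
     "SessionDuration": 1.2},
    {"time": "2018-11-12T20:36:39.378904Z", "Area": "South America",
     "ClientIP": "192.168.1.256", "Location": "Lima",
     "SessionDuration": 2},
]


def total_duration_by_area(rows):
    """Sum the SessionDuration field for each value of the Area tag."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["Area"]] += row["SessionDuration"]
    return dict(totals)


totals = total_duration_by_area(rows)
print(totals)
```

For large data sets, an aggregation like this is usually better expressed in the query itself, so that only the summarized rows travel to the client.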

 

 


Copyright 2024 © pythontic.com