Pooling of database connections using Python, pymysql and dbutils

Overview of Database Connection Pooling:

  • A database connection to a remote database server like a MySQL Server is a TCP/IP connection established by contacting the database server on a specific port.
  • For each database connection established from a client, a database server such as MySQL Server allocates a set of buffers to cater to the client needs. Based on design - the allocation, management and de-allocation of these buffers could get as complicated as it could on the server side.
  • When a database connection is closed, all of the allocated buffers are cleared in a graceful manner and the TCP sockets are closed as well.
  • The overhead involved equals to the number of database connections made to the server multiplied by the time and resources allocated per connection.
  • For database applications running in various layers of the deployment of distributed software architecture, it is often necessary to deal with several database connections.
  • However, opening these database connections every time afresh on need basis will result in poor performance and will affect the end use cases of applications.
  • To address the above issue(s), database connections are often created in a batch, cached, used, marked for reuse and reused, which is together known as pooling of database connections.

 

Database Connection pooling with PyMySQL and DBUtils:

  • The PyMySQL is a client library for MySQL Server, which is compliant to Python Database API specification version 2.0.
  • PyMySQL is developed fully using the Python Programming Language.
  • The DBUtils is a Python library, which provides a database-connection-pooling framework for multithreaded Python environments.
  • DBUtils provides the PooledDB class, which represents a pool of database connections to a database server like MySQL.
  • The constructor of DBUtils.PooledDB takes a Python method as parameter, which returns a Python DB-API specification 2.0 compliant connection object or a python database module, compliant to the Python DB-API specification 2.0.
  • Calling connection() method on a PooledDB instance will return a connection(a PooledSharedDBConnection instance)from the connection pool.
  • Calling close() on the PooledSharedDBConnection instance will return the database connection for reuse.
  • The parameter maxconnections of the PooledDB constructor specifies the maximum number of connections that can be created for a database connection pool. If the blocking parameter is True, the requests for a connection to the pool will block till a connection is released.
  • If False, no blocking will happen upon exceeding the maximum number of connections available in the connection pool which could lead to the exception DBUtils.PooledDB.TooManyConnections.

 

Example:

from threading import Thread

from datetime import datetime

import time

import pymysql

from DBUtils.PooledDB import PooledDB

 

# Definition of CrawlInfo

class CrawlInfo:

    def __init__(self, url_in, header_in, body_in, lastVisit_in):

        self.url             = url_in

        self.header          = header_in

        self.body            = body_in

        self.lastVisit      = lastVisit_in

 

# The thread function for MySQL Insert

# This thread will insert records into MySQL table using the connections

# from connection pool

def MySQLInsertThread(threadNum_in, dbConnection_in, crawlInfo_in):

    try:

        print("Thread{} started".format(threadNum_in))

       

        url         = crawlInfo_in.url

        header      = crawlInfo_in.header

        body        = crawlInfo_in.body

        lastVisit   = crawlInfo_in.lastVisit     

       

        # SQL statement to insert into a MySQL database table

        sqlInsert   = "INSERT INTO crawled values ('{}','{}','{}','{}')".format(url, header, body, lastVisit)

 

        # Obtain a cursor object

        mySQLCursor = dbConnection_in.cursor()     

   

        # Execute the SQL stament

        mySQLCursor.execute(sqlInsert)

       

        # Close the cursor and connection objects

        mySQLCursor.close()

        dbConnection_in.close()       

       

        print("Thread{} exiting".format(threadNum_in))

   

    except Exception as e:

        print("Exception: %s" % e)

    return

 

# Information required to create a connection object

dbServerIP              = "127.0.0.1"       # IP address of the MySQL database server

dbUserName              = "root"            # User name of the MySQL database server

dbUserPassword          = ""                # Password for the MySQL database user

databaseToUse           = "TestDatabase"    # Name of the MySQL database to be used

charSet                 = "utf8mb4"         # Character set

cusrorType              = pymysql.cursors.DictCursor

Crawl_Info_Count        = 10

 

mySQLConnectionPool = PooledDB(creator   = pymysql,     # Python function returning a connection or a Python module, both based on DB-API 2

                               host      = dbServerIP,

                               user      = dbUserName,

                               password  = dbUserPassword,

                               database  = databaseToUse,

                               autocommit    = True,

                               charset       = charSet,

                               cursorclass   = cusrorType,

                               blocking      = False,

                               maxconnections = Crawl_Info_Count)

   

   

threadCollection    = []

crawlInfoCollection = []

 

# Create data to insert data into MySQL table

for i in range(Crawl_Info_Count):

    url         = "url"+str(i)

    header      = "header"+str(i)

    body        = "body"+str(i)

    lastVisit   = time.strftime('%Y-%m-%d %H:%M:%S')

   

    crawlInfoCollection.append(CrawlInfo(url, header, body, lastVisit))

 

# Create threads to insert data into MySQL table

   

 

for i in range(Crawl_Info_Count):

    mySQLConnection = mySQLConnectionPool.connection()

                             

    t = Thread(target= MySQLInsertThread, args=(i, mySQLConnection,crawlInfoCollection[i]))

    threadCollection.append(t)

   

    # Start the db insert thread

    t.start()

    print("Number of database connections in use:{}".format(mySQLConnectionPool._connections))

   

 

# Proceed once the other threads are complete

for thread in threadCollection:

    thread.join()

 

 

Output:

Thread0 started

Number of database connections in use:1

Thread1 started

Number of database connections in use:2

Thread2 started

Number of database connections in use:3

Thread3 started

Number of database connections in use:4

Thread4 started

Number of database connections in use:5

Thread5 started

Number of database connections in use:6

Thread6 started

Number of database connections in use:7

Thread7 started

Number of database connections in use:8

Thread8 started

Number of database connections in use:9

Thread9 started

Number of database connections in use:10

Thread0 exiting

Thread1 exiting

Thread8 exiting

Thread5 exiting

Thread9 exiting

Thread4 exiting

Thread3 exiting

Thread6 exiting

Thread7 exiting

Thread2 exiting


Copyright 2024 © pythontic.com