If you are using MySQLdb, the drill is simple: instantiate your cursor as an SSCursor, and boom, it serves as a server-side streaming cursor. The following snippet is a demonstration.
import MySQLdb
from MySQLdb.cursors import SSCursor

conn = MySQLdb.connect(host=host, user=user, passwd=password, db=db)
cursor = SSCursor(conn)
query = "SELECT * FROM big_table;"
cursor.execute(query)  # rowcount won't work here
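Once the query is running, you consume the result incrementally, either by iterating the cursor or by calling fetchmany(), so only a handful of rows live in memory at any time. Here is a minimal sketch of that pattern; it uses the stdlib sqlite3 module as a stand-in so it runs anywhere without a MySQL server, and big_table with its contents is made up for the demonstration. With MySQLdb you would iterate the SSCursor in exactly the same way.

```python
import sqlite3

# Stand-in database: sqlite3 ships with Python, so this sketch is runnable.
# With MySQLdb you would iterate the SSCursor the same way.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_table (id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO big_table VALUES (?, ?)",
    [(i, "row-%d" % i) for i in range(100)],
)

cursor = conn.cursor()
cursor.execute("SELECT * FROM big_table")

# Iterating the cursor pulls rows one at a time instead of loading
# the whole result set into memory at once.
total = 0
for row in cursor:
    total += 1  # replace with your real per-row processing

print(total)
```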
For projects using SQLAlchemy, you may have tried conn.execution_options(stream_results=True), to no avail: it still consumes a lot of memory, because the flag is not honored by the MySQLdb driver.
Cheers, Love! The Cavalry's Here! There is another solution: since SQLAlchemy can run on top of MySQLdb, it actually provides a way to inject the SSCursor through connect_args. Here is an example.
import MySQLdb.cursors
import sqlalchemy

CHUNK_SIZE = 10000
url = "mysql://%s:%s@%s:%d/%s" % (user, password, host, port, db)
query = "SELECT * FROM big_table;"

engine = sqlalchemy.create_engine(
    url,
    encoding="utf-8",
    connect_args={"cursorclass": MySQLdb.cursors.SSCursor},
)
cursor = engine.execute(query)
rows = cursor.fetchmany(CHUNK_SIZE)
while rows:
    # do whatever it is you do to the data
    rows = cursor.fetchmany(CHUNK_SIZE)

Of course, you will want to read the streaming result in chunks, so that an overwhelming number of rows does not become a network-transfer bottleneck.
One more thing to mention: while the streaming feature is enabled, rowcount doesn't work. Since the results stay on the server side, the client cannot know how many rows there are until it has read them all.
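This limitation is easy to observe even with the stdlib sqlite3 driver, which likewise cannot report a row count for a SELECT up front; per the DB-API, rowcount comes back as -1 when the size of the result is unknown. The portable workaround is simply to count rows yourself as you stream through them. The table and data below are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_table (id INTEGER)")
conn.executemany("INSERT INTO big_table VALUES (?)", [(i,) for i in range(42)])

cursor = conn.cursor()
cursor.execute("SELECT * FROM big_table")

# For a streamed SELECT the driver cannot know the result size up front,
# so rowcount is -1 here, just as with an SSCursor.
print(cursor.rowcount)  # -1

# Workaround: count the rows yourself while streaming through them.
count = sum(1 for _ in cursor)
print(count)  # 42
```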