Skip to content
Generic filters
Exact matches only

Auto-Reflecting Tables and Columns in SQLAlchemy

Or How a Few Lines of Code Makes me Want to do Cartwheels Down the Street

Eric North
Courtesy of David Clode on Upslash

Groan, not another SQLAlchemy article!

Trust me, this one is different…

After a lot of soul-searching and digging around the SQAlchemy docs, I’ve made an important discovery: You don’t need to use class-based attributes in order to leverage the awesome power of session-based queries.

Code in this article is licenced under the Apache 2.0 software licence. The following is a portion of the Python code to be discussed:

Simple, right?

Well, kind of: allow me to explain.

I have authored a couple of Python scripts that use Pandas and SQLite. For those who have not yet had the pleasure of working with Pandas it is an amazing package with tons of applications in Data Science and many other fields. I marvel at the simplicity at which one can specify various keyword arguments to various Pandas functions in order to achieve amazing results. My scripts use SQLite databases to store incoming, ‘mid-stream’ and result datasets. While SQLite is great for prototyping I am now at a point where I need more.

Like authentication.

I could port my existing code to be able to point to a different database, but what I really want is something dialect-agnostic. Enter SQLAlchemy and its powerful Object Relational Mapper (ORM). It allows developers to interact with their databases as Python-friendly classes. Very cool!

One of SQLAlchemy’s huge benefits is session-based queries which employ Python object-oriented syntax rather than SQL statements. For those who have worked with two or more database technologies, I feel somewhat sheepish in spelling out why this is beneficial. For those who have not had the pleasure, it is like hiring a translator when speaking with each of the outgoing, highly-conversational people on an around-the-world Zoom call.

SQLAlchemy’s ORM is your translator. Having ‘object’ in the title means it is keen on… objects. There are a couple of packages such as Elixir and SQLSoup that extend SQLAlchemy; they too use an object-oriented approach. Which is fine if your code uses (drum roll please…) objects.

One slight problem is that some *ahem* most of my code uses JavaScript Object Notation (JSON) to configure parameters at run-time. I have a hockey-sock full of riveting tidbits about JSON: these are the subject for another article. Suffice to say that many of my Python scripts use string-based parameters obtained from JSON for their configuration.

After mucking around in the SQLAlchemy documentation combined with some inspection via the Eclipse debugger, I cobbled together a small but hefty code snippet that enables me to refer to each and every one of my tables and columns within a given database by name, vice having to specify them as objects and attributes.

When I first saw an error-free console after cobbling this code together I wanted to do cartwheels down the street!

Some may be asking ‘so what, why should I care’? Perhaps a more fullsome example is appropriate. Before diving into the code below, if you are wondering how to configure a database using SQLAlchemy I recommend you check out the following helpful articles: SQLAlchemy — Python Tutorial by Vinay Kudari and How to use Python SQLite3 using SQLAlchemy by Mahmud Ahsan.

Once you’ve got your Python environment configured along with a test database, go ahead and load the following snippet into your IDE of choice:

With a SQLAlchemy engine configured you can supply it to the get_data function along with the name of a table and column from your test database. Say our test database contains a table called ‘ThisIsATable’ and in it a column called ‘ThisIsAColumn’. Within ‘ThisIsAColumn’ we have several text-based rows, at least one of which contains ‘Please try to find me’.

If all goes well you should have a Pandas DataFrame containing a sub-set of the table from your test database. Thanks to a few lines of code we are now free to write our database-agnostic queries. With these queries, we can obtain data for Pandas to chew on, thanks to simple string-based parameters.

From an ‘easy to maintain’ perspective, the code leverages public functions and attributes. Calling on private attributes is generally ill-advised since these could change in a future release; I tend to shy away from private attributes whenever possible. A point worth mentioning is that the Session in the above code is invoked within the function get_data. This is not the recommended practice. Please see this article by A Gordon for further information on creating and using Sessions. The SQLAlchemy docs also contain information on creating and using Sessions, specifically the section entitled ‘When do I construct a Session, when do I commit it, and when do I close it?’.

Armed with the above code, you and your proof-of-concept Python scripts are now able to autoload and reflect database metadata by specifying only a bare minimimum of information via handy-dandy strings.

Strings! Such wonderful things.

Say it with me: Strings are wonderful things!

In my opinion, this represents the dawn of advanced querying via the database-agnostic goodness of SQLAlchemy, together with the hugely powerful Pandas library combined with string-based keyword arguments! I think another snake image is appropriate: it sums up the relationship amongst SQLAlchemy, Pandas and text-based interaction. Together, they form a trifecta that is primed for the limitless expanse of Natural Language Processing along with many other use cases.

Snake image courtesy of Laura Barry on Upslash

Happy Pythoning and stay safe out there!