Problem, of course
There was a specific problem I had to solve, namely, on how to make Ubuntu based Django project and remote MS SQL installation talk to each other. Obviously, that wasn’t really a problem, the problematic part was accessing legacy database with tables under a non-standard schema.
Public knowledge, good or bad?
PostgreSQL uses "public"
as default schema, MS SQL uses [database].[dbo]
But what do you do when tables are stored under an arbitrary schema?
A lot of articles and tutorials say that you can use models.Model
meta data to point Django into right direction but what I found out is that:
- SO suggests that
db_tablespace
applies only for Oracle - there is a fork of PostgreSQL drivers that supports
db_schema
- almost everything else outside regular Django development is a custom solution I wouldn’t want to touch for a long run projects. Not yet, at least.
How to make it work
Install everything necessary to be able to access the database.
ODBC driver
First, choose a correct driver. FreeTDS in my case worked just fine.
apt-get install freetds-common freetds-bin tdsodbc unixodbc apt-get build-dep pyodbc
Python ODBC connector
I also installed pyodbc directly from source, version 3.0.7. No idea why but pip installer didn’t find it, so I used
pip install https://pyodbc.googlecode.com/files/pyodbc-3.0.7.zip
Django driver
Next we need a django driver, that can talk to Django 1.6. I chose django-pyodbc-azure version 1.1.5, make sure the version is correct. Pip install will do. Don’t use the code on GoogleCode site, it’s not maintained anymore.
Final picture, for starters
/etc/odbc.ini [myDS] Driver = FreeTDS Description = ODBC connection for my base Server = db_host_name Database = db_name Port = 1433 /etc/odbcinst.ini [FreeTDS] Description = TDS driver (Sybase/MS SQL) Driver = /usr/lib/x86_64-linux-gnu/odbc/libtdsodbc.so Setup = /usr/lib/x86_64-linux-gnu/odbc/libtdsS.so /etc/freetds/freetds.conf [driver_name] host = db_host_name port = 1433 tds version = 8 client charset = UTF-8 /django_app/settings.py DATABASES = { 'default': { 'ENGINE': 'sql_server.pyodbc', 'NAME': 'db_name', 'USER': 'db_user', 'PASSWORD': 'db_user_password', 'HOST': 'db_host_IP_address', 'PORT': '1433', 'OPTIONS': { 'host_is_server': True, } }, 'custom_alias': { 'ENGINE': 'sql_server.pyodbc', 'NAME': 'db_name', 'USER': 'db_user', 'PASSWORD': 'db_user_name', 'HOST': 'db_host_IP_address', 'PORT': '1433', 'OPTIONS': { 'host_is_server': True, 'dsn': 'myDS', 'driver' : 'FreeTDS', 'extra_params' : 'TDS_VERSION=8.0' } } } DATABASE_ROUTERS = ['package.router']
Summary, up to here. Plenty to go, still.
What I learned in the process up to here is the following:
- whenever possible, enter the IP address instead of host name
- if you’re working on the 64 bit OS make sure to adjust filepaths of the files mentioned in the tutorials
- never name datasources, drivers and connections the same name. just don’t, as a general rule.
- when you have multiple tutorials on how to do a certain thing, spanned through a couple of years period… well, rely on latest, but read them all just to be informed. what for, you ask? troubleshooting.
Model routing
In the end (hah, not over yet!), I had a project with a single app, that used two sets of models: mine and those of legacy database. First of all, I had to tell Django on how to separate which models go where.
Database routers. Read this, it’s pretty simple. Also, try to understand it. There’s sample code, pretty straight forward. Create a class and include it in the project settings with DATABASE_ROUTERS = [your.package]
Of course I was trying to make it simple, named one group of models with a Prefix and the other group with OtherPrefix, then tried to relate to model._meta.verbose_name.lower()
without checking and had trouble, because verbose_name
splits prefixes with a space preceeding a capital letter, ergo I got: "other prefix"
instead of expected "otherprefix"
. #selffacepalm
Well, read the documentation and make sure with tests everthing works as expected! Do it at home, kids, especially if you’re not professionals. Debug your model router, just to make sure it’s correctly declared and used.
Some documentation said that I can omit the default
database alias (dont’ delete it, just leave it empty) and have two of my own, but as soon as I did that, south
complained. Well, it wasn’t that necessary and I stopped with default
and a custom one.
Last step… schemas
Once I got it all going, there was still a problem of the schema naming. How do I put that information into… well, whatever piece of code that wants to recognize it?!
My models lived in standard [dbo]
schema, which was ok. But the legacy database?
I found two solutions, both ugly as… let’s leave that out, but please, if there’s other options, possibly from within Django, let me know. Use this until there’s proper schema naming solution in Django.
Don’t even bother saying “It’s only temporary.”
“Temporary solutions stay in code the longest”. Every senior dev knows it. Ask them and ignore the blush.
Private aliases
Create aliases in your [dba]
schema that point to tables in other schemas. It works but I don’t like it. It’s a solution for a flaw in your framework, that’s why I don’t want it to live outside of it.
String hack
Well, this part is not really a hack yet, it was a necessity. Set all the models’ meta class property db_table
to respective table names in the legacy database, for example
db_table = 'Table'
Following the models’ dataset queries and translation within django-pyodbc module I figured out all the queries are translated to ODBC flavor of queries, and that my legacy models end up with [database].[table]
notation.
And now the hack part.
Let’s do simple string injection, shall we? Suppose our target schema name is [tgt]
. All you have to do is rename all your table names with a simple prefix and end up with this:
db_table = 'tgt].[Table'
And, to be a bit flexible, define the schema name as a constant in the setting, if the schemas differ between staging and production environments.
Ta-da.wav!!!
Thanks! The part about schema was really helpfull.
Glad it’s not just me with these problems. I become paranoid. 😉
Great post. Quick question: do you know if any large companies use Django with MSSQL, in a large-scale environment? I’d like to get some feedback on how it works in a large environment and any problems areas that they’ve encountered.
Hi Mitch! Sorry for a late answer, but the system I set up is a production system with a couple of servers producing somewhat 400k records daily and runs queries/analytic stuff over it in the backend for the past year without any issues.
Thanks for the reply. Glad to hear that it can handle that kind of load.