In this notebook, we will show an example of connecting to a SQL database.
For simplicity, we use a local [SQLite](https://www.sqlite.org/) DB for this demo, but connecting to PostgreSQL or MySQL is equally simple.
We use the familiar [pandas](http://pandas.pydata.org/) to access the DB. Another user-friendly Python tool for talking to databases is [records](https://github.com/kennethreitz/records), which is built on top of the powerful but rather complex [SQLAlchemy](http://www.sqlalchemy.org/).
%% Cell type:code id: tags:
``` python
# import the SQLite driver and pandas
import sqlite3
import pandas as pd
```
%% Cell type:markdown id: tags:
## Connect to the DB
How you establish a connection differs slightly between database systems. For SQLite, we create a connection object from the path to the local database file and hand that object to pandas.
For PostgreSQL and other database systems, you usually need a connection string that provides hostname, port, user and password.
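As a rough sketch of what such a connection string can look like, the helper below assembles a URL of the form `dialect://user:password@host:port/database`, which is the shape SQLAlchemy accepts. The function name `build_connection_url` and all credentials are hypothetical and only serve as illustration; the password is percent-encoded so that special characters do not break the URL.

```python
from urllib.parse import quote_plus


def build_connection_url(user, password, host, port, database, dialect="postgresql"):
    """Assemble a database URL (hypothetical helper, not part of pandas)."""
    # Percent-encode user and password so characters like '@' or '/' are safe
    return f"{dialect}://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{database}"


# Placeholder credentials for illustration only
url = build_connection_url("demo_user", "p@ss/word", "localhost", 5432, "chinook")
print(url)
```

With SQLAlchemy and a PostgreSQL driver installed, a URL like this could be passed to `sqlalchemy.create_engine`, and the resulting engine used in place of the SQLite connection object.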
%% Cell type:code id: tags:
``` python
# with sqlite, connecting to a DB is done by providing the path to the local DB file
connection = sqlite3.connect('data/chinook.sqlite')
```
%% Cell type:markdown id: tags:
## Define SQL query
You can either get a complete table from the database or provide a query to specify what data you are interested in. Most of the time, a query combining data from multiple tables is what is needed. The resulting SQL query can become complex quickly.
If you can, use a graphical query builder to make a query template. In the template, insert placeholders that will be replaced dynamically later. In our example, we query a music database to compile a list of albums for an artist:
```sql
SELECT
Artist.name as artist,
Album.Title as album
FROM
Artist,
Album
WHERE
Artist.name like :artist_name AND
Album.ArtistId=Artist.ArtistId
```
Note that the name of the artist is not set yet; it is represented by the placeholder `:artist_name`.
%% Cell type:code id: tags:
``` python
query = """
SELECT
Artist.name as artist,
Album.Title as album
FROM
Artist,
Album
WHERE
Artist.name like :artist_name AND
Album.ArtistId = Artist.ArtistId
"""
```
%% Cell type:markdown id: tags:
## Executing the query
We pass our DB connection and query template to pandas' `pd.read_sql_query` function to receive a dataframe containing our information.
Note that we pass the artist name as a parameter object instead of substituting it into the query string ourselves. This ensures a safe conversion from Python data types to SQL types. Also remember to *never* insert user input directly into your query strings: doing so is unsafe (it opens the door to SQL injection) and is considered very bad practice.
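A minimal, self-contained sketch of this step is shown here. To keep it runnable without the `data/chinook.sqlite` file, it builds a tiny in-memory database with the same two tables; the rows, the names `demo_connection` and `demo_query`, and the search pattern `%Band%` are made up for illustration. The dictionary passed as `params` fills the `:artist_name` placeholder.

```python
import sqlite3

import pandas as pd

# In-memory stand-in for the demo database (hypothetical rows, not real chinook data)
demo_connection = sqlite3.connect(":memory:")
demo_connection.executescript("""
CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Album (AlbumId INTEGER PRIMARY KEY, Title TEXT, ArtistId INTEGER);
INSERT INTO Artist VALUES (1, 'Example Band'), (2, 'Other Artist');
INSERT INTO Album VALUES (1, 'First Record', 1), (2, 'Second Record', 1), (3, 'Solo Work', 2);
""")

# Same query template as above, with the :artist_name placeholder
demo_query = """
SELECT
    Artist.Name AS artist,
    Album.Title AS album
FROM
    Artist,
    Album
WHERE
    Artist.Name LIKE :artist_name AND
    Album.ArtistId = Artist.ArtistId
"""

# The value is passed separately via params, never pasted into the SQL string
albums = pd.read_sql_query(
    demo_query,
    demo_connection,
    params={"artist_name": "%Band%"},
)
print(albums)
```

Because the value travels separately from the SQL text, quotes or other special characters in an artist name are handled by the database driver and cannot change the structure of the query.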