Commit 76fe33fd authored by Claas Faber

modified demo notebooks

parent 6b180174
+253 −34

File changed.

Preview size limit exceeded, changes collapsed.

+44 −18
%% Cell type:code id: tags:

``` python
print('hello world')
```

%% Output

    hello world

%% Cell type:markdown id: tags:

# Example Notebook
This is an example with:
- python code
- markdown
- cool stuff
- also [links](https://www.google.com)

Execute shell commands:

%% Cell type:code id: tags:

``` python
!ls -la
```

%% Output

    total 128
    drwxr-xr-x  13 cfaber  staff    442 Apr  6 11:23 .
    drwxr-xr-x   7 cfaber  staff    238 Apr  6 10:20 ..
    drwxr-xr-x  15 cfaber  staff    510 Apr  6 10:37 .git
    drwxr-xr-x   3 cfaber  staff    102 Apr  6 11:21 .ipynb_checkpoints
    -rw-r--r--   1 cfaber  staff    907 Apr  6 11:23 Untitled.ipynb
    -rw-r--r--   1 cfaber  staff  13935 Apr  6 10:20 boknis.ipynb
    -rw-r--r--   1 cfaber  staff  12369 Apr  6 10:20 boknis_start.ipynb
    drwxr-xr-x   4 cfaber  staff    136 Apr  6 10:20 data
    -rw-r--r--   1 cfaber  staff    227 Apr  6 10:20 dm-tools_py3.yml
    -rw-r--r--   1 cfaber  staff    940 Apr  6 10:20 jupyter-notebook-basics.ipynb
    -rw-r--r--   1 cfaber  staff   7725 Apr  6 10:20 jupyter-notebook-for-OPeNDAP-data-access.ipynb
    -rw-r--r--   1 cfaber  staff   5449 Apr  6 10:20 sql_db.ipynb
    -rw-r--r--   1 cfaber  staff     10 Apr  6 10:37 testfile
    total 8968
    drwxr-xr-x  23 cfaber  staff      782 Apr  7 12:53 .
    drwxr-xr-x   7 cfaber  staff      238 Apr  6 10:20 ..
    drwxr-xr-x  15 cfaber  staff      510 Apr  7 12:44 .git
    drwxr-xr-x   7 cfaber  staff      238 Apr  7 12:11 .ipynb_checkpoints
    -rw-r--r--@  1 cfaber  staff   560270 Apr  7 12:52 boknis.html
    -rw-r--r--   1 cfaber  staff   290268 Apr  7 12:44 boknis.ipynb
    -rw-r--r--@  1 cfaber  staff   399174 Apr  7 12:42 boknis.pdf
    drwxr-xr-x   5 cfaber  staff      170 Apr  7 12:52 boknis_files
    drwxr-xr-x   6 cfaber  staff      204 Apr  7 12:35 data
    -rw-r--r--   1 cfaber  staff      227 Apr  6 10:20 dm-tools_py3.yml
    -rw-r--r--@  1 cfaber  staff  1187446 Apr  7 12:52 jupyter-notebook-for-OPeNDAP-data-access.html
    -rw-r--r--   1 cfaber  staff   904979 Apr  6 11:58 jupyter-notebook-for-OPeNDAP-data-access.ipynb
    -rw-r--r--@  1 cfaber  staff   618594 Apr  7 12:45 jupyter-notebook-for-OPeNDAP-data-access.pdf
    drwxr-xr-x   5 cfaber  staff      170 Apr  7 12:52 jupyter-notebook-for-OPeNDAP-data-access_files
    -rw-r--r--@  1 cfaber  staff   264853 Apr  7 12:53 notebook_intro.html
    -rw-r--r--   1 cfaber  staff     4507 Apr  7 12:40 notebook_intro.ipynb
    -rw-r--r--@  1 cfaber  staff    18230 Apr  7 12:45 notebook_intro.pdf
    drwxr-xr-x   5 cfaber  staff      170 Apr  7 12:53 notebook_intro_files
    -rw-r--r--@  1 cfaber  staff   273349 Apr  7 12:51 sql_db.html
    -rw-r--r--   1 cfaber  staff    12209 Apr  7 12:50 sql_db.ipynb
    -rw-r--r--@  1 cfaber  staff    31421 Apr  7 12:50 sql_db.pdf
    drwxr-xr-x   5 cfaber  staff      170 Apr  7 12:51 sql_db_files
    -rw-r--r--   1 cfaber  staff       10 Apr  6 10:37 testfile

%% Cell type:markdown id: tags:

You can run any software that is available through the command line, e.g. git:
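
The `!` prefix only works inside IPython and Jupyter. As a minimal sketch of the plain-Python counterpart, the standard library's `subprocess.run` can start a command and capture its output. The command used here (the Python interpreter printing one line) is only an illustration and is not part of the original notebook.

```python
import subprocess
import sys

# Run a command and capture its output, roughly what `!command` does in a notebook.
# The command is illustrative: the current Python interpreter prints one line.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from a subprocess')"],
    capture_output=True,  # collect stdout and stderr instead of printing them
    text=True,            # decode the captured bytes to str
    check=True,           # raise CalledProcessError on a non-zero exit code
)
output = result.stdout.strip()
print(output)
```

Passing the command as a list avoids shell quoting issues; any other program on the `PATH`, such as `git`, could be started the same way.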

%% Cell type:code id: tags:

``` python
!git status
```

%% Output

    On branch master
    Your branch is ahead of 'origin/master' by 2 commits.
    Your branch is ahead of 'origin/master' by 9 commits.
      (use "git push" to publish your local commits)
    Changes not staged for commit:
      (use "git add/rm <file>..." to update what will be committed)
      (use "git add <file>..." to update what will be committed)
      (use "git checkout -- <file>..." to discard changes in working directory)
    
    	deleted:    test.txt
    	modified:   boknis.ipynb
    	modified:   jupyter-notebook-for-OPeNDAP-data-access.ipynb
    	modified:   notebook_intro.ipynb
    	modified:   sql_db.ipynb
    
    Untracked files:
      (use "git add <file>..." to include in what will be committed)
    
    	.ipynb_checkpoints/
    	Untitled.ipynb
    	boknis.html
    	boknis.pdf
    	boknis_files/
    	data/boknis.db
    	data/map_BoknisEck.jpg
    	jupyter-notebook-for-OPeNDAP-data-access.html
    	jupyter-notebook-for-OPeNDAP-data-access.pdf
    	jupyter-notebook-for-OPeNDAP-data-access_files/
    	notebook_intro.html
    	notebook_intro.pdf
    	notebook_intro_files/
    	sql_db.html
    	sql_db.pdf
    	sql_db_files/
    
    no changes added to commit (use "git add" and/or "git commit -a")

%% Cell type:markdown id: tags:

## Links
* A gallery of interesting Jupyter Notebooks : https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks
* Can't stop using it… or Python for Excel : https://www.xlwings.org/


**Remember**: you can save Notebooks as static documents, e.g. pdf or html!

%% Cell type:code id: tags:

``` python
```
+15 −3
%% Cell type:markdown id: tags:

## Getting data from a SQL Database
In this notebook, we will show an example of connecting to a SQL database.
For simplicity, we use a local [SQLite](https://www.sqlite.org/) DB for this demo, but connecting to PostgreSQL or MySQL is equally simple.

We are using the familiar [pandas](http://pandas.pydata.org/) to access the DB. Another user-friendly Python tool for talking to databases is [records](https://github.com/kennethreitz/records), which is built on top of the powerful but rather complex [SQLAlchemy](http://www.sqlalchemy.org/).

%% Cell type:code id: tags:

``` python
# import sqlite library
import sqlite3
import pandas as pd
```

%% Cell type:markdown id: tags:

## Connect to the DB
How a connection is established differs slightly between database systems. For SQLite, we need to provide a connection object.
For PostgreSQL and other server-based database systems, a connection string providing hostname, port, user, and password is needed most of the time.
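
As a rough sketch of the difference: SQLite needs only a file path (or `:memory:` for a throwaway database), while server-based systems are commonly addressed with a URL of the form `dialect://user:password@host:port/database`, the format SQLAlchemy uses. All values below (user, password, host, database name) are made-up placeholders, and no PostgreSQL connection is actually opened.

```python
import sqlite3

# SQLite: the "address" is just a path; ":memory:" creates a temporary in-memory DB.
sqlite_connection = sqlite3.connect(":memory:")
sqlite_version = sqlite_connection.execute("SELECT sqlite_version()").fetchone()[0]
sqlite_connection.close()

# Server-based DBs: assemble a connection URL from its parts.
# Every value here is a hypothetical placeholder, not a real account or server.
settings = {
    "user": "demo_user",
    "password": "demo_password",
    "host": "localhost",
    "port": 5432,  # default PostgreSQL port
    "database": "demo_db",
}
postgres_url = "postgresql://{user}:{password}@{host}:{port}/{database}".format(**settings)
print(postgres_url)
```

A URL like this could then be handed to `sqlalchemy.create_engine`, and the resulting engine passed to pandas as `con` in place of the SQLite connection object.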

%% Cell type:code id: tags:

``` python
# with sqlite, connecting to a DB is done by providing the path to the local DB file
connection = sqlite3.connect('data/chinook.sqlite')
```

%% Cell type:markdown id: tags:

## Define SQL query
You can either get a complete table from the database or provide a query to specify what data you are interested in. Most of the time, a query combining data from multiple tables is what is needed. The resulting SQL query can become complex quickly.
If you can, use a graphical query builder to make a query template. In the template, insert placeholders that will be replaced dynamically later. In our example, we query a music database to compile a list of albums for an artist:
```sql
    SELECT
        Artist.name as artist,
        Album.Title as album
    FROM
        Artist,
        Album
    WHERE
        Artist.name like :artist_name AND
        Album.ArtistId = Artist.ArtistId
```
Note that the name of the artist is not set yet; it is represented by the placeholder `:artist_name`, which is filled in when the query is executed.
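
To see how the `:artist_name` placeholder gets its value, here is a small self-contained sketch using only the standard `sqlite3` module. The in-memory tables and their rows are invented stand-ins for the demo database, and the query is written with an explicit `JOIN`, which gives the same result as the comma-separated `FROM` clause in the template.

```python
import sqlite3

# Build a tiny in-memory stand-in for the music database (rows are invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Album (AlbumId INTEGER PRIMARY KEY, Title TEXT, ArtistId INTEGER);
    INSERT INTO Artist VALUES (1, 'Artist A'), (2, 'Artist B');
    INSERT INTO Album VALUES
        (1, 'First Album', 1),
        (2, 'Second Album', 1),
        (3, 'Other Album', 2);
""")

# Same query idea as the template, with an explicit JOIN and a named placeholder.
query = """
    SELECT Artist.Name AS artist, Album.Title AS album
    FROM Artist
    JOIN Album ON Album.ArtistId = Artist.ArtistId
    WHERE Artist.Name LIKE :artist_name
    ORDER BY Album.AlbumId
"""

# The dictionary key must match the placeholder name without the leading colon.
rows = conn.execute(query, {"artist_name": "Artist A"}).fetchall()
print(rows)
conn.close()
```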

%% Cell type:code id: tags:

``` python
query = """
        SELECT
            Artist.name as artist,
            Album.Title as album
        FROM
            Artist,
            Album
        WHERE
            Artist.name like :artist_name AND
            Album.ArtistId = Artist.ArtistId
        """
```

%% Cell type:markdown id: tags:

## Executing the query
We pass our DB connection and query template to the pandas function `pd.read_sql_query` to receive a dataframe containing our information.
Note that we pass the artist name as a parameter object instead of inserting it into the query string directly. This ensures safe conversion from Python data types to SQL types. Also remember to *never* use user input directly in your queries: this is unsafe and considered very bad practice.

To find out more about the method, use
```python
?pd.read_sql_query
```
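
To illustrate why user input should not be pasted into the query text, the following sketch compares string formatting with parameter binding. It uses only the standard `sqlite3` module; the table, its rows, and the malicious input are invented for the demonstration.

```python
import sqlite3

# Tiny in-memory table with invented rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT);
    INSERT INTO Artist VALUES (1, 'Artist A'), (2, 'Artist B');
""")

# Input crafted to change the meaning of the query if it is pasted into the SQL text.
user_input = "nobody' OR '1'='1"

# UNSAFE: the input becomes part of the SQL statement itself,
# so the WHERE clause turns into: Name LIKE 'nobody' OR '1'='1'
unsafe_sql = f"SELECT Name FROM Artist WHERE Name LIKE '{user_input}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# SAFE: the input is bound as a value and is only ever compared as a string.
safe_sql = "SELECT Name FROM Artist WHERE Name LIKE :artist_name"
safe_rows = conn.execute(safe_sql, {"artist_name": user_input}).fetchall()

print(len(unsafe_rows), len(safe_rows))
conn.close()
```

The formatted query matches every row because the injected `OR '1'='1'` is always true, while the parameterized query matches nothing, since no artist has that literal name.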

%% Cell type:code id: tags:

``` python
artist = 'Chris Cornell'
pd.read_sql_query(con=connection, sql=query, params={'artist_name':artist})
```

%% Output

              artist     album
    0  Chris Cornell  Carry On

%% Cell type:code id: tags:

``` python
# We can also use SQL wildcards such as % in the parameter value:
artist = 'Chris%'
pd.read_sql_query(con=connection, sql=query, params={'artist_name':artist})
```

%% Output

                    artist                     album
    0        Chris Cornell                  Carry On
    1  Christopher O'Riley  SCRIABIN: Vers la flamme

%% Cell type:markdown id: tags:

Once the query is defined, it can be reused to process data dynamically, for example by looping over several artist name patterns.

%% Cell type:code id: tags:

``` python
from IPython.display import display
artists = ['Chris%', 'Stevie%', 'Ozzy%', 'David%']
for artist in artists:
    df = pd.read_sql_query(con=connection, sql=query, params={'artist_name':artist})
    display(df)
```

%% Cell type:code id: tags:
%% Output



``` python
```