Commit e2d93b6b authored by Claas Testuser

added Jupyter NBs

parent cbf6bc84
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Links\n",
"* A gallery of interesting Jupyter Notebooks : https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks \n",
"* Can't stop using it… or Python for Excel : https://www.xlwings.org/ \n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Install requirements\n",
"* anaconda - is installed\n",
"* git - is installed\n",
"\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Official website for OPeNDAP: http://www.opendap.org/ \n",
"Code site: https://github.com/opendap \n",
"Notebook adapted from Python-Intro course by Wili Rath"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Modules\n",
"\n",
"Get a couple of modules needed for the task at hand:\n",
"\n",
"- [netCDF4](http://unidata.github.io/netcdf4-python/) --- netCDF4 library\n",
"- [pyplot](http://matplotlib.org/api/pyplot_api.html) --- Standard plotting routines\n",
"- [basemap](http://matplotlib.org/basemap/) --- Map projections\n",
"- [cmocean](http://matplotlib.org/cmocean/) --- Very nice ocean colormaps\n",
"- [numpy](http://docs.scipy.org/doc/numpy/reference/) --- Numerical toolbox\n",
"\n",
"We provide aliases for the imported modules. In one case (`Basemap`), we only import one object."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import netCDF4 as nc\n",
"import matplotlib.pyplot as plt\n",
"from mpl_toolkits.basemap import Basemap\n",
"import cmocean as co \n",
"import numpy as np"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialization\n",
"\n",
"Initialize the notebook. We want plots to be displayed inline. Also set default figure size."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%matplotlib inline\n",
"plt.rcParams['figure.figsize'] = (15, 9)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Input Data\n",
"In this example, we are using a publically available dataset we get over the web. \n",
"\n",
"- We use the netCDF4 library to open the netCDF data set using [nc.Dataset()](http://unidata.github.io/netcdf4-python/#netCDF4.Dataset.__init__). Try\n",
"```python\n",
"nc.Dataset?\n",
"```\n",
"and in particular\n",
"```python\n",
"nc.Dataset.__init__?\n",
"```\n",
"to read the documentation shipped with the modules. You can also access data from local files or from other (private) repositories in a similar way.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# This is an example of sea surface temperature data for day 73 in 2017 retrieved from NASA's OPeNDAP enabled servers\n",
"url = 'https://oceandata.sci.gsfc.nasa.gov:443/opendap/MODIST/L3SMI/2017/073/T2017073.L3m_DAY_SST_sst_9km.nc'\n",
"data_set = nc.Dataset(url)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(data_set)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# get (pointers to) variables \n",
"lon = data_set.variables['lon']\n",
"lat = data_set.variables['lat']\n",
"sst = data_set.variables['sst']\n",
"\n",
"# understand the difference between lon and lon[:]\n",
"print(type(lon))\n",
"print(type(lon[:]))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# print the actual data \n",
"print(\"Meta-data:\\n\\n\", lon) # meta data \n",
"print(\"Actual data:\\n\\n\", lon[:]) # actual numeric data "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Shapes of the variables\n",
"\n",
"Inspect the shapes of `lon`, `lat`, `ssh`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(lon.shape, lat.shape, sst.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## A simple plot\n",
"\n",
"- Use `pyplot`'s `pcolormesh` to plot the 2-dimensional data. Print the documentation with:\n",
"```python\n",
"plt.pcolormesh?\n",
"```\n",
"\n",
"\n",
"- Try `plt.colorbar()` and `plt.title()`.\n",
"\n",
"- There are also `plt.[x,y]label`, `plt.axis`, and `plt.grid`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plt.pcolormesh(lon, lat, sst[:].squeeze(), cmap='inferno')\n",
"plt.colorbar()\n",
"plt.title('MODIS SST')\n",
"plt.axis('tight')\n",
"plt.xlabel('longitude')\n",
"plt.xlabel('latitude')\n",
"plt.grid()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## A nicer projection\n",
"\n",
"- Use [Basemap](http://matplotlib.org/basemap/api/basemap_api.html#mpl_toolkits.basemap.Basemap) to create a Mercator projection covering the globe between `70S` and `70N`.\n",
"\n",
"- With `m = Basemap(...)` you can later use `m.pcolormesh(...)` to plot the data on the map projection.\n",
"\n",
"- We need 2-dimensional representations of the coordinates:\n",
"```python\n",
"mlon, mlat = np.meshgrid(lon, lat)\n",
"```\n",
"\n",
"- To add coast lines, filled land masses, a grid, etc., you can play with the following:\n",
"```python\n",
"m.drawcoastlines()\n",
"m.fillcontinents()\n",
"m.drawparallels(np.arange(-70,90,20),labels=[1,0,0,0]);\n",
"m.drawmeridians(np.arange(0,420,60),labels=[0,0,0,1]);\n",
"m.colorbar()\n",
"plt.title(\"AVISO ssh (m), %04d\" % (year))\n",
"```\n",
"- Use the [cmocean](http://matplotlib.org/cmocean/) color maps:\n",
"```python\n",
"cmap=co.cm.<mapname goes here>\n",
"```\n",
"\n",
"- Save a file using\n",
"```python\n",
"plt.savefig(plot_file_name)\n",
"```\n",
"where `plot_file_name` should be a string containing the year we picked above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# set up projection: \n",
"m = Basemap(projection='merc',\n",
" llcrnrlat=-70.0, llcrnrlon=-180.0,\n",
" urcrnrlat=+70.0, urcrnrlon=+180.0)\n",
"\n",
"# 2d coordinates\n",
"mlon, mlat = np.meshgrid(lon, lat)\n",
"\n",
"# 2d plot in the projection: \n",
"m.pcolormesh(mlon, mlat, sst[:].squeeze(),\n",
" rasterized=True,\n",
" cmap=co.cm.balance,\n",
" latlon=True)\n",
"\n",
"# grid etc. (see above)\n",
"m.drawcoastlines()\n",
"m.fillcontinents()\n",
"m.drawcountries()\n",
"m.drawparallels(np.arange(-70,90,20),labels=[1,0,0,0]);\n",
"m.drawmeridians(np.arange(0,420,60),labels=[0,0,0,1]);\n",
"m.colorbar()\n",
"\n",
"# add formatted title\n",
"title = 'MODIS SST [°C]'\n",
"plt.gca().set_title(title)\n",
"\n",
"# save file\n",
"plot_file_name = 'modis_sst.pdf'\n",
"plt.savefig(plot_file_name)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Getting data from a SQL Database\n",
"In this notebook, we will show an example of connecting to a SQL database. \n",
"For simplictiy, we use a local [SQLite](https://www.sqlite.org/) DB for this demo, but connecting to PostgreSQL or MySQL is equally simple. \n",
"\n",
"We are using the familiar [pandas](http://pandas.pydata.org/) to access the DB. Another user friendly Python tool for talking to databases is [records](https://github.com/kennethreitz/records) which is built on top of the powerfull but rather complex [SQLAlchemy](http://www.sqlalchemy.org/). \n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# import sqlite library\n",
"import sqlite3\n",
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Connect to the DB\n",
"How to establish a connection to the database will be slightly different for each DB System. For SQLite, we need to provide a connection object. \n",
"For PostgresQL and other DB Systems, a connection string providing hostname, port, user and password is needed most of the time. \n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# with sqlite, connecting to a DB is done by providing the path to the local DB file\n",
"connection = sqlite3.connect('data/chinook.sqlite')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define SQL query\n",
"You can either get a complete table from the database or provide a query to specify what data you are interested in. Most of the time, a query combining data from multiple tables is what is needed. The resulting SQL query can become complex quickly. \n",
"If you can, use a graphical query builder to make a query template. In the template, insert placeholders that will replaced later dynamically. In our example, we query a music database to compile a list of albums for an artist:\n",
"```sql\n",
" SELECT \n",
" Artist.name as artist, \n",
" Album.Title as album \n",
" FROM \n",
" Artist, \n",
" Album\n",
" WHERE\n",
" Artist.name like :artist_name AND \n",
" Album.ArtistId = Artist.ArtistId\n",
"```\n",
"Note that the name of the artist is not set yet but is instead given by the placeholder `:artist_name`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"query = \"\"\"\n",
" SELECT \n",
" Artist.name as artist, \n",
" Album.Title as album \n",
" FROM \n",
" Artist, \n",
" Album\n",
" WHERE\n",
" Artist.name like :artist_name AND \n",
" Album.ArtistId = Artist.ArtistId\n",
" \"\"\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Executing the query\n",
"We pass our db connection and query template to pandas `pd.read_sql_query` method to receive a dataframe containing our information. \n",
"Note that we pass our artist name inside as a parameter object instead of replacing it in the query string directly. This is done to ensure safe conversion from python datatypes to SQL types. Also remember to *never* use user input directly in your queries, this is unsafe and considered very bad practise.\n",
"\n",
"To find out more about the method, use \n",
"```python\n",
"?pd.read_sql_query\n",
"```\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"artist = 'Chris Cornell'\n",
"pd.read_sql_query(con=connection, sql=query, params={'artist_name':artist})"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# We can also use SQL placeholders in our query:\n",
"artist = 'Chris%'\n",
"pd.read_sql_query(con=connection, sql=query, params={'artist_name':artist})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once the query is defined, it can be used to process data dynamically"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import display\n",
"artists = ['Chris%', 'Stevie%', 'Ozzy%', 'David%']\n",
"for artist in artists:\n",
" df = pd.read_sql_query(con=connection, sql=query, params={'artist_name':artist})\n",
" display(df)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}