{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to data frames\n", "\n", "Start by loading the usual array and plotting libraries." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "# Make plots look a little fancier\n", "plt.style.use('fivethirtyeight')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Pandas](https://pandas.pydata.org) is a Python package that\n", "implements data frames, and functions that operate on data\n", "frames." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data frames and series\n", "\n", "We start by loading data from a Comma Separated Values file (CSV\n", "file). If you are running on your laptop, you should download\n", "the [gender_stats.csv]({{ site.baseurl }}/data/gender_stats.csv)\n", "file to the same directory as this notebook." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Load the data file\n", "gender_data = pd.read_csv('gender_stats.csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is our usual assignment statement. The left-hand side (LHS) is `gender_data`, the variable name. The right-hand side (RHS) is an expression that returns a value.\n", "\n", "What type of value does it return?" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pandas.core.frame.DataFrame" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(gender_data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pandas integrates with the Notebook, so if you display a data\n", "frame in the notebook, you get a nicely formatted table of rows and\n", "columns."
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " country fert_rate gdp health_exp_per_cap \\\n", "0 Afghanistan 4.954500 1.996102e+10 161.138034 \n", "1 Albania 1.769250 1.232759e+10 574.202694 \n", "2 Algeria 2.866000 1.907346e+11 870.766508 \n", "3 American Samoa NaN 6.405000e+08 NaN \n", "4 Andorra NaN 3.197538e+09 4421.224933 \n", "5 Angola 6.123000 1.119365e+11 254.747970 \n", "6 Antigua and Barbuda 2.082000 1.298213e+09 1152.493656 \n", "7 Arab World 3.397587 2.709059e+12 761.401727 \n", "8 Argentina 2.328000 5.509810e+11 1148.256142 \n", "9 Armenia 1.545500 1.088536e+10 348.663884 \n", "10 Aruba 1.663250 NaN NaN \n", "11 Australia 1.861500 1.422994e+12 4256.058988 \n", "12 Austria 1.455000 4.074943e+11 4930.298893 \n", "13 Azerbaijan 1.980000 6.200300e+10 956.709718 \n", "14 Bahamas, The 1.877250 8.688000e+09 1727.128385 \n", "15 Bahrain 2.065250 3.200401e+10 2030.158316 \n", "16 Bangladesh 2.193250 1.745451e+11 85.968844 \n", "17 Barbados 1.792250 4.413080e+09 1062.840088 \n", "18 Belarus 1.677000 6.478294e+10 986.236757 \n", "19 Belgium 1.755000 4.942218e+11 4297.838005 \n", "20 Belize 2.594750 1.680325e+09 471.967465 \n", "21 Benin 4.806750 8.778151e+09 83.726190 \n", "22 Bermuda 1.617500 5.555624e+09 NaN \n", "23 Bhutan 2.061250 1.975145e+09 277.526670 \n", "24 Bolivia 2.995250 3.150932e+10 381.007594 \n", "25 Bosnia and Herzegovina 1.267000 1.732333e+10 941.504655 \n", "26 Botswana 2.845000 1.511339e+10 880.909202 \n", "27 Brazil 1.795250 2.198766e+12 1303.199104 \n", "28 British Virgin Islands NaN NaN NaN \n", "29 Brunei Darussalam 1.884000 1.571922e+10 1795.924160 \n", ".. ... ... ... ... 
\n", "233 Syrian Arab Republic 2.967750 NaN 269.945739 \n", "234 Tajikistan 3.495750 8.036228e+09 169.745970 \n", "235 Tanzania 5.181250 4.493554e+10 131.704162 \n", "236 Thailand 1.516750 4.061369e+11 581.927487 \n", "237 Timor-Leste 5.797750 1.361430e+09 98.577296 \n", "238 Togo 4.620000 4.183610e+09 71.263825 \n", "239 Tonga 3.745750 4.391789e+08 250.962504 \n", "240 Trinidad and Tobago 1.782750 2.457095e+10 1778.148073 \n", "241 Tunisia 2.140000 4.482437e+10 782.950522 \n", "242 Turkey 2.078000 8.951756e+11 997.374772 \n", "243 Turkmenistan 2.313750 3.797310e+10 288.572644 \n", "244 Turks and Caicos Islands NaN NaN NaN \n", "245 Tuvalu NaN 3.646999e+07 563.500592 \n", "246 Uganda 5.822500 2.594146e+10 132.892684 \n", "247 Ukraine 1.510250 1.353793e+11 628.579254 \n", "248 United Arab Emirates 1.793000 3.750271e+11 2202.407569 \n", "249 United Kingdom 1.842500 2.768864e+12 3357.983675 \n", "250 United States 1.860875 1.736912e+13 9060.068657 \n", "251 Upper middle income 1.795244 2.097441e+13 870.897512 \n", "252 Uruguay 2.027000 5.434513e+10 1721.507752 \n", "253 Uzbekistan 2.372750 6.134065e+10 334.476754 \n", "254 Vanuatu 3.364750 7.828760e+08 125.568712 \n", "255 Venezuela, RB 2.378250 3.761463e+11 896.815314 \n", "256 Vietnam 1.959500 1.818207e+11 368.374550 \n", "257 Virgin Islands (U.S.) 1.760000 3.812000e+09 NaN \n", "258 West Bank and Gaza 4.208000 1.250822e+10 NaN \n", "259 World 2.464282 7.613006e+13 1223.941243 \n", "260 Yemen, Rep. 
4.225750 3.681934e+10 207.949700 \n", "261 Zambia 5.394250 2.428099e+10 185.556359 \n", "262 Zimbabwe 3.943000 1.549551e+10 115.519881 \n", "\n", " health_exp_pub prim_ed_girls mat_mort_ratio population \n", "0 2.834598 40.109708 444.00 3.271584e+07 \n", "1 2.836021 47.201082 29.25 2.888280e+06 \n", "2 4.984252 47.675617 142.50 3.909906e+07 \n", "3 NaN NaN NaN 5.542200e+04 \n", "4 7.260281 47.123345 NaN 7.954740e+04 \n", "5 2.447546 NaN 501.25 2.693754e+07 \n", "6 3.676514 48.291463 NaN 9.887240e+04 \n", "7 2.873840 47.119776 161.00 3.899620e+08 \n", "8 2.782216 48.915810 53.75 4.297667e+07 \n", "9 1.916016 46.782180 27.25 2.904683e+06 \n", "10 NaN 48.721939 NaN 1.037444e+05 \n", "11 6.292381 48.576707 6.00 2.344456e+07 \n", "12 8.504276 48.556078 4.00 8.566294e+06 \n", "13 1.197249 46.157363 25.25 9.531856e+06 \n", "14 3.308626 NaN 81.50 3.819036e+05 \n", "15 2.976386 49.116838 15.25 1.349810e+06 \n", "16 0.860447 50.460564 194.75 1.593712e+08 \n", "17 4.828680 48.878181 28.00 2.833384e+05 \n", "18 3.876601 48.685741 4.00 9.480348e+06 \n", "19 8.221003 48.864675 7.00 1.122850e+07 \n", "20 3.744844 48.317238 29.25 3.517636e+05 \n", "21 2.206916 47.211127 417.50 1.029371e+07 \n", "22 NaN 48.423588 NaN 6.510080e+04 \n", "23 2.706908 49.572296 161.75 7.759054e+05 \n", "24 4.192031 48.464175 218.25 1.056280e+07 \n", "25 6.841021 48.634905 11.75 3.574396e+06 \n", "26 3.552071 48.844009 138.75 2.169170e+06 \n", "27 3.773473 47.784577 49.50 2.041595e+08 \n", "28 NaN 47.581520 NaN 2.958540e+04 \n", "29 2.335194 48.523699 23.75 4.115812e+05 \n", ".. ... ... ... ... 
\n", "233 1.507166 48.047394 62.00 1.931967e+07 \n", "234 1.976367 48.260680 33.25 8.363844e+06 \n", "235 2.648609 50.666580 429.50 5.228132e+07 \n", "236 3.183842 48.213034 21.00 6.838499e+07 \n", "237 1.140440 48.337367 240.25 1.212718e+06 \n", "238 2.037809 48.270471 380.75 7.230904e+06 \n", "239 3.987285 47.697931 129.25 1.059094e+05 \n", "240 3.071370 NaN 63.25 1.353877e+06 \n", "241 4.118771 48.142132 63.25 1.114441e+07 \n", "242 4.189521 48.789477 17.50 7.703435e+07 \n", "243 1.349303 48.906879 43.50 5.465637e+06 \n", "244 NaN 48.846884 NaN 3.370340e+04 \n", "245 15.506929 47.472414 NaN 1.091000e+04 \n", "246 2.014349 50.099485 366.50 3.886534e+07 \n", "247 3.960185 48.984198 24.25 4.530270e+07 \n", "248 2.581168 48.789260 6.00 9.080299e+06 \n", "249 7.720655 48.791809 9.25 6.464156e+07 \n", "250 8.121961 48.758830 14.00 3.185582e+08 \n", "251 3.358153 47.112001 43.25 2.540966e+09 \n", "252 6.044403 48.295555 15.50 3.419977e+06 \n", "253 3.118842 48.387434 37.00 3.078450e+07 \n", "254 3.689874 47.301617 82.50 2.588964e+05 \n", "255 1.587088 48.400934 97.00 3.073452e+07 \n", "256 3.779501 48.021053 54.75 9.074240e+07 \n", "257 NaN NaN NaN 1.041414e+05 \n", "258 NaN 48.828520 47.50 4.296960e+06 \n", "259 5.947058 48.076575 223.75 7.269321e+09 \n", "260 1.417836 44.470076 399.75 2.624661e+07 \n", "261 2.687290 49.934484 233.75 1.563322e+07 \n", "262 2.695188 49.529875 398.00 1.542096e+07 \n", "\n", "[263 rows x 8 columns]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gender_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The data frame has rows and columns. Like other Python objects, it has *attributes*. These are pieces of data associated with the data frame. You have already seen *methods*, which are functions associated with the data frame. 
You can access attributes in the same way as you access methods, by typing the variable name, followed by a dot `.`, followed by the attribute name.\n", "\n", "For example, one attribute of the data frame is its `shape`:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(263, 8)" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gender_data.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another attribute is `columns`. It holds the column names. For\n", "example, here is a good way of quickly seeing the column names\n", "for a data frame:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['country', 'fert_rate', 'gdp', 'health_exp_per_cap', 'health_exp_pub',\n", " 'prim_ed_girls', 'mat_mort_ratio', 'population'],\n", " dtype='object')" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gender_data.columns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You need more information about what these column names refer to. Here are the longer descriptions from the original data source (link above):\n", "\n", "* `fert_rate`: Fertility rate, total (births per woman).\n", "* `gdp`: GDP (current US\\$).\n", "* `health_exp_per_cap`: Health expenditure per capita, PPP (constant 2011 international \\$).\n", "* `health_exp_pub`: Health expenditure, public (% of GDP).\n", "* `prim_ed_girls`: Primary education, pupils (% female).\n", "* `mat_mort_ratio`: Maternal mortality ratio (modeled estimate, per 100,000 live births).\n", "* `population`: Population, total.\n", "\n", "You have just seen array slicing (in [Selecting with\n", "arrays](../03/array_indexing)). Remember that array slicing\n", "uses square brackets. Data frames also allow slicing.
For\n", "example, we often want to get all the data for a single column\n", "of the data frame. To do this, we use the same square bracket\n", "notation as we use for array slicing, with the name of the\n", "column inside the square brackets." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "gdp = gender_data['gdp']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What `type` of thing is this column of data?" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pandas.core.series.Series" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(gdp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here are the values for `gdp`. You will notice that these are\n", "the same values you saw in the \"gdp\" column when you displayed\n", "the whole data frame." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 1.996102e+10\n", "1 1.232759e+10\n", "2 1.907346e+11\n", "3 6.405000e+08\n", "4 3.197538e+09\n", "5 1.119365e+11\n", "6 1.298213e+09\n", "7 2.709059e+12\n", "8 5.509810e+11\n", "9 1.088536e+10\n", "10 NaN\n", "11 1.422994e+12\n", "12 4.074943e+11\n", "13 6.200300e+10\n", "14 8.688000e+09\n", "15 3.200401e+10\n", "16 1.745451e+11\n", "17 4.413080e+09\n", "18 6.478294e+10\n", "19 4.942218e+11\n", "20 1.680325e+09\n", "21 8.778151e+09\n", "22 5.555624e+09\n", "23 1.975145e+09\n", "24 3.150932e+10\n", "25 1.732333e+10\n", "26 1.511339e+10\n", "27 2.198766e+12\n", "28 NaN\n", "29 1.571922e+10\n", " ... 
\n", "233 NaN\n", "234 8.036228e+09\n", "235 4.493554e+10\n", "236 4.061369e+11\n", "237 1.361430e+09\n", "238 4.183610e+09\n", "239 4.391789e+08\n", "240 2.457095e+10\n", "241 4.482437e+10\n", "242 8.951756e+11\n", "243 3.797310e+10\n", "244 NaN\n", "245 3.646999e+07\n", "246 2.594146e+10\n", "247 1.353793e+11\n", "248 3.750271e+11\n", "249 2.768864e+12\n", "250 1.736912e+13\n", "251 2.097441e+13\n", "252 5.434513e+10\n", "253 6.134065e+10\n", "254 7.828760e+08\n", "255 3.761463e+11\n", "256 1.818207e+11\n", "257 3.812000e+09\n", "258 1.250822e+10\n", "259 7.613006e+13\n", "260 3.681934e+10\n", "261 2.428099e+10\n", "262 1.549551e+10\n", "Name: gdp, Length: 263, dtype: float64" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gdp" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What are these values like `6.405000e+08`? These are numbers\n", "in [scientific\n", "notation](https://en.wikipedia.org/wiki/Scientific_notation).\n", "Scientific notation is a compact way of writing very large or\n", "very small numbers. The value after `e` above is the\n", "*exponent*, in this case `08`. The number above means\n", "$6.405 \\times 10^8$. For example, here is $2 \\times 10^7$:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "data": { "text/plain": [ "20000000.0" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "2e7" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Missing values and `NaN`\n", "\n", "Looking at the values of `gdp` (and therefore, the values of the\n", "`gdp` column of `gender_data`), we see that some of the values\n", "are `NaN`, which means Not a Number. Pandas uses this marker to\n", "indicate values that are not available, or *missing data*.\n", "\n", "Numpy does not like to calculate with `NaN` values. Here is Numpy trying to calculate the median of the `gdp` values."
] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/mb312/.virtualenvs/dsfe/lib/python3.7/site-packages/numpy/lib/function_base.py:3400: RuntimeWarning: Invalid value encountered in median\n", " r = func(a, **kwargs)\n" ] }, { "data": { "text/plain": [ "nan" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.median(gdp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice the warning about an invalid value.\n", "\n", "Numpy recognizes that one or more values are `NaN` and refuses to guess what to do when calculating the median, so it returns `nan` as the result." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You saw from the shape above that `gender_data` has 263 rows. We can use the general Python `len` function to see how many elements there are in `gdp`." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "263" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(gdp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As expected, it has the same number of elements as there are rows in `gender_data`.\n", "\n", "The `count` method of the series gives the number of values that are not missing, that is, not `NaN`." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "data": { "text/plain": [ "246" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gdp.count()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plotting with methods" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `gdp` variable is a sequence of values, so we can plot a histogram of these values, as we have done for arrays."
] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/mb312/.virtualenvs/dsfe/lib/python3.7/site-packages/numpy/lib/histograms.py:824: RuntimeWarning: invalid value encountered in greater_equal\n", " keep = (tmp_a >= first_edge)\n", "/Users/mb312/.virtualenvs/dsfe/lib/python3.7/site-packages/numpy/lib/histograms.py:825: RuntimeWarning: invalid value encountered in less_equal\n", " keep &= (tmp_a <= last_edge)\n" ] }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAaAAAAECCAYAAAC44gO8AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAD09JREFUeJzt3X2MZXV9x/H3R9anRV0wtpt1lxRiJrbUVKRAqVJDS1WwRmhSCaRVQkm0CRrtQxr0H20aU5sY25pWowVkjYpFlLox1EqoqW5SkIrIo/ZuBWTXldUqo5ZGu/bbP+bs5oq7c+f5O3d4v5LJnPO759zzORt2P3PO/XEmVYUkSWvtCd0BJEmPTxaQJKmFBSRJamEBSZJaWECSpBabOg46Ozvr1DtJ2uC2bNmS+V73CkiS1MICkiS1mOoCGo1G3REWzcyrb9rygpnXipnXl6kuIEnS9LKAJEktLCBJUgsLSJLUwgKSJLWwgCRJLSwgSVKLlkfxrJTTd2+G3fu6Y/DIpdu7I0jS1PEKSJLUwgKSJLWwgCRJLSwgSVILC0iS1MICkiS1sIAkSS0sIElSCwtIktTCApIktbCAJEktLCBJUgsLSJLUwgKSJLWwgCRJLSwgSVILC0iS1MICkiS1sIAkSS0mFlCSE5J8Nsm9Se5J8sZh/JlJbkoyGr4fP4wnybuT7ElyZ5JTV/skJEnTZyFXQAeBP66qk4EzgcuTnAxcAdxcVTPAzcM6wHnAzPD1WuC9K55akjT1JhZQVe2vqtuH5e8D9wHbgfOBncNmO4ELhuXzgQ/WnFuA45JsW/HkkqSptqjPgJKcCLwAuBXYWlX7h5e+CWwdlrcDD43ttncYkyTpsE0L3TDJ04CPA2+qqu8lOfxaVVWSWkqA0Wi0lN0Gm5ex78pZ7Dks75x7TFvmacsLZl4rZl5dMzMzC952QQWU5InMlc+Hq+oTw/DDSbZV1f7hFtuBYXwfcMLY7juGsWWH/Sm7j/q2a2ox5zAajZZ3zg2mLfO05QUzrxUzry8LmQUX4Crgvqp619hLu4BLhuVLgE+Ojb9mmA13JjA7dqtOkiRgYVdALwJeDdyV5I5h7C3AO4DrklwGPAhcOLx2I/ByYA/wKHDpiiaWJG0IEwuoqnYDOcrL5xxh+wIuX2YuSdIG55MQJEktLCBJUgsLSJLUwgKSJLWwgCRJLSwgSVILC0iS1MICkiS1sIAkSS0sIElSCwtIktTCApIktbCAJEktLCBJUgsLSJLUwgKSJLWwgCRJLSwgSVILC0iS1MICkiS1sIAkSS0sIElSCwtIktTCApIktbCAJEktLCBJUgsLSJLUwgKSJLWwgCRJLSwgSVILC0iS1MICkiS1sIAkSS0sIElSCwtIktTCApIktbCAJEktJhZQkquTHEhy99jY25LsS3LH8PXysdfenGRPkq8medlqBZck
TbeFXAFdA5x7hPG/qqpThq8bAZKcDFwE/OKwz3uSHLNSYSVJG8fEAqqqzwHfWeD7nQ98tKp+WFX3A3uAM5aRT5K0QS3nM6DXJ7lzuEV3/DC2HXhobJu9w5gkST8hVTV5o+RE4FNV9bxhfSvwbaCAPwe2VdXvJ/lb4Jaq+tCw3VXAP1XV9ePvNzs7e/igo9FoyeFP3715yfuupNvOerQ7giStCzMzM4eXt2zZkvm23bSUA1TVw4eWk/w98KlhdR9wwtimO4axoxoPu2i7533rNbOYcxiNRss75wbTlnna8oKZ14qZ15cl3YJLsm1s9beBQzPkdgEXJXlykpOAGeALy4soSdqIJl4BJbkWOBt4VpK9wFuBs5OcwtwtuAeA1wFU1T1JrgPuBQ4Cl1fVj1cnuiRpmk0soKq6+AjDV82z/duBty8nlCRp4/NJCJKkFhaQJKmFBSRJamEBSZJaWECSpBYWkCSphQUkSWphAUmSWlhAkqQWFpAkqYUFJElqYQFJklpYQJKkFhaQJKmFBSRJamEBSZJaWECSpBYWkCSphQUkSWphAUmSWlhAkqQWFpAkqYUFJElqYQFJklpYQJKkFhaQJKmFBSRJamEBSZJaWECSpBYWkCSphQUkSWphAUmSWlhAkqQWFpAkqYUFJElqYQFJklpYQJKkFhMLKMnVSQ4kuXts7JlJbkoyGr4fP4wnybuT7ElyZ5JTVzO8JGl6LeQK6Brg3MeMXQHcXFUzwM3DOsB5wMzw9VrgvSsTU5K00UwsoKr6HPCdxwyfD+wclncCF4yNf7Dm3AIcl2TbSoWVJG0cS/0MaGtV7R+WvwlsHZa3Aw+Nbbd3GJMk6SdsWu4bVFUlqaXuPxqNlnH0zcvYd+Us9hyWd849pi3ztOUFM68VM6+umZmZBW+71AJ6OMm2qto/3GI7MIzvA04Y227HMHZUiwn7U3bP+9ZrZjHnMBqNlnfODaYt87TlBTOvFTOvL0u9BbcLuGRYvgT45Nj4a4bZcGcCs2O36iRJOmziFVCSa4GzgWcl2Qu8FXgHcF2Sy4AHgQuHzW8EXg7sAR4FLl2FzJKkDWBiAVXVxUd56ZwjbFvA5csNJUna+HwSgiSphQUkSWphAUmSWlhAkqQWFpAkqYUFJElqYQFJklpYQJKkFhaQJKmFBSRJamEBSZJaWECSpBYWkCSphQUkSWphAUmSWlhAkqQWFpAkqYUFJElqYQFJklpYQJKkFhaQJKmFBSRJamEBSZJaWECSpBYWkCSphQUkSWphAUmSWlhAkqQWFpAkqYUFJElqYQFJklpYQJKkFhaQJKmFBSRJamEBSZJaWECSpBYWkCSpxabl7JzkAeD7wI+Bg1V1WpJnAv8AnAg8AFxYVd9dXkxJ0kazEldAv15Vp1TVacP6FcDNVTUD3DysS5L0E1bjFtz5wM5heSdwwSocQ5I05VJVS985uR/4LlDA+6rq/UkeqarjhtcDfPfQ+iGzs7OHDzoajZZ8/NN3b17yvivptrMe7Y4gSevCzMzM4eUtW7Zkvm2X9RkQcFZV7Uvys8BNSb4y/mJVVZJ5G2487KLt3rf0fVfQYs5hNBot75wbTFvmacsLZl4rZl5flnULrqr2Dd8PADcAZwAPJ9kGMHw/sNyQkqSNZ8kFlOTYJE8/tAy8FLgb2AVcMmx2CfDJ5YaUJG08y7kFtxW4Ye5jHjYBH6mqTye5DbguyWXAg8CFy48pSdpollxAVfU14PlHGP8v4JzlhJIkbXw+CUGS1MICkiS1sIAkSS0sIElSCwtIktTCApIktbCAJEktLCBJUgsLSJLUwgKSJLWwgCRJLSwgSVILC0iS1MICkiS1sIAkSS0sIElSCwtIktTCApIktbCAJEktLCBJUgsLSJLUwgKSJLWwgCRJLSwgSVILC0iS1MICkiS1sIAkSS0sIElSCwtIktTCApIktbCAJEktLCBJUgsLSJLUwgKSJLXY1B1AK+e4D+xbpXfeDLsX/t6PXLp9
lXJI2ki8ApIktfAKaAUs7spjcVcTkrRRrVoBJTkX+BvgGODKqnrHah1L68vq3QpcqLmS91agtL6tSgElOQb4O+AlwF7gtiS7qure1TiepIVZ2R8Oln417w8HAkhVrfybJr8KvK2qXjasvxmgqv4CYHZ2duUPKklaV7Zs2ZL5Xl+tSQjbgYfG1vcOY5IkAc6CkyQ1Wa1JCPuAE8bWdwxjwOTLMknSxrdaV0C3ATNJTkryJOAiYNcqHUuSNIVWpYCq6iDweuCfgfuA66rqnpV6/yTnJvlqkj1Jrlip911NSa5OciDJ3d1ZFiLJCUk+m+TeJPckeWN3pkmSPCXJF5J8ecj8Z92ZFirJMUm+lORT3VkWIskDSe5KckeSf+/OsxBJjktyfZKvJLlvmCy1biV57vDne+jre0ne1J1rPkn+cPi7d3eSa5M8Zd7tV2MW3Goapnj/B2NTvIGL1/sU7yQvBn4AfLCqntedZ5Ik24BtVXV7kqcDXwQuWM9/zkkCHFtVP0jyRGA38MaquqU52kRJ/gg4DXhGVb2iO88kSR4ATquqb3dnWagkO4HPV9WVw52ZzVX1SHeuhRj+3dsH/EpVPdid50iSbGfu79zJVfU/Sa4Dbqyqa462zzROQjgD2FNVX6uqHwEfBc5vzjRRVX0O+E53joWqqv1Vdfuw/H3mrmTX9UzGmvODYfWJw9e6/wkryQ7gt4Aru7NsVEm2AC8GrgKoqh9NS/kMzgH+c72Wz5hNwFOTbAI2A9+Yb+NpLCCneK+xJCcCLwBu7U0y2XAr6w7gAHBTVa37zMBfA38K/F93kEUo4DNJvpjktd1hFuAk4FvAB4ZbnVcmObY71CJcBFzbHWI+VbUPeCfwdWA/MFtVn5lvn2ksIK2hJE8DPg68qaq+151nkqr6cVWdwtzMyzOSrOvbnUleARyoqi92Z1mks6rqVOA84PLhFvN6tgk4FXhvVb0A+G9gWj4/fhLwSuBj3Vnmk+R45u5GnQQ8Gzg2ye/Nt880FtC8U7y1cobPUT4OfLiqPtGdZzGG2yufBc7tzjLBi4BXDp+pfBT4jSQf6o002fDTLlV1ALiBuVvj69leYO/YFfH1zBXSNDgPuL2qHu4OMsFvAvdX1beq6n+BTwAvnG+HaSwgp3ivgeED/auA+6rqXd15FiLJzyQ5blh+KnMTVb7Sm2p+VfXmqtpRVScy99/yv1TVvD81dkty7DAxheE21kuBdT27s6q+CTyU5LnD0DnAup1Q8xgXs85vvw2+DpyZZPPw78c5zH12fFRT9+sYqupgkkNTvI8Brl7JKd6rJcm1wNnAs5LsBd5aVVf1pprXi4BXA3cNn6kAvKWqbmzMNMk2YOcwY+gJzE3/n4ppzVNmK3DD3L8xbAI+UlWf7o20IG8APjz84Po14NLmPBMNBf8S4HXdWSapqluTXA/cDhwEvgS8f759pm4atiRpY5jGW3CSpA3AApIktbCAJEktLCBJUgsLSJIeZxbzcOQkL05ye5KDSX5nbPznhvE7hgeQ/sGiczgLTpIeXxbzcOThUVzPAP4E2FVV1w/jT2KuQ344PDHlbuCFVTXv89/GeQUkSY8zR3o4cpLnJPn08Hy/zyf5+WHbB6rqTh7zrMLhga4/HFafzBL6xAKSJMHc/zT6hqr6Zeaudt4zaYfh94bdydwDov9yMVc/MIVPQpAkrazhFtoLgY8NT7iAuauaeVXVQ8AvJXk28I9Jrl/MM+ssIEnSE4BHhifJL1pVfWOY0PBrzD3odcEHlSQ9jg2/auX+JK+CuYcRJ3n+fPsk2TE89PfQr2I4C/jqYo5rAUnS48zwcOR/A56bZG+Sy4DfBS5L8mXgHobfNJ3k9OEByq8C3pfk0MOffwG4ddj+X4F3VtVdi8rhNGxJUgevgCRJLSwgSVILC0iS1MICkiS1sIAkSS0sIElSCwtIktTCApIktfh/ATIiA2z/FR0AAAAASUVORK5CYII=\n", "text/plain": 
[ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.hist(gdp);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice the multiple warnings as Matplotlib tried to calculate the bin widths for the histogram. These are from the `NaN` values." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another way to do the histogram, is to use the `hist` *method* of the series. \n", "\n", "A method is a function attached to a value. In this case `hist` is a function attached to a value of type `Series`.\n", "\n", "Using the `hist` method instead of the `plt.hist` function can make the code a bit easier to read. The method also has the advantage that it discards the `NaN` values, by default, so it does not generate the same warnings." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAaAAAAECCAYAAAC44gO8AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAD09JREFUeJzt3X2MZXV9x/H3R9anRV0wtpt1lxRiJrbUVKRAqVJDS1WwRmhSCaRVQkm0CRrtQxr0H20aU5sY25pWowVkjYpFlLox1EqoqW5SkIrIo/ZuBWTXldUqo5ZGu/bbP+bs5oq7c+f5O3d4v5LJnPO759zzORt2P3PO/XEmVYUkSWvtCd0BJEmPTxaQJKmFBSRJamEBSZJaWECSpBabOg46Ozvr1DtJ2uC2bNmS+V73CkiS1MICkiS1mOoCGo1G3REWzcyrb9rygpnXipnXl6kuIEnS9LKAJEktLCBJUgsLSJLUwgKSJLWwgCRJLSwgSVKLlkfxrJTTd2+G3fu6Y/DIpdu7I0jS1PEKSJLUwgKSJLWwgCRJLSwgSVILC0iS1MICkiS1sIAkSS0sIElSCwtIktTCApIktbCAJEktLCBJUgsLSJLUwgKSJLWwgCRJLSwgSVILC0iS1MICkiS1sIAkSS0mFlCSE5J8Nsm9Se5J8sZh/JlJbkoyGr4fP4wnybuT7ElyZ5JTV/skJEnTZyFXQAeBP66qk4EzgcuTnAxcAdxcVTPAzcM6wHnAzPD1WuC9K55akjT1JhZQVe2vqtuH5e8D9wHbgfOBncNmO4ELhuXzgQ/WnFuA45JsW/HkkqSptqjPgJKcCLwAuBXYWlX7h5e+CWwdlrcDD43ttncYkyTpsE0L3TDJ04CPA2+qqu8lOfxaVVWSWkqA0Wi0lN0Gm5ex78pZ7Dks75x7TFvmacsLZl4rZl5dMzMzC952QQWU5InMlc+Hq+oTw/DDSbZV1f7hFtuBYXwfcMLY7juGsWWH/Sm7j/q2a2ox5zAajZZ3zg2mLfO05QUzrxUzry8LmQUX4Crgvqp619hLu4BLhuVLgE+Ojb9mmA13J
jA7dqtOkiRgYVdALwJeDdyV5I5h7C3AO4DrklwGPAhcOLx2I/ByYA/wKHDpiiaWJG0IEwuoqnYDOcrL5xxh+wIuX2YuSdIG55MQJEktLCBJUgsLSJLUwgKSJLWwgCRJLSwgSVILC0iS1MICkiS1sIAkSS0sIElSCwtIktTCApIktbCAJEktLCBJUgsLSJLUwgKSJLWwgCRJLSwgSVILC0iS1MICkiS1sIAkSS0sIElSCwtIktTCApIktbCAJEktLCBJUgsLSJLUwgKSJLWwgCRJLSwgSVILC0iS1MICkiS1sIAkSS0sIElSCwtIktTCApIktbCAJEktJhZQkquTHEhy99jY25LsS3LH8PXysdfenGRPkq8medlqBZckTbeFXAFdA5x7hPG/qqpThq8bAZKcDFwE/OKwz3uSHLNSYSVJG8fEAqqqzwHfWeD7nQ98tKp+WFX3A3uAM5aRT5K0QS3nM6DXJ7lzuEV3/DC2HXhobJu9w5gkST8hVTV5o+RE4FNV9bxhfSvwbaCAPwe2VdXvJ/lb4Jaq+tCw3VXAP1XV9ePvNzs7e/igo9FoyeFP3715yfuupNvOerQ7giStCzMzM4eXt2zZkvm23bSUA1TVw4eWk/w98KlhdR9wwtimO4axoxoPu2i7533rNbOYcxiNRss75wbTlnna8oKZ14qZ15cl3YJLsm1s9beBQzPkdgEXJXlykpOAGeALy4soSdqIJl4BJbkWOBt4VpK9wFuBs5OcwtwtuAeA1wFU1T1JrgPuBQ4Cl1fVj1cnuiRpmk0soKq6+AjDV82z/duBty8nlCRp4/NJCJKkFhaQJKmFBSRJamEBSZJaWECSpBYWkCSphQUkSWphAUmSWlhAkqQWFpAkqYUFJElqYQFJklpYQJKkFhaQJKmFBSRJamEBSZJaWECSpBYWkCSphQUkSWphAUmSWlhAkqQWFpAkqYUFJElqYQFJklpYQJKkFhaQJKmFBSRJamEBSZJaWECSpBYWkCSphQUkSWphAUmSWlhAkqQWFpAkqYUFJElqYQFJklpYQJKkFhMLKMnVSQ4kuXts7JlJbkoyGr4fP4wnybuT7ElyZ5JTVzO8JGl6LeQK6Brg3MeMXQHcXFUzwM3DOsB5wMzw9VrgvSsTU5K00UwsoKr6HPCdxwyfD+wclncCF4yNf7Dm3AIcl2TbSoWVJG0cS/0MaGtV7R+WvwlsHZa3Aw+Nbbd3GJMk6SdsWu4bVFUlqaXuPxqNlnH0zcvYd+Us9hyWd849pi3ztOUFM68VM6+umZmZBW+71AJ6OMm2qto/3GI7MIzvA04Y227HMHZUiwn7U3bP+9ZrZjHnMBqNlnfODaYt87TlBTOvFTOvL0u9BbcLuGRYvgT45Nj4a4bZcGcCs2O36iRJOmziFVCSa4GzgWcl2Qu8FXgHcF2Sy4AHgQuHzW8EXg7sAR4FLl2FzJKkDWBiAVXVxUd56ZwjbFvA5csNJUna+HwSgiSphQUkSWphAUmSWlhAkqQWFpAkqYUFJElqYQFJklpYQJKkFhaQJKmFBSRJamEBSZJaWECSpBYWkCSphQUkSWphAUmSWlhAkqQWFpAkqYUFJElqYQFJklpYQJKkFhaQJKmFBSRJamEBSZJaWECSpBYWkCSphQUkSWphAUmSWlhAkqQWFpAkqYUFJElqYQFJklpYQJKkFhaQJKmFBSRJamEBSZJaWECSpBYWkCSpxabl7JzkAeD7wI+Bg1V1WpJnAv8AnAg8AFxYVd9dXkxJ0kazEldAv15Vp1TVacP6FcDNVTUD3DysS5L0E1bjFtz5wM5heSdwwSocQ5I05VJVS985uR/4LlDA+6rq/UkeqarjhtcDfPfQ+iGzs7OHDzoajZZ8/NN3b17yvivptrMe7Y4gSevCzMzM4eUtW7Zkvm2X9RkQcFZV7Uvys8BNSb4y/mJVVZJ5G2487KLt3rf0fVfQYs5hNBot75wbTFvmacsLZl4rZl5flnULrqr2Dd8PA
DcAZwAPJ9kGMHw/sNyQkqSNZ8kFlOTYJE8/tAy8FLgb2AVcMmx2CfDJ5YaUJG08y7kFtxW4Ye5jHjYBH6mqTye5DbguyWXAg8CFy48pSdpollxAVfU14PlHGP8v4JzlhJIkbXw+CUGS1MICkiS1sIAkSS0sIElSCwtIktTCApIktbCAJEktLCBJUgsLSJLUwgKSJLWwgCRJLSwgSVILC0iS1MICkiS1sIAkSS0sIElSCwtIktTCApIktbCAJEktLCBJUgsLSJLUwgKSJLWwgCRJLSwgSVILC0iS1MICkiS1sIAkSS0sIElSCwtIktTCApIktbCAJEktLCBJUgsLSJLUwgKSJLXY1B1AK+e4D+xbpXfeDLsX/t6PXLp9lXJI2ki8ApIktfAKaAUs7spjcVcTkrRRrVoBJTkX+BvgGODKqnrHah1L68vq3QpcqLmS91agtL6tSgElOQb4O+AlwF7gtiS7qure1TiepIVZ2R8Oln417w8HAkhVrfybJr8KvK2qXjasvxmgqv4CYHZ2duUPKklaV7Zs2ZL5Xl+tSQjbgYfG1vcOY5IkAc6CkyQ1Wa1JCPuAE8bWdwxjwOTLMknSxrdaV0C3ATNJTkryJOAiYNcqHUuSNIVWpYCq6iDweuCfgfuA66rqnpV6/yTnJvlqkj1Jrlip911NSa5OciDJ3d1ZFiLJCUk+m+TeJPckeWN3pkmSPCXJF5J8ecj8Z92ZFirJMUm+lORT3VkWIskDSe5KckeSf+/OsxBJjktyfZKvJLlvmCy1biV57vDne+jre0ne1J1rPkn+cPi7d3eSa5M8Zd7tV2MW3Goapnj/B2NTvIGL1/sU7yQvBn4AfLCqntedZ5Ik24BtVXV7kqcDXwQuWM9/zkkCHFtVP0jyRGA38MaquqU52kRJ/gg4DXhGVb2iO88kSR4ATquqb3dnWagkO4HPV9WVw52ZzVX1SHeuhRj+3dsH/EpVPdid50iSbGfu79zJVfU/Sa4Dbqyqa462zzROQjgD2FNVX6uqHwEfBc5vzjRRVX0O+E53joWqqv1Vdfuw/H3mrmTX9UzGmvODYfWJw9e6/wkryQ7gt4Aru7NsVEm2AC8GrgKoqh9NS/kMzgH+c72Wz5hNwFOTbAI2A9+Yb+NpLCCneK+xJCcCLwBu7U0y2XAr6w7gAHBTVa37zMBfA38K/F93kEUo4DNJvpjktd1hFuAk4FvAB4ZbnVcmObY71CJcBFzbHWI+VbUPeCfwdWA/MFtVn5lvn2ksIK2hJE8DPg68qaq+151nkqr6cVWdwtzMyzOSrOvbnUleARyoqi92Z1mks6rqVOA84PLhFvN6tgk4FXhvVb0A+G9gWj4/fhLwSuBj3Vnmk+R45u5GnQQ8Gzg2ye/Nt880FtC8U7y1cobPUT4OfLiqPtGdZzGG2yufBc7tzjLBi4BXDp+pfBT4jSQf6o002fDTLlV1ALiBuVvj69leYO/YFfH1zBXSNDgPuL2qHu4OMsFvAvdX1beq6n+BTwAvnG+HaSwgp3ivgeED/auA+6rqXd15FiLJzyQ5blh+KnMTVb7Sm2p+VfXmqtpRVScy99/yv1TVvD81dkty7DAxheE21kuBdT27s6q+CTyU5LnD0DnAup1Q8xgXs85vvw2+DpyZZPPw78c5zH12fFRT9+sYqupgkkNTvI8Brl7JKd6rJcm1wNnAs5LsBd5aVVf1pprXi4BXA3cNn6kAvKWqbmzMNMk2YOcwY+gJzE3/n4ppzVNmK3DD3L8xbAI+UlWf7o20IG8APjz84Po14NLmPBMNBf8S4HXdWSapqluTXA/cDhwEvgS8f759pm4atiRpY5jGW3CSpA3AApIktbCAJEktLCBJUgsLSJIeZxbzcOQkL05ye5KDSX5nbPznhvE7hgeQ/sGiczgLTpIeXxbzcOThUVzPAP4E2FVV1w/jT2KuQ344PDHlbuCFVTXv89/GeQUkSY8zR3o4cpLnJPn08Hy/zyf5+WHbB6rqT
h7zrMLhga4/HFafzBL6xAKSJMHc/zT6hqr6Zeaudt4zaYfh94bdydwDov9yMVc/MIVPQpAkrazhFtoLgY8NT7iAuauaeVXVQ8AvJXk28I9Jrl/MM+ssIEnSE4BHhifJL1pVfWOY0PBrzD3odcEHlSQ9jg2/auX+JK+CuYcRJ3n+fPsk2TE89PfQr2I4C/jqYo5rAUnS48zwcOR/A56bZG+Sy4DfBS5L8mXgHobfNJ3k9OEByq8C3pfk0MOffwG4ddj+X4F3VtVdi8rhNGxJUgevgCRJLSwgSVILC0iS1MICkiS1sIAkSS0sIElSCwtIktTCApIktfh/ATIiA2z/FR0AAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "gdp.hist();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we have had a look at the GDP values, we will look at the\n", "values for the `mat_mort_ratio` column. These are the numbers\n", "of women who die in childbirth for every 100,000 births." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 444.00\n", "1 29.25\n", "2 142.50\n", "3 NaN\n", "4 NaN\n", "5 501.25\n", "6 NaN\n", "7 161.00\n", "8 53.75\n", "9 27.25\n", "10 NaN\n", "11 6.00\n", "12 4.00\n", "13 25.25\n", "14 81.50\n", "15 15.25\n", "16 194.75\n", "17 28.00\n", "18 4.00\n", "19 7.00\n", "20 29.25\n", "21 417.50\n", "22 NaN\n", "23 161.75\n", "24 218.25\n", "25 11.75\n", "26 138.75\n", "27 49.50\n", "28 NaN\n", "29 23.75\n", " ... \n", "233 62.00\n", "234 33.25\n", "235 429.50\n", "236 21.00\n", "237 240.25\n", "238 380.75\n", "239 129.25\n", "240 63.25\n", "241 63.25\n", "242 17.50\n", "243 43.50\n", "244 NaN\n", "245 NaN\n", "246 366.50\n", "247 24.25\n", "248 6.00\n", "249 9.25\n", "250 14.00\n", "251 43.25\n", "252 15.50\n", "253 37.00\n", "254 82.50\n", "255 97.00\n", "256 54.75\n", "257 NaN\n", "258 47.50\n", "259 223.75\n", "260 399.75\n", "261 233.75\n", "262 398.00\n", "Name: mat_mort_ratio, Length: 263, dtype: float64" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mmr = gender_data['mat_mort_ratio']\n", "mmr" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAZwAAAD1CAYAAABkzUMfAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAEldJREFUeJzt3X+M5Hd93/Hnq77Y5JJ2z8YKvfpOuiNZUTmoCRYgW1gRwikYF2FHQtQWCofjKmpDUxqQjA1SSP9AhSYKMVILSTFwqRwHxzi1ZYUS5+IEWQoXxwT/xpnDgH0nm6PFt61iKYmbd/+Yzx3D9s53NzP7mfnuPR/Sar/fz/c7M6/vd3fntd/vfHc2VYUkSRvtHyw6gCTpzGDhSJK6sHAkSV1YOJKkLiwcSVIXWxbxoGtra14aJ0mb3MrKSibnPcKRJHVh4UiSuhhs4YxGo0VHmIn5F2fI2cH8izTk7LD4/IMtHEnSsFg4kqQuLBxJUhcWjiSpCwtHktSFhSNJ6sLCkSR1sZC3tpmXbZ85tOgIABy59oJFR5CkpecRjiSpCwtHktSFhSNJ6sLCkSR1cdLCSfLpJIeTPHKcZe9LUknOb/NJ8vEkB5I8lOSijQgtSRqeUznC+Sxw+frBJDuBNwJPTQy/GVhtHz8PfGL2iJKkzeCkhVNVXwK+e5xFHwOuByb/e+eVwG/X2JeBbUm2zyWpJGnQpnoNJ8mVwKGqenDdoguApyfmD7YxSdIZ7rT/8DPJVuADjE+nzWzR/xBoHqbdhqFv+5DzDzk7mH+RhpwdNj7/6urqCZdN804DPwrsBh5MArAD+EqS1wKHgJ0T6+5oY1OFezHL9EWfZhtGo9HU274Mhpx/yNnB/Is05Oyw+PynfUqtqh6uqh+pql1VtYvxabOLqupZ4C7gne1qtYuBtap6Zr6RJUlDdCqXRd8K/BnwiiQHk1z3Iqv/AfAkcAD4r8AvzCWlJGnwTnpKraquOcnyXRPTBbx79liSpM3GdxqQJHVh4UiSurBwJEldWDiSpC4sHElSFxaOJKkLC0eS1IWFI0nqwsKRJHVh4UiSurBwJEldWDiSpC4sHElSFxaOJKkLC0eS1IWFI0nqwsKRJHVh4UiSurBwJEldnLRwknw6yeEkj0yM/WqSryV5KMnvJ9k2sezGJAeSPJHkTRsVXJI0LKdyhPNZ4PJ1Y/cAr6yqfwb8FXAjQJILgauBH2+3+S9JzppbWknSYJ20cKrqS8B31439YVW90Ga/DOxo01cCv1tVf1NV3wAOAK+dY15J0kDN4zWcnwO+0KYvAJ6eWHawjUmSznBbZrlxkg8CLwC3THsfo9FolghLYdptGPq2Dzn/kLOD+RdpyNlh4/Ovrq6ecNnUhZPkXcBbgMuqqtrwIWDnxGo72thU4V7MMn3Rp9mG0Wg09bYvgyHnH3J2MP8iDTk7LD7/VKfUklwOXA+8taqen1h0F3B1knOS7AZWgT+fPaYkaehOeoST5Fbg9cD5SQ4CH2J8Vdo5wD1JAL5cVf+6qh5NchvwGONTbe+uqv+7UeElScNx0sKpqmuOM3zzi6z/YeDDs4SSJG0+vtOAJKkLC0eS1IWFI0nqwsKRJHVh4UiSurBwJEldWDiSpC4sHElSFxaOJKkLC0eS1IWFI0nqwsKRJHVh4UiSurBwJEldWDiSpC4sHElSFxaOJKkLC0eS1IWFI0nqwsKRJHVx0sJJ8ukkh5M8MjF2XpJ7koza53PbeJJ8PMmBJA8luWgjw0uShuNUjnA+C1y+buwGYF9VrQL72jzAm4HV9vHzwCfmE1OSNHQnLZyq+hLw3XXDVwJ72/Re4KqJ8d+usS8D25Jsn1dYSdJwpapOvlKyC7i7ql7Z5o9U1bY2HeC5qtqW5G7gI1V1X1u2D3h/Vf3F5P2tra0de9DRaDR1+Nfct3Xq287T/Zc+v+gIkrQUVldXj02vrKxkctmWWe+8qirJyVvrBCbDnY5
ZimreptmG0Wg09bYvgyHnH3J2MP8iDTk7LD7/tFepffvoqbL2+XAbPwTsnFhvRxuTJJ3hpi2cu4A9bXoPcOfE+Dvb1WoXA2tV9cyMGSVJm8BJT6kluRV4PXB+koPAh4CPALcluQ74FvD2tvofAFcAB4DngWs3ILMkaYBOWjhVdc0JFl12nHULePesoSRJm4/vNCBJ6sLCkSR1YeFIkrqwcCRJXVg4kqQuLBxJUhcWjiSpCwtHktSFhSNJ6sLCkSR1YeFIkrqwcCRJXVg4kqQuLBxJUhcWjiSpCwtHktSFhSNJ6sLCkSR1YeFIkrqYqXCS/FKSR5M8kuTWJC9JsjvJ/iQHknwuydnzCitJGq6pCyfJBcC/A15dVa8EzgKuBj4KfKyqfgx4DrhuHkElScM26ym1LcAPJtkCbAWeAd4A3N6W7wWumvExJEmbwNSFU1WHgF8DnmJcNGvAA8CRqnqhrXYQuGDWkJKk4UtVTXfD5Fzg88C/BI4Av8f4yOZX2uk0kuwEvtBOuR2ztrZ27EFHo9F0yYHX3Ld16tvO0/2XPr/oCJK0FFZXV49Nr6ysZHLZlhnu96eBb1TVdwCS3AG8DtiWZEs7ytkBHDrVcKdjlqKat2m2YTQaTb3ty2DI+YecHcy/SEPODovPP8trOE8BFyfZmiTAZcBjwL3A29o6e4A7Z4soSdoMZnkNZz/jU2hfAR5u9/VbwPuB9yY5ALwUuHkOOSVJAzfLKTWq6kPAh9YNPwm8dpb7lSRtPr7TgCSpCwtHktSFhSNJ6sLCkSR1YeFIkrqwcCRJXVg4kqQuLBxJUhcWjiSpCwtHktSFhSNJ6sLCkSR1YeFIkrqwcCRJXVg4kqQuLBxJUhcWjiSpCwtHktSFhSNJ6sLCkSR1MVPhJNmW5PYkX0vyeJJLkpyX5J4ko/b53HmFlSQN16xHODcB/6Oq/inwE8DjwA3AvqpaBfa1eUnSGW7qwkmyAvwUcDNAVf1tVR0BrgT2ttX2AlfNGlKSNHypqulumPwk8FvAY4yPbh4A3gMcqqptbZ0Azx2dP2ptbe3Yg45Go+mSA6+5b+vUt52n+y99ftERJGkprK6uHpteWVnJ5LItM9zvFuAi4Beran+Sm1h3+qyqKsmLNtpkuNMxS1HN2zTbMBqNpt72ZTDk/EPODuZfpCFnh8Xnn+U1nIPAwara3+ZvZ1xA306yHaB9PjxbREnSZjB14VTVs8DTSV7Rhi5jfHrtLmBPG9sD3DlTQknSpjDLKTWAXwRuSXI28CRwLeMSuy3JdcC3gLfP+BiSpE1gpsKpqq8Crz7OostmuV9J0ubjOw1IkrqwcCRJXVg4kqQuLBxJUhcWjiSpCwtHktSFhSNJ6sLCkSR1YeFIkrqwcCRJXVg4kqQuLBxJUhcWjiSpCwtHktSFhSNJ6sLCkSR1YeFIkrqwcCRJXVg4kqQuZi6cJGcl+cskd7f53Un2JzmQ5HNJzp49piRp6OZxhPMe4PGJ+Y8CH6uqHwOeA66bw2NIkgZupsJJsgP4F8Cn2nyANwC3t1X2AlfN8hiSpM1h1iOc3wCuB/6+zb8UOFJVL7T5g8AFMz6GJGkT2DLtDZO8BThcVQ8kef209zMajaa96dKYdhuGvu1Dzj/k7GD+RRpydtj4/KurqydcNnXhAK8D3prkCuAlwD8CbgK2JdnSjnJ2AIemDfdilumLPs02jEajqbd9GQw5/5Czg/kXacjZYfH5pz6lVlU3VtWOqtoFXA38cVW9A7gXeFtbbQ9w58wpJUmDtxF/h/N+4L1JDjB+TefmDXgMSdLAzHJK7Ziq+hPgT9r0k8Br53G/kqTNw3cakCR1YeFIkrqwcCRJXVg4kqQuLBxJUhcWjiSpCwtHktSFhSNJ6sLCkSR1YeFIkrqwcCRJXVg4kqQu5vLmnWe6bZ950X/5cwJb4b5pbndiR671n6tKWl4e4UiSurBwJEldWDiSpC4sHElSFxaOJKkLC0eS1MXUhZNkZ5J7kzy
W5NEk72nj5yW5J8mofT53fnElSUM1yxHOC8D7qupC4GLg3UkuBG4A9lXVKrCvzUuSznBTF05VPVNVX2nT/wd4HLgAuBLY21bbC1w1a0hJ0vDN5TWcJLuAVwH7gZdV1TNt0bPAy+bxGJKkYUtVzXYHyQ8Dfwp8uKruSHKkqrZNLH+uqr7vdZy1tbVjDzoajaZ+7Nfct3Xq225G91/6/KIjSDrDra6uHpteWVnJ5LKZ3kstyQ8Anwduqao72vC3k2yvqmeSbAcOn2q40zFLUW1Wy1LAy/6ebqPRaOrvu2Vg/sUZcnZYfP5ZrlILcDPweFX9+sSiu4A9bXoPcOf08SRJm8UsRzivA34WeDjJV9vYB4CPALcluQ74FvD22SJqaKZ79+z5W/YjLelMM3XhVNV9QE6w+LJp71eStDn5TgOSpC4sHElSFxaOJKkLC0eS1IWFI0nqwsKRJHVh4UiSurBwJEldWDiSpC4sHElSFxaOJKkLC0eS1MVM/w9HWmYnftfqrXBf33e09p2rJQtH6mK+/7Jh+sK0+LRInlKTJHVh4UiSurBwJEldWDiSpC4sHElSFxaOJKmLDbssOsnlwE3AWcCnquojG/VYkk7NfC/PntZWjqwuOoMWYUOOcJKcBfxn4M3AhcA1SS7ciMeSJA1Dqmr+d5pcAvxKVb2pzd8IUFX/EWBtbW3+DypJWiorKyuZnN+o13AuAJ6emD/YxiRJZygvGpAkdbFRFw0cAnZOzO9oY8D/f5glSdr8NuoI535gNcnuJGcDVwN3bdBjSZIGYEMKp6peAP4t8EXgceC2qnp0Xvef5PIkTyQ5kOSGed3vvCTZmeTeJI8leTTJe9r4eUnuSTJqn89t40ny8bY9DyW5aLFbMJbkrCR/meTuNr87yf6W83PtlwmSnNPmD7TluxaZu2XaluT2JF9L8niSS4ay/5P8Uvu+eSTJrUlessz7PsmnkxxO8sjE2Gnv6yR72vqjJHsWnP9X2/fOQ0l+P8m2iWU3tvxPJHnTxPhCnpeOl39i2fuSVJLz2/xi939VDeqD8d/1fB14OXA28CBw4aJzrcu4HbioTf9D4K8YXx7+n4Ab2vgNwEfb9BXAF4AAFwP7F70NLdd7gd8B7m7ztwFXt+lPAv+mTf8C8Mk2fTXwuSXIvhf4V236bGDbEPY/44trvgH84MQ+f9cy73vgp4CLgEcmxk5rXwPnAU+2z+e26XMXmP+NwJY2/dGJ/Be255xzgN3tueisRT4vHS9/G9/J+Jf+bwHnL8P+X8gP1Yw79xLgixPzNwI3LjrXSTLfCfxz4AlgexvbDjzRpn8TuGZi/WPrLTDzDmAf8Abg7vYN+j8nfgiPfR3aN/UlbXpLWy8LzL7SnrSzbnzp9z/fu8LzvLYv7wbetOz7Hti17gn7tPY1cA3wmxPj37de7/zrlv0McEub/r7nm6P7f9HPS8fLD9wO/ATwTb5XOAvd/0O8Sm1Ql1y3UxyvAvYDL6uqZ9qiZ4GXtell3KbfAK4H/r7NvxQ4UuPTpfD9GY/lb8vX2vqLshv4DvCZdkrwU0l+iAHs/6o6BPwa8BTwDON9+QDD2fdHne6+XpqvwXH8HOOjAhhI/iRXAoeq6sF1ixaaf4iFMxhJfhj4PPDvq+p/Ty6r8a8RS/kHsEneAhyuqgcWnWVKWxifYvhEVb0K+GvGp3WOWdb9317ruJJxaf4T4IeAyxcaakbLuq9PRZIPAi8Atyw6y6lKshX4APDLi86y3hAL50UvuV4WSX6AcdncUlV3tOFvJ9nelm8HDrfxZdum1wFvTfJN4HcZn1a7CdiW5Oil9JMZj+Vvy1eA/9Uz8DoHgYNVtb/N3864gIaw/38a+EZVfaeq/g64g/HXYyj7/qjT3dfL9DUAIMm7gLcA72ilCcPI/6OMf2F5sP0M7wC+kuQfs+D8Qyycpb/kOkmAm4HHq+rXJxbdBRy9+mMP49d2jo6/s11BcjGwNnE6oruqurGqdlTVLsb794+
r6h3AvcDb2mrr8x/drre19Rf2G21VPQs8neQVbegy4DGGsf+fAi5OsrV9Hx3NPoh9P+F09/UXgTcmObcd5b2xjS1Exm8+fD3w1qp6fmLRXcDV7erA3cAq8Ocs0fNSVT1cVT9SVbvaz/BBxhcxPcui93+vF7Xm/ALZFYyv/Po68MFF5zlOvksZn0J4CPhq+7iC8bn1fcAI+CPgvLZ+GL/Z6deBh4FXL3obJrbl9XzvKrWXM/7hOgD8HnBOG39Jmz/Qlr98CXL/JPAX7Wvw3xlfeTOI/Q/8B+BrwCPAf2N8RdTS7nvgVsavN/0d4ye366bZ14xfKznQPq5dcP4DjF/TOPrz+8mJ9T/Y8j8BvHlifCHPS8fLv275N/neRQML3f8b8uadkiStN8RTapKkAbJwJEldWDiSpC4sHElSFxaOJKkLC0eS1IWFI0nqwsKRJHXx/wCHKTiER81FaAAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "mmr.hist();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We are interested in the relationship of `gdp` and `mmr`. Maybe richer countries have better health care, and fewer maternal deaths.\n", "\n", "Here is a plot, using the standard Matplotlib `scatter`\n", "function." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.scatter(gdp, mmr);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can do the same plot using the `plot.scatter` method on the data frame. In that case, we specify the column names that should go on the x and the y axes." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "gender_data.plot.scatter('gdp', 'mat_mort_ratio');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An advantage of doing it this way is that we get the column names on the x and y axes by default." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Showing the top 5 values with the `head` method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have already seen that Pandas will display the data frame with nice formatting. If the data frame is long, it will display only the first few and the last few rows:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryfert_rategdphealth_exp_per_caphealth_exp_pubprim_ed_girlsmat_mort_ratiopopulation
0Afghanistan4.9545001.996102e+10161.1380342.83459840.109708444.003.271584e+07
1Albania1.7692501.232759e+10574.2026942.83602147.20108229.252.888280e+06
2Algeria2.8660001.907346e+11870.7665084.98425247.675617142.503.909906e+07
3American SamoaNaN6.405000e+08NaNNaNNaNNaN5.542200e+04
4AndorraNaN3.197538e+094421.2249337.26028147.123345NaN7.954740e+04
5Angola6.1230001.119365e+11254.7479702.447546NaN501.252.693754e+07
6Antigua and Barbuda2.0820001.298213e+091152.4936563.67651448.291463NaN9.887240e+04
7Arab World3.3975872.709059e+12761.4017272.87384047.119776161.003.899620e+08
8Argentina2.3280005.509810e+111148.2561422.78221648.91581053.754.297667e+07
9Armenia1.5455001.088536e+10348.6638841.91601646.78218027.252.904683e+06
10Aruba1.663250NaNNaNNaN48.721939NaN1.037444e+05
11Australia1.8615001.422994e+124256.0589886.29238148.5767076.002.344456e+07
12Austria1.4550004.074943e+114930.2988938.50427648.5560784.008.566294e+06
13Azerbaijan1.9800006.200300e+10956.7097181.19724946.15736325.259.531856e+06
14Bahamas, The1.8772508.688000e+091727.1283853.308626NaN81.503.819036e+05
15Bahrain2.0652503.200401e+102030.1583162.97638649.11683815.251.349810e+06
16Bangladesh2.1932501.745451e+1185.9688440.86044750.460564194.751.593712e+08
17Barbados1.7922504.413080e+091062.8400884.82868048.87818128.002.833384e+05
18Belarus1.6770006.478294e+10986.2367573.87660148.6857414.009.480348e+06
19Belgium1.7550004.942218e+114297.8380058.22100348.8646757.001.122850e+07
20Belize2.5947501.680325e+09471.9674653.74484448.31723829.253.517636e+05
21Benin4.8067508.778151e+0983.7261902.20691647.211127417.501.029371e+07
22Bermuda1.6175005.555624e+09NaNNaN48.423588NaN6.510080e+04
23Bhutan2.0612501.975145e+09277.5266702.70690849.572296161.757.759054e+05
24Bolivia2.9952503.150932e+10381.0075944.19203148.464175218.251.056280e+07
25Bosnia and Herzegovina1.2670001.732333e+10941.5046556.84102148.63490511.753.574396e+06
26Botswana2.8450001.511339e+10880.9092023.55207148.844009138.752.169170e+06
27Brazil1.7952502.198766e+121303.1991043.77347347.78457749.502.041595e+08
28British Virgin IslandsNaNNaNNaNNaN47.581520NaN2.958540e+04
29Brunei Darussalam1.8840001.571922e+101795.9241602.33519448.52369923.754.115812e+05
...........................
233Syrian Arab Republic2.967750NaN269.9457391.50716648.04739462.001.931967e+07
234Tajikistan3.4957508.036228e+09169.7459701.97636748.26068033.258.363844e+06
235Tanzania5.1812504.493554e+10131.7041622.64860950.666580429.505.228132e+07
236Thailand1.5167504.061369e+11581.9274873.18384248.21303421.006.838499e+07
237Timor-Leste5.7977501.361430e+0998.5772961.14044048.337367240.251.212718e+06
238Togo4.6200004.183610e+0971.2638252.03780948.270471380.757.230904e+06
239Tonga3.7457504.391789e+08250.9625043.98728547.697931129.251.059094e+05
240Trinidad and Tobago1.7827502.457095e+101778.1480733.071370NaN63.251.353877e+06
241Tunisia2.1400004.482437e+10782.9505224.11877148.14213263.251.114441e+07
242Turkey2.0780008.951756e+11997.3747724.18952148.78947717.507.703435e+07
243Turkmenistan2.3137503.797310e+10288.5726441.34930348.90687943.505.465637e+06
244Turks and Caicos IslandsNaNNaNNaNNaN48.846884NaN3.370340e+04
245TuvaluNaN3.646999e+07563.50059215.50692947.472414NaN1.091000e+04
246Uganda5.8225002.594146e+10132.8926842.01434950.099485366.503.886534e+07
247Ukraine1.5102501.353793e+11628.5792543.96018548.98419824.254.530270e+07
248United Arab Emirates1.7930003.750271e+112202.4075692.58116848.7892606.009.080299e+06
249United Kingdom1.8425002.768864e+123357.9836757.72065548.7918099.256.464156e+07
250United States1.8608751.736912e+139060.0686578.12196148.75883014.003.185582e+08
251Upper middle income1.7952442.097441e+13870.8975123.35815347.11200143.252.540966e+09
252Uruguay2.0270005.434513e+101721.5077526.04440348.29555515.503.419977e+06
253Uzbekistan2.3727506.134065e+10334.4767543.11884248.38743437.003.078450e+07
254Vanuatu3.3647507.828760e+08125.5687123.68987447.30161782.502.588964e+05
255Venezuela, RB2.3782503.761463e+11896.8153141.58708848.40093497.003.073452e+07
256Vietnam1.9595001.818207e+11368.3745503.77950148.02105354.759.074240e+07
257Virgin Islands (U.S.)1.7600003.812000e+09NaNNaNNaNNaN1.041414e+05
258West Bank and Gaza4.2080001.250822e+10NaNNaN48.82852047.504.296960e+06
259World2.4642827.613006e+131223.9412435.94705848.076575223.757.269321e+09
260Yemen, Rep.4.2257503.681934e+10207.9497001.41783644.470076399.752.624661e+07
261Zambia5.3942502.428099e+10185.5563592.68729049.934484233.751.563322e+07
262Zimbabwe3.9430001.549551e+10115.5198812.69518849.529875398.001.542096e+07
\n", "

263 rows × 8 columns

\n", "
" ], "text/plain": [ " country fert_rate gdp health_exp_per_cap \\\n", "0 Afghanistan 4.954500 1.996102e+10 161.138034 \n", "1 Albania 1.769250 1.232759e+10 574.202694 \n", "2 Algeria 2.866000 1.907346e+11 870.766508 \n", "3 American Samoa NaN 6.405000e+08 NaN \n", "4 Andorra NaN 3.197538e+09 4421.224933 \n", "5 Angola 6.123000 1.119365e+11 254.747970 \n", "6 Antigua and Barbuda 2.082000 1.298213e+09 1152.493656 \n", "7 Arab World 3.397587 2.709059e+12 761.401727 \n", "8 Argentina 2.328000 5.509810e+11 1148.256142 \n", "9 Armenia 1.545500 1.088536e+10 348.663884 \n", "10 Aruba 1.663250 NaN NaN \n", "11 Australia 1.861500 1.422994e+12 4256.058988 \n", "12 Austria 1.455000 4.074943e+11 4930.298893 \n", "13 Azerbaijan 1.980000 6.200300e+10 956.709718 \n", "14 Bahamas, The 1.877250 8.688000e+09 1727.128385 \n", "15 Bahrain 2.065250 3.200401e+10 2030.158316 \n", "16 Bangladesh 2.193250 1.745451e+11 85.968844 \n", "17 Barbados 1.792250 4.413080e+09 1062.840088 \n", "18 Belarus 1.677000 6.478294e+10 986.236757 \n", "19 Belgium 1.755000 4.942218e+11 4297.838005 \n", "20 Belize 2.594750 1.680325e+09 471.967465 \n", "21 Benin 4.806750 8.778151e+09 83.726190 \n", "22 Bermuda 1.617500 5.555624e+09 NaN \n", "23 Bhutan 2.061250 1.975145e+09 277.526670 \n", "24 Bolivia 2.995250 3.150932e+10 381.007594 \n", "25 Bosnia and Herzegovina 1.267000 1.732333e+10 941.504655 \n", "26 Botswana 2.845000 1.511339e+10 880.909202 \n", "27 Brazil 1.795250 2.198766e+12 1303.199104 \n", "28 British Virgin Islands NaN NaN NaN \n", "29 Brunei Darussalam 1.884000 1.571922e+10 1795.924160 \n", ".. ... ... ... ... 
\n", "233 Syrian Arab Republic 2.967750 NaN 269.945739 \n", "234 Tajikistan 3.495750 8.036228e+09 169.745970 \n", "235 Tanzania 5.181250 4.493554e+10 131.704162 \n", "236 Thailand 1.516750 4.061369e+11 581.927487 \n", "237 Timor-Leste 5.797750 1.361430e+09 98.577296 \n", "238 Togo 4.620000 4.183610e+09 71.263825 \n", "239 Tonga 3.745750 4.391789e+08 250.962504 \n", "240 Trinidad and Tobago 1.782750 2.457095e+10 1778.148073 \n", "241 Tunisia 2.140000 4.482437e+10 782.950522 \n", "242 Turkey 2.078000 8.951756e+11 997.374772 \n", "243 Turkmenistan 2.313750 3.797310e+10 288.572644 \n", "244 Turks and Caicos Islands NaN NaN NaN \n", "245 Tuvalu NaN 3.646999e+07 563.500592 \n", "246 Uganda 5.822500 2.594146e+10 132.892684 \n", "247 Ukraine 1.510250 1.353793e+11 628.579254 \n", "248 United Arab Emirates 1.793000 3.750271e+11 2202.407569 \n", "249 United Kingdom 1.842500 2.768864e+12 3357.983675 \n", "250 United States 1.860875 1.736912e+13 9060.068657 \n", "251 Upper middle income 1.795244 2.097441e+13 870.897512 \n", "252 Uruguay 2.027000 5.434513e+10 1721.507752 \n", "253 Uzbekistan 2.372750 6.134065e+10 334.476754 \n", "254 Vanuatu 3.364750 7.828760e+08 125.568712 \n", "255 Venezuela, RB 2.378250 3.761463e+11 896.815314 \n", "256 Vietnam 1.959500 1.818207e+11 368.374550 \n", "257 Virgin Islands (U.S.) 1.760000 3.812000e+09 NaN \n", "258 West Bank and Gaza 4.208000 1.250822e+10 NaN \n", "259 World 2.464282 7.613006e+13 1223.941243 \n", "260 Yemen, Rep. 
4.225750 3.681934e+10 207.949700 \n", "261 Zambia 5.394250 2.428099e+10 185.556359 \n", "262 Zimbabwe 3.943000 1.549551e+10 115.519881 \n", "\n", " health_exp_pub prim_ed_girls mat_mort_ratio population \n", "0 2.834598 40.109708 444.00 3.271584e+07 \n", "1 2.836021 47.201082 29.25 2.888280e+06 \n", "2 4.984252 47.675617 142.50 3.909906e+07 \n", "3 NaN NaN NaN 5.542200e+04 \n", "4 7.260281 47.123345 NaN 7.954740e+04 \n", "5 2.447546 NaN 501.25 2.693754e+07 \n", "6 3.676514 48.291463 NaN 9.887240e+04 \n", "7 2.873840 47.119776 161.00 3.899620e+08 \n", "8 2.782216 48.915810 53.75 4.297667e+07 \n", "9 1.916016 46.782180 27.25 2.904683e+06 \n", "10 NaN 48.721939 NaN 1.037444e+05 \n", "11 6.292381 48.576707 6.00 2.344456e+07 \n", "12 8.504276 48.556078 4.00 8.566294e+06 \n", "13 1.197249 46.157363 25.25 9.531856e+06 \n", "14 3.308626 NaN 81.50 3.819036e+05 \n", "15 2.976386 49.116838 15.25 1.349810e+06 \n", "16 0.860447 50.460564 194.75 1.593712e+08 \n", "17 4.828680 48.878181 28.00 2.833384e+05 \n", "18 3.876601 48.685741 4.00 9.480348e+06 \n", "19 8.221003 48.864675 7.00 1.122850e+07 \n", "20 3.744844 48.317238 29.25 3.517636e+05 \n", "21 2.206916 47.211127 417.50 1.029371e+07 \n", "22 NaN 48.423588 NaN 6.510080e+04 \n", "23 2.706908 49.572296 161.75 7.759054e+05 \n", "24 4.192031 48.464175 218.25 1.056280e+07 \n", "25 6.841021 48.634905 11.75 3.574396e+06 \n", "26 3.552071 48.844009 138.75 2.169170e+06 \n", "27 3.773473 47.784577 49.50 2.041595e+08 \n", "28 NaN 47.581520 NaN 2.958540e+04 \n", "29 2.335194 48.523699 23.75 4.115812e+05 \n", ".. ... ... ... ... 
\n", "233 1.507166 48.047394 62.00 1.931967e+07 \n", "234 1.976367 48.260680 33.25 8.363844e+06 \n", "235 2.648609 50.666580 429.50 5.228132e+07 \n", "236 3.183842 48.213034 21.00 6.838499e+07 \n", "237 1.140440 48.337367 240.25 1.212718e+06 \n", "238 2.037809 48.270471 380.75 7.230904e+06 \n", "239 3.987285 47.697931 129.25 1.059094e+05 \n", "240 3.071370 NaN 63.25 1.353877e+06 \n", "241 4.118771 48.142132 63.25 1.114441e+07 \n", "242 4.189521 48.789477 17.50 7.703435e+07 \n", "243 1.349303 48.906879 43.50 5.465637e+06 \n", "244 NaN 48.846884 NaN 3.370340e+04 \n", "245 15.506929 47.472414 NaN 1.091000e+04 \n", "246 2.014349 50.099485 366.50 3.886534e+07 \n", "247 3.960185 48.984198 24.25 4.530270e+07 \n", "248 2.581168 48.789260 6.00 9.080299e+06 \n", "249 7.720655 48.791809 9.25 6.464156e+07 \n", "250 8.121961 48.758830 14.00 3.185582e+08 \n", "251 3.358153 47.112001 43.25 2.540966e+09 \n", "252 6.044403 48.295555 15.50 3.419977e+06 \n", "253 3.118842 48.387434 37.00 3.078450e+07 \n", "254 3.689874 47.301617 82.50 2.588964e+05 \n", "255 1.587088 48.400934 97.00 3.073452e+07 \n", "256 3.779501 48.021053 54.75 9.074240e+07 \n", "257 NaN NaN NaN 1.041414e+05 \n", "258 NaN 48.828520 47.50 4.296960e+06 \n", "259 5.947058 48.076575 223.75 7.269321e+09 \n", "260 1.417836 44.470076 399.75 2.624661e+07 \n", "261 2.687290 49.934484 233.75 1.563322e+07 \n", "262 2.695188 49.529875 398.00 1.542096e+07 \n", "\n", "[263 rows x 8 columns]" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gender_data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice the `...` in the center of this listing, to show that it has not printed some rows." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sometimes we do not want to see all these rows, but only - say - the top five rows. 
The `head` method of the data frame is a useful way to do this:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countryfert_rategdphealth_exp_per_caphealth_exp_pubprim_ed_girlsmat_mort_ratiopopulation
0Afghanistan4.954501.996102e+10161.1380342.83459840.109708444.0032715838.4
1Albania1.769251.232759e+10574.2026942.83602147.20108229.252888280.2
2Algeria2.866001.907346e+11870.7665084.98425247.675617142.5039099060.4
3American SamoaNaN6.405000e+08NaNNaNNaNNaN55422.0
4AndorraNaN3.197538e+094421.2249337.26028147.123345NaN79547.4
\n", "
" ], "text/plain": [ " country fert_rate gdp health_exp_per_cap \\\n", "0 Afghanistan 4.95450 1.996102e+10 161.138034 \n", "1 Albania 1.76925 1.232759e+10 574.202694 \n", "2 Algeria 2.86600 1.907346e+11 870.766508 \n", "3 American Samoa NaN 6.405000e+08 NaN \n", "4 Andorra NaN 3.197538e+09 4421.224933 \n", "\n", " health_exp_pub prim_ed_girls mat_mort_ratio population \n", "0 2.834598 40.109708 444.00 32715838.4 \n", "1 2.836021 47.201082 29.25 2888280.2 \n", "2 4.984252 47.675617 142.50 39099060.4 \n", "3 NaN NaN NaN 55422.0 \n", "4 7.260281 47.123345 NaN 79547.4 " ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gender_data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `Series` also has a `head` method that does the same thing:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 1.996102e+10\n", "1 1.232759e+10\n", "2 1.907346e+11\n", "3 6.405000e+08\n", "4 3.197538e+09\n", "Name: gdp, dtype: float64" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gdp.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Selecting rows" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We often want to select rows from the data frame that match some criterion.\n", "\n", "Say we want to select the rows corresponding to the countries with a high GDP.\n", "\n", "Looking at the histogram of `gdp` above, we could try `1e13` as a threshold to identify high GDP countries."
] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 False\n", "1 False\n", "2 False\n", "3 False\n", "4 False\n", "5 False\n", "6 False\n", "7 False\n", "8 False\n", "9 False\n", "10 False\n", "11 False\n", "12 False\n", "13 False\n", "14 False\n", "15 False\n", "16 False\n", "17 False\n", "18 False\n", "19 False\n", "20 False\n", "21 False\n", "22 False\n", "23 False\n", "24 False\n", "25 False\n", "26 False\n", "27 False\n", "28 False\n", "29 False\n", " ... \n", "233 False\n", "234 False\n", "235 False\n", "236 False\n", "237 False\n", "238 False\n", "239 False\n", "240 False\n", "241 False\n", "242 False\n", "243 False\n", "244 False\n", "245 False\n", "246 False\n", "247 False\n", "248 False\n", "249 False\n", "250 True\n", "251 True\n", "252 False\n", "253 False\n", "254 False\n", "255 False\n", "256 False\n", "257 False\n", "258 False\n", "259 True\n", "260 False\n", "261 False\n", "262 False\n", "Name: gdp, Length: 263, dtype: bool" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "high_gdp = gdp > 1e13\n", "high_gdp" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pandas.core.series.Series" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(high_gdp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that `high_gdp` is a Boolean series, like the Boolean arrays you have already seen. It has `True` for elements corresponding to countries with `gdp` value greater than `1e13` and `False` otherwise.\n", "\n", "We can use this Boolean series to select rows from the data frame. The `loc` attribute of the data frame allows us to *LOCate* values in the data frame. For our Boolean series, it works like this:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
\n", "
" ], "text/plain": [ " country fert_rate gdp \\\n", "44 China 1.558750 1.018279e+13 \n", "60 Early-demographic dividend 2.636376 1.019283e+13 \n", "61 East Asia & Pacific 1.781424 2.168128e+13 \n", "62 East Asia & Pacific (IDA & IBRD) 1.811850 1.239991e+13 \n", "63 East Asia & Pacific (excluding high income) 1.813950 1.242383e+13 \n", "71 Euro area 1.551004 1.255692e+13 \n", "72 Europe & Central Asia 1.738094 2.191519e+13 \n", "75 European Union 1.570012 1.731910e+13 \n", "98 High income 1.686585 4.884635e+13 \n", "102 IBRD only 2.103185 2.607726e+13 \n", "103 IDA & IBRD total 2.587557 2.803020e+13 \n", "128 Late-demographic dividend 1.667820 1.862014e+13 \n", "140 Low & middle income 2.595258 2.724634e+13 \n", "160 Middle income 2.361746 2.691014e+13 \n", "177 North America 1.834404 1.908336e+13 \n", "180 OECD members 1.749418 4.787743e+13 \n", "193 Post-demographic dividend 1.636470 4.541806e+13 \n", "250 United States 1.860875 1.736912e+13 \n", "251 Upper middle income 1.795244 2.097441e+13 \n", "259 World 2.464282 7.613006e+13 \n", "\n", " health_exp_per_cap health_exp_pub prim_ed_girls mat_mort_ratio \\\n", "44 657.748859 3.015530 46.297964 28.75 \n", "60 392.428268 2.595967 48.651143 169.00 \n", "61 835.974259 4.687596 47.212490 63.00 \n", "62 558.711100 2.815573 47.098031 66.75 \n", "63 558.702327 2.815498 47.115173 66.75 \n", "71 3913.466364 7.956080 48.610030 6.50 \n", "72 2518.566323 7.130694 48.653599 16.75 \n", "75 3448.910224 7.816628 48.658777 8.00 \n", "98 5045.885008 7.602022 48.701030 10.00 \n", "102 625.357428 3.104682 48.026892 106.25 \n", "103 507.044258 3.013687 47.896337 244.75 \n", "128 838.259199 3.335746 47.046383 36.00 \n", "140 498.193061 2.982075 NaN 245.25 \n", "160 542.340940 2.996480 48.044140 185.75 \n", "177 8615.535450 8.066182 48.708683 13.50 \n", "180 4566.959377 7.636198 48.704364 15.00 \n", "193 5124.214162 7.858170 48.649863 10.75 \n", "250 9060.068657 8.121961 48.758830 14.00 \n", "251 870.897512 3.358153 47.112001 43.25 \n", 
"259 1223.941243 5.947058 48.076575 223.75 \n", "\n", " population \n", "44 1.364446e+09 \n", "60 3.083697e+09 \n", "61 2.265974e+09 \n", "62 1.996942e+09 \n", "63 2.022090e+09 \n", "71 3.384615e+08 \n", "72 9.032073e+08 \n", "75 5.082110e+08 \n", "98 1.175934e+09 \n", "102 4.607548e+09 \n", "103 6.113147e+09 \n", "128 2.235352e+09 \n", "140 6.089148e+09 \n", "160 5.468296e+09 \n", "177 3.541404e+08 \n", "180 1.273149e+09 \n", "193 1.092637e+09 \n", "250 3.185582e+08 \n", "251 2.540966e+09 \n", "259 7.269321e+09 " ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rich_gender_data = gender_data.loc[high_gdp]\n", "rich_gender_data" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pandas.core.frame.DataFrame" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(rich_gender_data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`rich_gender_data` is a new data frame, that is a subset of the\n", "original `gender_data` frame. It contains only the rows where\n", "the GDP value is greater than `1e13` dollars. Check the display\n", "of `rich_gender_data` above to confirm that the values in the\n", "`gdp` column are all greater than `1e13`.\n", "\n", "We can do a scatter plot of GDP values against maternal\n", "mortality rate, and we find, oddly, that for rich countries,\n", "there is little relationship between GDP and maternal mortality." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "rich_gender_data.plot.scatter('gdp', 'mat_mort_ratio')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sorting data frames" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Data frames have a method `sort_value`. This returns a new data frame with the rows sorted by the values in the column we specify.\n", "\n", "Here are the first five rows of the original data frame:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
\n", "
" ], "text/plain": [ " country fert_rate gdp health_exp_per_cap \\\n", "0 Afghanistan 4.95450 1.996102e+10 161.138034 \n", "1 Albania 1.76925 1.232759e+10 574.202694 \n", "2 Algeria 2.86600 1.907346e+11 870.766508 \n", "3 American Samoa NaN 6.405000e+08 NaN \n", "4 Andorra NaN 3.197538e+09 4421.224933 \n", "\n", " health_exp_pub prim_ed_girls mat_mort_ratio population \n", "0 2.834598 40.109708 444.00 32715838.4 \n", "1 2.836021 47.201082 29.25 2888280.2 \n", "2 4.984252 47.675617 142.50 39099060.4 \n", "3 NaN NaN NaN 55422.0 \n", "4 7.260281 47.123345 NaN 79547.4 " ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gender_data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can make a new data frame where the rows are sorted by the values in the `gdp` column:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
\n", "
" ], "text/plain": [ " country fert_rate gdp health_exp_per_cap \\\n", "245 Tuvalu NaN 3.646999e+07 563.500592 \n", "169 Nauru NaN 1.063908e+08 611.020856 \n", "121 Kiribati 3.74525 1.774306e+08 179.372383 \n", "152 Marshall Islands NaN 1.843189e+08 657.059335 \n", "185 Palau 2.22000 2.548400e+08 1389.993077 \n", "\n", " health_exp_pub prim_ed_girls mat_mort_ratio population \n", "245 15.506929 47.472414 NaN 10910.0 \n", "169 4.527932 49.409439 NaN 11695.4 \n", "121 8.279184 49.557255 95.0 110481.6 \n", "152 14.302529 48.397282 NaN 52882.8 \n", "185 6.623460 46.425690 NaN 21112.6 " ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gender_data_by_gdp = gender_data.sort_values('gdp')\n", "gender_data_by_gdp.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that the rows are in *ascending* order of `gdp`. You can imagine, that we often want *descending* order. As usual you can explore how you might do this by looking at the help for the `sort_values` method with:\n", "\n", "```\n", "gender_data.sort_values?\n", "```\n", "\n", "in a new cell. If you do that, you discover the `ascending` argument, that you can use like this:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
\n", "

263 rows × 8 columns

\n", "
" ], "text/plain": [ " country fert_rate \\\n", "259 World 2.464282 \n", "98 High income 1.686585 \n", "180 OECD members 1.749418 \n", "193 Post-demographic dividend 1.636470 \n", "103 IDA & IBRD total 2.587557 \n", "140 Low & middle income 2.595258 \n", "160 Middle income 2.361746 \n", "102 IBRD only 2.103185 \n", "72 Europe & Central Asia 1.738094 \n", "61 East Asia & Pacific 1.781424 \n", "251 Upper middle income 1.795244 \n", "177 North America 1.834404 \n", "128 Late-demographic dividend 1.667820 \n", "250 United States 1.860875 \n", "75 European Union 1.570012 \n", "71 Euro area 1.551004 \n", "63 East Asia & Pacific (excluding high income) 1.813950 \n", "62 East Asia & Pacific (IDA & IBRD) 1.811850 \n", "60 Early-demographic dividend 2.636376 \n", "44 China 1.558750 \n", "142 Lower middle income 2.877481 \n", "129 Latin America & Caribbean 2.125164 \n", "130 Latin America & Caribbean (IDA & IBRD) 2.138122 \n", "131 Latin America & Caribbean (excluding high income) 2.140742 \n", "117 Japan 1.430000 \n", "73 Europe & Central Asia (IDA & IBRD) 1.861440 \n", "74 Europe & Central Asia (excluding high income) 1.909466 \n", "85 Germany 1.450000 \n", "157 Middle East & North Africa 2.852531 \n", "249 United Kingdom 1.842500 \n", ".. ... ... \n", "254 Vanuatu 3.364750 \n", "224 St. Vincent and the Grenadines 1.986000 \n", "3 American Samoa NaN \n", "46 Comoros 4.524000 \n", "58 Dominica NaN \n", "239 Tonga 3.745750 \n", "156 Micronesia, Fed. Sts. 3.269250 \n", "202 Sao Tome and Principe 4.603750 \n", "185 Palau 2.220000 \n", "152 Marshall Islands NaN \n", "121 Kiribati 3.745250 \n", "169 Nauru NaN \n", "245 Tuvalu NaN \n", "10 Aruba 1.663250 \n", "28 British Virgin Islands NaN \n", "38 Cayman Islands NaN \n", "42 Channel Islands 1.462250 \n", "53 Curacao 2.050000 \n", "68 Eritrea 4.324500 \n", "81 French Polynesia 2.051000 \n", "87 Gibraltar NaN \n", "122 Korea, Dem. People's Rep. 
1.982500 \n", "137 Libya 2.485750 \n", "162 Monaco NaN \n", "172 New Caledonia 2.250000 \n", "201 San Marino 1.260000 \n", "209 Sint Maarten (Dutch part) NaN \n", "223 St. Martin (French part) 1.812500 \n", "233 Syrian Arab Republic 2.967750 \n", "244 Turks and Caicos Islands NaN \n", "\n", " gdp health_exp_per_cap health_exp_pub prim_ed_girls \\\n", "259 7.613006e+13 1223.941243 5.947058 48.076575 \n", "98 4.884635e+13 5045.885008 7.602022 48.701030 \n", "180 4.787743e+13 4566.959377 7.636198 48.704364 \n", "193 4.541806e+13 5124.214162 7.858170 48.649863 \n", "103 2.803020e+13 507.044258 3.013687 47.896337 \n", "140 2.724634e+13 498.193061 2.982075 NaN \n", "160 2.691014e+13 542.340940 2.996480 48.044140 \n", "102 2.607726e+13 625.357428 3.104682 48.026892 \n", "72 2.191519e+13 2518.566323 7.130694 48.653599 \n", "61 2.168128e+13 835.974259 4.687596 47.212490 \n", "251 2.097441e+13 870.897512 3.358153 47.112001 \n", "177 1.908336e+13 8615.535450 8.066182 48.708683 \n", "128 1.862014e+13 838.259199 3.335746 47.046383 \n", "250 1.736912e+13 9060.068657 8.121961 48.758830 \n", "75 1.731910e+13 3448.910224 7.816628 48.658777 \n", "71 1.255692e+13 3913.466364 7.956080 48.610030 \n", "63 1.242383e+13 558.702327 2.815498 47.115173 \n", "62 1.239991e+13 558.711100 2.815573 47.098031 \n", "60 1.019283e+13 392.428268 2.595967 48.651143 \n", "44 1.018279e+13 657.748859 3.015530 46.297964 \n", "142 5.929275e+12 254.618642 1.657578 48.632937 \n", "129 5.845138e+12 1069.141865 3.659621 48.378604 \n", "130 5.636800e+12 1050.010897 3.585924 48.382980 \n", "131 5.377437e+12 1046.116268 3.638974 48.379046 \n", "117 5.106025e+12 3687.126279 8.496074 48.744199 \n", "73 4.213635e+12 1173.135713 3.878399 48.632033 \n", "74 3.710324e+12 1138.169165 3.794867 48.619737 \n", "85 3.601226e+12 4909.659884 8.542615 48.568695 \n", "157 3.380647e+12 912.491552 3.055660 47.621273 \n", "249 2.768864e+12 3357.983675 7.720655 48.791809 \n", ".. ... ... ... ... 
\n", "254 7.828760e+08 125.568712 3.689874 47.301617 \n", "224 7.301068e+08 775.803386 4.365757 48.536415 \n", "3 6.405000e+08 NaN NaN NaN \n", "46 6.039190e+08 99.221938 2.289039 47.515774 \n", "58 5.130350e+08 572.413734 3.749851 48.774225 \n", "239 4.391789e+08 250.962504 3.987285 47.697931 \n", "156 3.193208e+08 456.403851 12.037277 48.084030 \n", "202 3.145400e+08 304.237533 3.227125 48.664169 \n", "185 2.548400e+08 1389.993077 6.623460 46.425690 \n", "152 1.843189e+08 657.059335 14.302529 48.397282 \n", "121 1.774306e+08 179.372383 8.279184 49.557255 \n", "169 1.063908e+08 611.020856 4.527932 49.409439 \n", "245 3.646999e+07 563.500592 15.506929 47.472414 \n", "10 NaN NaN NaN 48.721939 \n", "28 NaN NaN NaN 47.581520 \n", "38 NaN NaN NaN 49.866695 \n", "42 NaN NaN NaN NaN \n", "53 NaN NaN NaN 47.986740 \n", "68 NaN 47.155226 1.429902 46.331369 \n", "81 NaN NaN NaN NaN \n", "87 NaN NaN NaN NaN \n", "122 NaN NaN NaN 49.030281 \n", "137 NaN 955.388861 3.231360 NaN \n", "162 NaN 6421.678715 3.720085 49.637143 \n", "172 NaN NaN NaN NaN \n", "201 NaN 3437.298747 5.752693 45.616261 \n", "209 NaN NaN NaN 49.508551 \n", "223 NaN NaN NaN NaN \n", "233 NaN 269.945739 1.507166 48.047394 \n", "244 NaN NaN NaN 48.846884 \n", "\n", " mat_mort_ratio population \n", "259 223.75 7.269321e+09 \n", "98 10.00 1.175934e+09 \n", "180 15.00 1.273149e+09 \n", "193 10.75 1.092637e+09 \n", "103 244.75 6.113147e+09 \n", "140 245.25 6.089148e+09 \n", "160 185.75 5.468296e+09 \n", "102 106.25 4.607548e+09 \n", "72 16.75 9.032073e+08 \n", "61 63.00 2.265974e+09 \n", "251 43.25 2.540966e+09 \n", "177 13.50 3.541404e+08 \n", "128 36.00 2.235352e+09 \n", "250 14.00 3.185582e+08 \n", "75 8.00 5.082110e+08 \n", "71 6.50 3.384615e+08 \n", "63 66.75 2.022090e+09 \n", "62 66.75 1.996942e+09 \n", "60 169.00 3.083697e+09 \n", "44 28.75 1.364446e+09 \n", "142 262.50 2.927329e+09 \n", "129 70.75 6.242184e+08 \n", "130 71.50 6.080342e+08 \n", "131 72.75 5.969302e+08 \n", "117 5.75 1.272971e+08 \n", "73 
24.00 4.505588e+08 \n", "74 25.50 4.125489e+08 \n", "85 6.25 8.128164e+07 \n", "157 83.50 4.208876e+08 \n", "249 9.25 6.464156e+07 \n", ".. ... ... \n", "254 82.50 2.588964e+05 \n", "224 45.75 1.094206e+05 \n", "3 NaN 5.542200e+04 \n", "46 349.50 7.595556e+05 \n", "58 NaN 7.278540e+04 \n", "239 129.25 1.059094e+05 \n", "156 103.25 1.041180e+05 \n", "202 159.50 1.913326e+05 \n", "185 NaN 2.111260e+04 \n", "152 NaN 5.288280e+04 \n", "121 95.00 1.104816e+05 \n", "169 NaN 1.169540e+04 \n", "245 NaN 1.091000e+04 \n", "10 NaN 1.037444e+05 \n", "28 NaN 2.958540e+04 \n", "38 NaN 5.915880e+04 \n", "42 NaN 1.629612e+05 \n", "53 NaN 1.559594e+05 \n", "68 524.50 NaN \n", "81 NaN 2.757226e+05 \n", "87 NaN 3.402560e+04 \n", "122 86.25 2.511378e+07 \n", "137 9.00 6.225309e+06 \n", "162 NaN 3.813840e+04 \n", "172 NaN 2.682000e+05 \n", "201 NaN 3.260740e+04 \n", "209 NaN 3.755220e+04 \n", "223 NaN 3.149120e+04 \n", "233 62.00 1.931967e+07 \n", "244 NaN 3.370340e+04 \n", "\n", "[263 rows x 8 columns]" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gender_data.sort_values('gdp', ascending=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you might have guessed by now, a `Series` also has a `sort_values` method. For a `Series`, you don't have to specify a column to sort by, because the method sorts the values of the `Series` itself." 
] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "259 7.613006e+13\n", "98 4.884635e+13\n", "180 4.787743e+13\n", "193 4.541806e+13\n", "103 2.803020e+13\n", "140 2.724634e+13\n", "160 2.691014e+13\n", "102 2.607726e+13\n", "72 2.191519e+13\n", "61 2.168128e+13\n", "251 2.097441e+13\n", "177 1.908336e+13\n", "128 1.862014e+13\n", "250 1.736912e+13\n", "75 1.731910e+13\n", "71 1.255692e+13\n", "63 1.242383e+13\n", "62 1.239991e+13\n", "60 1.019283e+13\n", "44 1.018279e+13\n", "142 5.929275e+12\n", "129 5.845138e+12\n", "130 5.636800e+12\n", "131 5.377437e+12\n", "117 5.106025e+12\n", "73 4.213635e+12\n", "74 3.710324e+12\n", "85 3.601226e+12\n", "157 3.380647e+12\n", "249 2.768864e+12\n", " ... \n", "254 7.828760e+08\n", "224 7.301068e+08\n", "3 6.405000e+08\n", "46 6.039190e+08\n", "58 5.130350e+08\n", "239 4.391789e+08\n", "156 3.193208e+08\n", "202 3.145400e+08\n", "185 2.548400e+08\n", "152 1.843189e+08\n", "121 1.774306e+08\n", "169 1.063908e+08\n", "245 3.646999e+07\n", "10 NaN\n", "28 NaN\n", "38 NaN\n", "42 NaN\n", "53 NaN\n", "68 NaN\n", "81 NaN\n", "87 NaN\n", "122 NaN\n", "137 NaN\n", "162 NaN\n", "172 NaN\n", "201 NaN\n", "209 NaN\n", "223 NaN\n", "233 NaN\n", "244 NaN\n", "Name: gdp, Length: 263, dtype: float64" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gdp.sort_values(ascending=False)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "jupytext": { "formats": "", "split_at_heading": true }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.2" } }, "nbformat": 4, "nbformat_minor": 2 }