Python × Data Analysis

Introduction


Kristen Chan

Agenda


  • Data structures
  • Data processing
  • Exploratory data analysis
  • Data visualization
  • Classification and clustering

Tools


  • Python3
  • Anaconda
  • Jupyter Notebook

Tools -- Python3


Why Python

  • Simple and easy to learn
  • Open source and free
  • Cross-platform
  • Rich collection of libraries

Tools -- Python3


Applications for Python

  • Web applications - Django, Flask
  • Games
  • Web scraping - Scrapy
  • Data analysis / machine learning - numpy, scipy, matplotlib
  • Natural language processing - nltk

Tools -- Anaconda


anaconda

Source: https://www.continuum.io/

Tools -- Jupyter Notebook


Why Jupyter Notebook

  • Interactive computing interface
  • Supports many programming languages
  • Supports Markdown

Tools -- Jupyter Notebook


Jupyter Online

jupyter online

Source: http://jupyter.org/

Packages


  • NumPy
  • Pandas
  • Matplotlib
  • Scikit-learn
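
Before moving on, it can help to confirm that these packages are available in your environment. The following is a minimal sketch, not part of the original slides: the helper name `is_installed` and the `PACKAGES` list are hypothetical, and it assumes the usual import names (Scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names for the four packages listed above (assumed; Scikit-learn imports as "sklearn").
PACKAGES = ["numpy", "pandas", "matplotlib", "sklearn"]


def is_installed(name):
    """Hypothetical helper: True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(name) is not None


# Report the status of each package in the current environment.
for name in PACKAGES:
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

With a standard Anaconda installation all four should typically be reported as installed; anything reported as missing can be added with `conda install` or `pip install`.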

Open your Jupyter Notebook


open Jupyter Notebook

Open your Jupyter Notebook


Jupyter Notebook

Use Jupyter Notebook


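  • Run: Ctrl+Enter (run the cell in place) or Shift+Enter (run and move to the next cell)
  • Edit mode
    • Modify the contents of a cell
    • Shown in green
    • Enter edit mode: Enter
  • Command mode
    • Run commands with keyboard shortcuts
    • Shown in blue
    • Enter command mode: Esc
  • Shortcuts (case-insensitive)
    • Insert a cell above: A
    • Insert a cell below: B
    • Delete a cell: DD
    • Copy a cell: C
    • Paste a cell: V
    • Switch to Code mode: Y
    • Switch to Markdown mode: M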

Your First Python Program


Enter the code

In [2]:
print("Hello Python!!")
Hello Python!!

Python Comments

In [3]:
print('Hello Python !!')  # my first Python program
Hello Python !!
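
The comment above sits at the end of a statement, but a comment can also take up a whole line. A short sketch of both forms follows; the variable names `greeting` and `tagged` are made up for illustration and are not from the original slides.

```python
# A full-line comment: Python ignores everything from '#' to the end of the line.
greeting = "Hello Python !!"  # an inline comment placed after a statement

# A '#' inside a string literal is ordinary text, not the start of a comment.
tagged = "#python"

print(greeting)
print(tagged)
```

Comments are meant for human readers only and have no effect on what the program does.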

The Zen of Python

In [2]:
import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!