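Python Review Notes

These are my review notes from watching milo's course 《疯狂的python》 (Crazy Python) on NetEase Cloud Classroom; the original post links to the source videos.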

Chapter 1: Introduction

Lessons 1-2: Overview

Compile a Python file to produce a .pyc file

# Compile 1.py to produce the bytecode file 1.pyc
import py_compile
py_compile.compile('1.py')

Compile a Python file to produce a .pyo file

# .pyo is optimized bytecode (Python 2); run this command in a shell
python -O -m py_compile 1.py

Lesson 3: Variables

Variable assignment

a = 1
b_c = 'abc'
_bc3 = 'a2b3'

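Characteristics of Python variables

# Python variables differ from variables in C and similar languages: memory is allocated for the data (the object), not for the variable name, so binding the same variable to different data makes it refer to a different object. The id() function shows the identity of the object a name refers to (its memory address in CPython).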
>>> a=1
>>> a
1
>>> id(a)
22053688
>>> a=2
>>> a
2
>>> id(a)
22053664
>>> b=2
>>> b
2
>>> id(b)
22053664
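
The same idea can be sketched without relying on specific id() values, which differ from run to run. This is a minimal illustration rather than part of the course material; the names first and second are made up for the example, and the code is written to run under both Python 2 and Python 3.

```python
# A name is a reference to an object; assignment rebinds the name.
first = [1, 2, 3]
second = first                  # both names now refer to the same list object
same_object = second is first

second = [1, 2, 3]              # rebinding: 'second' now refers to a new, equal list
equal_value = second == first
still_same = second is first

print(same_object, equal_value, still_same)
```

The is operator compares identity (what id() reports), while == compares values, which is why the last two results differ.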

Lesson 4: Operators and Expressions

Assignment operators

>>> a=1
>>> a+=2
>>> a
3
>>> a+=3
>>> a
6
>>> a-=4
>>> a
2
>>>

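Arithmetic operators

+ - *    basic addition, subtraction, multiplication
/    division; in Python 2 use a float operand to get a fractional result, e.g. 1.0/2; // is floor division, e.g. 3.0//2
%    remainder (modulo)
**   exponentiation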
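
A short sketch of the operators in the table above. Float operands are used on purpose so that the results are the same under Python 2 and Python 3; the variable names are illustrative only.

```python
# Float operands keep the behaviour identical on Python 2 and Python 3.
true_quotient = 1.0 / 2       # fractional result
floor_quotient = 3.0 // 2     # floor division keeps the float type
remainder = 7 % 3             # what is left after dividing 7 by 3
power = 2 ** 10               # 2 raised to the 10th power

print(true_quotient, floor_quotient, remainder, power)
```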

Relational operators

Logical operators

1 < 2 and 2 < 3
1 < 2 or 2 > 3
not 1 > 2

A small exercise

a=int(raw_input('number1 = '))
b=int(raw_input('number2 = '))
c = a+b
print str(a)+' + '+str(b)+' = '+str(c)

Chapter 2: Data Types

There are five main data types: numbers, strings, tuples, lists, and dictionaries

Lesson 5: Numbers and Strings

Numeric types include int, long, float, and complex

>>> num1 = 123
>>> type(num1)
<type 'int'>
>>> num2 = 123L
>>> type(num2)
<type 'long'>
>>> num3 = 99999999999999999999
>>> type(num3)
<type 'long'>
>>> num4 = 1.23
>>> type(num4)
<type 'float'>
>>> num5 = 1.23j
>>> type(num5)
<type 'complex'>

Strings can be defined with single quotes, double quotes, or triple quotes

>>> a = 123
>>> stra = '123'
>>> type(a)
<type 'int'>
>>> type(stra)
<type 'str'>
>>> strb = "let's say \"hello\""
>>> print strb
let's say "hello"

String slicing

>>> a = 'abcde'
>>> a[1:4]
'bcd'
>>> a[:]
'abcde'
>>> a[:4]
'abcd'
>>> a[4:]
'e'
>>> a[::1]
'abcde'
>>> a[::2]
'ace'
>>> a[-1]
'e'
>>> a[-4:-1]
'bcd'

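Lesson 6: Tuples

Lists, tuples, and strings are all sequences; the defining features of a sequence are the indexing operator and the slicing operator

Basic sequence operations

len()  # length of the sequence
+  # concatenate two sequences
*  # repeat the sequence
in # test whether an element is in the sequence
max()  # return the largest value
min()  # return the smallest value
cmp(tuple1, tuple2)    # compare two sequences; returns 0 when they are equal (Python 2 only)
#EX---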
>>> str1 = 'abcde'
>>> str2 = '12345'
>>> str1 + str2
'abcde12345'
>>> str1 * 5
'abcdeabcdeabcdeabcdeabcde'
>>> 'a' in str1
True
>>> max(str1)
'e'
>>> min(str2)
'1'
>>> cmp(str1, str2)
1

Tuples are defined with parentheses (). A tuple is similar to a list, but it is immutable: its values cannot be changed after it is created.

>>> userinfo1=("zou", 31, "female")
>>> userinfo1[1]
31
>>> userinfo1[0]
'zou'
# A single-element tuple needs a trailing comma when it is defined
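
The comma rule for single-element tuples is easy to get wrong, so here is a minimal sketch; the variable names are illustrative only.

```python
not_a_tuple = (1)      # parentheses only group; this is the int 1
one_tuple = (1,)       # the trailing comma makes it a tuple
also_tuple = 1,        # the comma, not the parentheses, creates the tuple

print(type(not_a_tuple), type(one_tuple), type(also_tuple))
```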

Lesson 7: Lists

Lists are defined with square brackets []; a list is a mutable data type

listmilo=['arvon', 24, 'male']
# read a value
age=listmilo[1]
# append
listmilo.append('Linux')
# delete (two equivalent ways; use one or the other)
listmilo.remove(listmilo[2])
#del listmilo[2]
# modify
listmilo[1]=18
# inspect
>>> listmilo
['arvon', 18, 'Linux']
# membership test
>>> "Linux" in listmilo
True

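Quick introduction to objects and classes: object = attributes + methods

Lesson 8: Dictionaries

Dictionaries are defined with curly braces. The dictionary is the only mapping type in Python (a hash table). Dictionary objects are mutable, but keys must be immutable objects, and one dictionary may use keys of different types. keys() and values() return a list of keys or a list of values; items() returns a list of (key, value) tuples.

Example: a value can be accessed directly by key, and a missing key raises an error. Use the has_key() method or in / not in to test for a key; note that has_key() is deprecated.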

>>> a=123
>>> b=456
>>> dic4={a:'aaa',b:'bbb','c':'ccc'}
>>> dic4
{456: 'bbb', 'c': 'ccc', 123: 'aaa'}
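
The lookup options described above can be sketched as follows. The dictionary contents mirror dic4 from the notes; the other names are illustrative, and the code runs under both Python 2 and Python 3 (has_key() is left out because it exists only in Python 2).

```python
dic4 = {123: 'aaa', 456: 'bbb', 'c': 'ccc'}

has_c = 'c' in dic4                    # membership test on the keys
missing = dic4.get('zzz')              # returns None instead of raising KeyError
fallback = dic4.get('zzz', 'n/a')      # explicit default value

try:
    dic4['zzz']                        # direct access to a missing key
    raised = False
except KeyError:
    raised = True
```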

Create a dictionary with dict() and with fromkeys()

fdict=dict([['x',1],['y',2]])
ddict={}.fromkeys(('x','y'),-1)

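Adding and deleting dictionary entries: a dictionary is unordered, so an entry can be added under any key; a list cannot be extended by assigning to an arbitrary index

# add or modify an element
>>> dic1={'name':'arvon','age':24,'work':'BJ'}
>>> dic1['tel']=123456
>>> dic1
{'age': 24, 'work': 'BJ', 'tel': 123456, 'name': 'arvon'}
# delete elements
dic1.clear()  # remove all elements from the dictionary dic1
del dic1  # delete the dictionary dic1 itself
# read values
>>> dic1={'a':1, 'b':2, 'c':3}
>>> dic1['a']
1
>>> dic1.get('c')
3
# return the dictionary's list of keys and list of values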
>>> dic1={'a':1, 'b':2, 'c':3}
>>> dict.keys(dic1)
['a', 'c', 'b']
>>> dict.values(dic1)
[1, 3, 2]

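Chapter 3: Flow Control

This mainly covers conditionals and loops

Lesson 9: Branching

Boolean values (bool) represent concepts such as right and wrong, true and false, empty and non-empty. Non-empty values (string, tuple, list, set, dictionary, and so on) evaluate as True; 0, None, and empty values evaluate as False.

if/else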

if 1<2:
   print 'Yes'
else:
   print 'No'

elif: multi-condition branching

a=int(raw_input("Input a num: "))  # raw_input returns a string, so convert it before comparing with numbers
if 1 < a < 3:
    print str(a)+" is 1-3"
elif a >= 3:
    print str(a)+" is 3-*"

Lesson 10: Logical Operators

The logical operators are "and", "or", and "not"

A contrived example

a = 5
if a > 1 and a !=2:
    if a==4 or 1<2:
        if not a !=5:
            print "Oh"

A useful not

def fun():
    return 0

if not fun():
    print "ok"  

Lesson 11: for Loops

A for loop can iterate over strings, tuples, and lists

Iterating over a string with for

for i in 'abcde':
   print i

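Use range to generate a sequence quickly

# The first argument is the start value (default 0), the second is the end value (exclusive), and the third is the step (default 1)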
for i in range(0,100,2):
   print i

Exercise: compute the sum of 1 through 100

num=0
for x in range(1,101):
    num=num+x
print num

Lesson 12: Traversal

Strings, tuples, and lists can all be traversed

Traversal by index

for x in range(len("hello")):
   print "hello"[x]

Traversing a dictionary

d = {1:111, 2:222, 4:444, 3:333}
for x in d:
    print x
    print d[x]

Unpacking dictionary items as tuples

d = {1:111, 2:222, 4:444, 3:333}
f = d.items()
print f
for k,v in f:
    print k
    print v

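Lesson 13: Loop Control

Loops are controlled mainly with for and while

Python's special for loop: in Python a for loop can have an else clause

# "ending" is printed once when the loop finishes iterating normally; if the loop does not finish normally (for example it is left via break), the else clause is not triggered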
for i in range(3):
   print i
else:
   print "ending"

Use break to leave the loop

for i in range(5):
    print "hello"+str(i)
    if i == 3:
        print i
        break
else:
   print "ending"
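
A common use of for/else with break is a search loop, where the else branch handles the "nothing found" case. The sketch below is an illustration under that assumption; the function name find_first_even is made up for the example.

```python
def find_first_even(numbers):
    """Return the first even number, or None if there is none."""
    for n in numbers:
        if n % 2 == 0:
            found = n
            break            # leaving via break skips the else clause
    else:
        found = None         # runs only when the loop was not broken
    return found

print(find_first_even([1, 3, 4, 5]))
print(find_first_even([1, 3]))
```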

Use continue to skip the rest of the current iteration

for i in range(5):
    print "hello"+str(i)
    if i == 2:
        print "22222222"
        continue
    if i == 3:
        print i
        break
else:
   print "ending"

Use pass as a placeholder (code stub)

for i in range(5):
    pass
print "Go"

Use exit to terminate the program

for i in range(5):
    print i
    exit()  # exit must be called; a bare `exit` is only an expression and does nothing

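Lesson 14: while Loops

A while loop is used for conditional looping: it runs until the expression becomes false. When designing a while loop, always make sure there is a condition that ends it

The simplest infinite loop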

#while 1:
while True:
    print "hello"

Leave a while loop on a condition

while True:
    print "Haha"
    x = raw_input("Input q for quit: ")
    if x == 'q':
        break

Leave the loop through the while expression itself

x = ''
while x != 'q':
    print "hello"
    x = raw_input("please input a str, q for quit: ")
    if not x:
        break

else in a while loop

x = ''
while x != 'q':
    print "hello"
    x = raw_input("please input a str, q for quit: ")
    if not x:
        break
else:
    print "ending"

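Chapter 4: Functions

A function is a group of statements that performs a specific task; it can be executed many times from different places in a program through its name (a function call).

Lesson 15: Defining and Calling Functions

Use def to define a function; the parameter list goes inside the parentheses ()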

def add(a,b):
    c = a + b
    print c
add(1,2)
add(3,4)

Basic functions and a first look at return values

a = 100
def fun():
    if False:
        print "Hello"
    print a
    return 0
#fun()
if not fun():
    print "ok"

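Lesson 16: Formal Parameters, Actual Arguments, and Default Parameters

The variables in the parentheses after the function name in a definition are called formal parameters; the values in the parentheses after the function name in a call are called actual arguments

A simple example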

def fun(x):
        print "ok"
        print x
s = raw_input("Input something: ")
fun(s)

Default parameter example

def fun(x,y='lalala'):
        print x,y
s = raw_input("Input something: ")
fun(4)
fun(4,y='cacaca')  # x has no default, so it must still be supplied
fun(2,'goog')

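Lesson 17: Variable Scope

Every variable in Python has a specific scope. A variable defined inside a function can normally be used only inside that function and is called a local variable. A variable defined at the top level of a file is available to every function in that file and is called a global variable.

An example

a = 150
def fun():
    a = 100
    print  'in',a
    # this prints 100
fun()
print 'out',a
# this prints 150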

Use global to declare a variable inside a function as global

a = 150
def fun():
    a = 100
    global b
    b = 12345
    print  'in',a
fun()
print 'out',a
print b

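Lesson 18: Function Return Values

A function call returns a value, the return value. If none is specified the function returns None; use return to specify it. The return value can be of any type, and the function ends as soon as return executes

An example

#coding:utf8
def f(x,y):
    t=x+y
    return t
z = f(2,3)
print z
# z is 5 here; without the return statement z would be None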

Lesson 19: Handling Extra Arguments

Passing values of different types normally

def f(x):
    print x
f(1)
f('abc')
f(['arvon','mo'])
f({'arvon':123,'blog':'arvon.top'})
f(range(10))

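Unpacking a tuple or a dictionary into multiple parameters with * and **: use * to pass a tuple and ** to pass a dictionary. ** is recommended because the values are matched by name rather than by position, as the example shows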

def f(name='name',age='0'):
    print 'name: %s' % name
    print 'age: %s' % age
f()
f('test',12)
t=('arvon',24)
tt={'age':23, 'name':'mo'}
f(*t)
f(**tt)

Extra-argument example using *args: *args receives the surplus positional arguments and treats them as a tuple named args; **dargs receives surplus keyword arguments as a dictionary

def go(x,*args,**dargs):
    print x
    print args
    print dargs
go(1,2)
go(1,2,3)
go(1,2,3,4)
go(1,2,3,4,m=5)
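
To make the collected values easier to inspect, the sketch below returns them instead of printing. The function name describe and the name kwargs are illustrative choices (the notes use go and dargs); the code runs under both Python 2 and Python 3.

```python
def describe(x, *args, **kwargs):
    # args is a tuple of surplus positional arguments,
    # kwargs is a dict of surplus keyword arguments
    return x, args, kwargs

only_required = describe(1)
with_extras = describe(1, 2, 3, m=5)
unpacked = describe(*(1, 2), **{'m': 5})   # * and ** also unpack at the call site

print(only_required)
print(with_extras)
print(unpacked)
```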

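Lesson 20: Anonymous Functions with lambda

A lambda expression is a quick way to define a minimal single-line function. The idea is borrowed from Lisp, and a lambda can be used anywhere a function is needed.

Minimal example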

>>> def f(x,y):
...     return x*y
...
>>> f(2,3)
6
>>> lambda x,y:x*y
<function <lambda> at 0x7fa5c3e7ea28>
>>> g = lambda x,y:x*y
>>> g(2,3)
6

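Using reduce: the first argument is a function and the second is a list; the function is applied to two values at a time, carrying the result forward. It works well with lambda.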

l = range(1,6)
def f(x,y):
    return x*y
one=reduce(f,l)
g = lambda x,y:x+y
two=reduce(g,l)
print one,two

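Lesson 21: Implementing Branching

A switch statement is used to write multi-branch programs, much like if/else, but Python does not provide one. In Python a dictionary can provide the same functionality.

A condensed example

{1:case1,2:case2}.get(x,lambda *arg, **key: None)()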

Calling functions through a dictionary

#coding:utf-8
from __future__ import division
def jia(x,y):
    return x+y
def jian(x,y):
    return x-y
def cheng(x,y):
    return x*y
def chu(x,y):
    return x/y
operator = {'+':jia,'-':jian,'*':cheng,'/':chu}
#print operator['+'](3,2)
#print operator['/'](3,2)
#print jia(3,2)
def fff(x,o,y):
    print operator.get(o)(x,y)
fff(3,'+',2)
fff(3,'/',2)
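
One thing the dictionary version above does not cover is an unknown operator: operator.get(o) returns None in that case, and calling None raises TypeError. A minimal sketch of a default branch follows; the names operations and calculate are made up for the illustration, and only two of the four functions are repeated to keep it short.

```python
def jia(x, y):
    return x + y

def jian(x, y):
    return x - y

operations = {'+': jia, '-': jian}

def calculate(x, op, y):
    # Fall back to a function that returns None for unknown operators,
    # which plays the role of the default branch of a switch statement.
    handler = operations.get(op, lambda a, b: None)
    return handler(x, y)

print(calculate(3, '+', 2))
print(calculate(3, '?', 2))
```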

A more verbose version of the previous example, with explicit conditionals, for comparison

#coding:utf-8
from __future__ import division
def jia(x,y):
    return x+y
def jian(x,y):
    return x-y
def cheng(x,y):
    return x*y
def chu(x,y):
    return x/y
def operator(x,o,y):
    if o == '+':
        print jia(x,y)
    elif o == '-':
        print jian(x,y)
    elif o == '*':
        print cheng(x,y)
    elif o == '/':
        print chu(x,y)
    else:
        pass
operator(4, '+', 2)
operator(4, '-', 2)
operator(4, '*', 2)

Lesson 22: Common Built-in Functions

Use callable to check whether an object can be called

>>> callable(min)
True
>>> callable(f)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'f' is not defined
>>> callable(divmod)
True                 

Use abs to get the absolute value

#def a(x):
#    if x < 0:
#        return -x
#    else:
#        return x
#print a(10)
#print a(-9)
print abs(-9)

Use max to get the largest value and min to get the smallest

l = range(12)
print max(l)
print min(l)

Get the length of a list

#coding:utf8
l=[1, 2, 4, 5, 6]
# number of elements in the list
print len(l)
# quotient and remainder
print divmod(5,2)

Test whether a value is of a given type

#if type(l) == type([]):
isinstance(l,list)
isinstance(l,int)
isinstance(l,str)

Use cmp to check whether two strings are the same

# if l == 'strxxx':
cmp(l,'strxxx')
# returns 0 when they are equal; 0 is false in a condition, so use not

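Type conversion functions

long() # long integer (Python 2)
int()
float()
str()
list()  # list
tuple() # tuple
hex()  # convert to a hexadecimal string
oct()  # convert to an octal string
chr()  # character for an integer code
ord()  # integer code for a character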

Lesson 23: Built-in Functions Related to Classes

Several string-related functions

# capitalize the first letter
>>> s = 'hello world'
>>> s.capitalize()
'Hello world'
# string replacement; to read the documentation use help(), e.g. help(str.capitalize)
>>> s.replace('hello','good')
'good world'
# split a string
>>> ip='192.168.1.123'
>>> ip.split('.')
['192', '168', '1', '123']
>>> ip.split('.',1)
['192', '168.1.123']
>>> ip.split('.',2)
['192', '168', '1.123']

A small difference between calling the built-in string method directly and using the imported string module

s = 'hello world'
s.replace('hello','good')
##--- using import (Python 2 only)
import string
string.replace(s,'hello','good')

Filtering with the filter function

# filter(function, list) returns the elements of list for which function returns True
l=range(10)
def f(x):
    if x > 5:
        return True
print filter(f,l)

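Lesson 24: Sequence Processing Functions

Use zip or map for parallel traversal: zip stops at the shortest sequence, so it suits sequences of equal length, whereas map(None, ...) fills the missing positions with None (Python 2)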

>>> name=['arvon', 'mo', 'lover']
>>> age=['23', '25', '26']
>>> tel=['123456', '324566', '54334123']
>>> zip(name,age,tel)
[('arvon', '23', '123456'), ('mo', '25', '324566'), ('lover', '26', '54334123')]
>>> map(None,name,age,tel)
[('arvon', '23', '123456'), ('mo', '25', '324566'), ('lover', '26', '54334123')]

Advanced use of map: apply a function to the traversed data

>>> a=[1,3,5]
>>> b=[2,4,6]
>>> def mf(x,y):
...     return x*y
...
>>> map(None,a,b)
[(1, 2), (3, 4), (5, 6)]
>>> map(mf,a,b)
[2, 12, 30]

reduce example (the code below computes a cumulative sum, not a factorial)

>>> l=range(1,100)
>>> def xf(x,y):
...     return x +y
...
>>> reduce(xf,l)
4950
>>> reduce(lambda x,y:x+y,l)
4950
>>> filter(lambda x:x%2 == 0,l)
[2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98]

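Lesson 25: Modules and Packages

The module is the basic unit Python uses to organize code. Python scripts are saved as files with the .py extension. A script can be run on its own or imported by another script; when it is imported, it is called a module. The module name is the same as the script name: the module for test.py is test, and it is imported with import test.

Module search path priority

# current directory > lib (standard library) > other paths
# find the path of an imported module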
import a_module
print a_module.__file__

The handy __name__: this built-in variable is '__main__' when the script is run directly, and the module name when the script is imported.

from __future__ import division
def jia(x,y):
    return x+y
def jian(x,y):
    return x-y
def cheng(x,y):
    return x*y
def chu(x,y):
    return x/y
def operator(x,o,y):
    if o == '+':
        print jia(x,y)
    elif o == '-':
        print jian(x,y)
    elif o == '*':
        print cheng(x,y)
    elif o == '/':
        print chu(x,y)
    else:
        pass
if __name__ == '__main__':
    operator(4, '+', 2)
    operator(4, '-', 2)
    operator(4, '*', 2)
    operator(4, '/', 2)

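A simple example

import cal
print cal.jia(1,2)
# Refer to the module by the name cal, and use the . operator to call the functions defined in it

### Steps to create a package
#- Create a directory named after the package
#- Create an __init__.py file in that directory
#- Put script files, compiled extensions, and subpackages in the directory as needed
#- import pack.m1, pack.m2, pack.m3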

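Chapter 5: Regular Expressions

Lesson 26: Overview

A regular expression (RE) is a small, highly specialized programming language embedded in Python and made available through the re module.

A small example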

>>> import re
>>> s = 'abc'
>>> s = r'abc'
>>> re.findall(s,'aaaaaaaa')
[]
>>> re.findall(s,'aaaaaabcaa')
['abc']

Lesson 27: Metacharacters

Ordinary characters, including most letters, digits, and other characters, match themselves

>>> st = 'top tip tqp twp tep'
>>> res=r'top'
>>> re.findall(res,st)
['top']

The metacharacters are . ^ $ * + ? { } [ ] | ( )

>>> str='I can say "tip top t4p world ^go" and I say hello world'
>>> rs1 = r'I'
>>> re.findall(rs1,str)
['I', 'I']
>>> rs2 = r'^I'
>>> re.findall(rs2,str)
['I']
>>> rs3 = r't[io]p'
>>> re.findall(rs3,str)
['tip', 'top']
>>> rs4 = r'\^go'
>>> re.findall(rs4,str)
['^go']

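Escape sequences

\d # matches any decimal digit; equivalent to [0-9]
\D # matches any non-digit character; equivalent to [^0-9]
\s # matches any whitespace character; equivalent to [ \t\n\r\f\v]
\S # matches any non-whitespace character; equivalent to [^ \t\n\r\f\v]
\w # matches any alphanumeric character or underscore; equivalent to [a-zA-Z0-9_]
\W # matches any character that is not alphanumeric or underscore; equivalent to [^a-zA-Z0-9_]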
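
A short sketch of these escape sequences in use. The sample string and variable names are made up for the illustration; the code runs under both Python 2 and Python 3.

```python
import re

text = 'id_42 costs 7 USD'

digits = re.findall(r'\d+', text)      # runs of digits
words = re.findall(r'\w+', text)       # runs of letters, digits, and underscores
pieces = re.split(r'\s+', text)        # split on runs of whitespace
non_word = re.findall(r'\W', text)     # single characters that are not \w

print(digits)
print(words)
print(pieces)
print(non_word)
```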

Repetition in regular expressions: * means zero or more times, + means one or more times, ? means zero or one time, and . matches any single character (except a newline) exactly once

>>> tel = '010-123456'
>>> rs = r'010-\d{6}'
>>> re.findall(rs,tel)
['010-123456']
>>> re.findall(rs,tel)
['010-123456']
>>> rs = r'010-\d*'
>>> re.findall(rs,tel)
['010-123456']
>>> rs = r'010-\d+'
>>> re.findall(rs,tel)
['010-123456']
>>> rs = r'010-\d{5}?'
>>> re.findall(rs,tel)
['010-12345']
>>> rs = r'010-\d?'
>>> re.findall(rs,tel)
['010-1']

Flexible use of curly braces

# control the number of repetitions, e.g. 1 to 5 times
rs = r'a{1,5}'

Lesson 28: Common Regular Expression Functions

Compile the regular expression before using it; this is recommended when the pattern is used often

>>> import re
>>> r1 = r"\d{3,4}-?\d{6}"
>>> p_tel = re.compile(r1)
>>> p_tel
<_sre.SRE_Pattern object at 0x7f44193a2f10>
>>> re.findall(p_tel,'010-123456')
['010-123456']

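The match and search methods

# group() returns the string matched by the RE
# start() returns the position where the match starts
# end() returns the position where the match ends
# span() returns a tuple containing the (start, end) positions of the match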
>>> import re
>>> csvt_re = re.compile(r'csvt',re.I)
>>> csvt_re.findall('csVt')
['csVt']
>>> csvt_re.findall('csVt csvt CsVt')
['csVt', 'csvt', 'CsVt']
>>> csvt_re.match('csvt hello')
<_sre.SRE_Match object at 0x7f44192f1648>
>>> csvt_re.search('csvt hello')
<_sre.SRE_Match object at 0x7f44192f16b0>
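
The session above shows the match objects but not the four methods listed in its comments. The following sketch fills that in; the test strings are made up, and the code runs under both Python 2 and Python 3.

```python
import re

csvt_re = re.compile(r'csvt', re.I)

at_start = csvt_re.match('csvt hello')       # match only succeeds at position 0
not_at_start = csvt_re.match('hello csvt')   # None: the text does not start with csvt
anywhere = csvt_re.search('hello CsVt')      # search scans the whole string

matched_text = anywhere.group()              # the text that matched
start, end = anywhere.start(), anywhere.end()
span = anywhere.span()                       # (start, end) as one tuple

print(matched_text, start, end, span)
```

Because match returns None when the pattern is not found at the start, the result should be checked before calling group() on it.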

Replace strings with the sub function

>>> s
'hello csvt'
>>> s.replace('csvt','good')
'hello good'
>>> s
'hello csvt'
>>> rs = r'c..t'
>>> re.sub(rs,'python','csvt caat cvvt cccc')
'python python python cccc'

Split on a regular expression with re.split

>>> ip = '1.2.3.4'
>>> ip.split('.')
['1', '2', '3', '4']
>>> s = "123*456-789+000"
>>> re.split(r'[\+\-\*]',s)
['123', '456', '789', '000']  

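Lesson 29: Regular Expression Flags and Groups

Compilation flags

DOTALL,S    # make . match any character, including a newline
IGNORECASE,I    # make matching case-insensitive
LOCALE,L    # locale-aware matching, e.g. for French
MULTILINE,M # multi-line matching; affects ^ and $
VERBOSE,X   # enable verbose REs, which can be laid out more clearly and readably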

An example of S

>>> r1 = r"csvt.net"
>>> re.findall(r1,'vsvt.net')
[]
>>> r1 = r"csvt.net"
>>> re.findall(r1,'csvt.net')
['csvt.net']
>>> re.findall(r1,'csvtonet')
['csvtonet']
>>> re.findall(r1,'csvtonet',re.S)
['csvtonet']

An example of M

>>> s ="""
... hello csvt
... csvt hello
... hello csvt hello
... csvt hehe
... """
>>> r = r"^csvt"
>>> re.findall(r,s)
[]
>>> re.findall(r,s,re.M)
['csvt', 'csvt']

An example of X, used when the regular expression spans several lines

>>> tel = r"""
... \d{3,4}
... -?
... \d{8}
... """
>>> re.findall(tel,'010-12345678')
[]
>>> re.findall(tel,'010-12345678',re.X)
['010-12345678']

Group matching

>>> email = r"\w{3}@\w+(\.com|\.cn)"
>>> re.match(email,'zzz@csvt.cn')
<_sre.SRE_Match object at 0x7f44192fca08>
>>> re.match(email,'zzz@csvt.com')
<_sre.SRE_Match object at 0x7f44192fc990>
>>> re.match(email,'zzz@csvt.org')
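# When the pattern contains a group, findall returns the text matched by the group rather than the whole match, so match is usually enough for a yes/no check
>>> re.findall(email,'zzz@csvt.com')
['.com']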

An example that takes advantage of grouping

>>> s ="""
... hhsdj dskj hello src=arvon yes jdasdfa
... adsfasd src=mo yes dasfasdf
... src=lover
... hello src=python asdfas
... """
>>> r1 = r"hello src=.+ "
>>> re.findall(r1,s)
['hello src=arvon yes ', 'hello src=python ']
>>> r1 = r"hello src=(.+) +yes"
>>> re.findall(r1,s)
['arvon']

Lesson 30: A Small Crawler

Download all the images from a Tieba thread or a Qzone space

#!/usr/bin/python
#coding:utf-8
import re
import urllib
def getHtml(url):
    page = urllib.urlopen(url)
    html = page.read()
    return html
def getImg(html):
    reg = r'src="(.*?\.jpg)" size'
    imgre = re.compile(reg)
    imglist = re.findall(imgre,html)
    #print imglist
    imgnum = 0
    for imgurl in imglist:
        urllib.urlretrieve(imgurl,'%s.jpg' % imgnum)
        imgnum +=1
#wantUrl = raw_input('Input URL: ')
wantUrl = 'http://tieba.baidu.com/p/4637471656'
html = getHtml(wantUrl)
getImg(html)

Lesson 31: Data Structures: Deep Copy and Shallow Copy

How Python uses memory: a shallow copy copies only the references, so the copy shares its nested objects with the original, whereas a deep copy copies the objects themselves

Behavior in practice

>>> import copy
>>> a = [1, 2, 3, ['a', 'b', 'c']]
>>> b=a
>>> c = copy.copy(a)
>>> a
[1, 2, 3, ['a', 'b', 'c']]
>>> b
[1, 2, 3, ['a', 'b', 'c']]
>>> c
[1, 2, 3, ['a', 'b', 'c']]
>>> id(a)
140296085124304
>>> id(b)
140296085124304
>>> id(c)
140296085139680
>>> a.append('d')
>>> a
[1, 2, 3, ['a', 'b', 'c'], 'd']
>>> b
[1, 2, 3, ['a', 'b', 'c'], 'd']
>>> c
[1, 2, 3, ['a', 'b', 'c']]
>>> id(a[0])
30618424
>>> id(c[0])
30618424
>>> a[3].append('d')
>>> a
[1, 2, 3, ['a', 'b', 'c', 'd'], 'd']
>>> c
[1, 2, 3, ['a', 'b', 'c', 'd']]
>>> d = copy.deepcopy(a)
>>> a
[1, 2, 3, ['a', 'b', 'c', 'd'], 'd']
>>> d
[1, 2, 3, ['a', 'b', 'c', 'd'], 'd']
>>> a[3].append('e')
>>> a
[1, 2, 3, ['a', 'b', 'c', 'd', 'e'], 'd']
>>> d
[1, 2, 3, ['a', 'b', 'c', 'd'], 'd']
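
The long session above boils down to three cases, summarized in this minimal sketch. The variable names are illustrative; the code runs under both Python 2 and Python 3.

```python
import copy

original = [1, 2, ['a', 'b']]
alias = original                   # plain assignment: same object, no copy
shallow = copy.copy(original)      # new outer list, shared inner list
deep = copy.deepcopy(original)     # new outer list and new inner list

original[2].append('c')            # mutate the nested list
original.append(3)                 # mutate the outer list

print(alias)
print(shallow)
print(deep)
```

The shallow copy sees the change to the nested list but not the appended 3; the deep copy sees neither.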

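Advanced Features

Lesson 32: Reading and Writing Files

Files are read and written with the open or file function

# usage: file_handler = open(filename, mode)
# mode
r  # read-only (default)
r+ # read and write
w  # write; the existing file is truncated first, and the file is created if it does not exist
w+ # read and write; the existing file is truncated first, and the file is created if it does not exist
a  # write; new content is appended to the end of the file, and the file is created if it does not exist
a+ # read and write; new content is appended to the end of the file, and the file is created if it does not exist
b  # open in binary mode; can be combined with r, w, a, +
U  # universal newlines: accepts \r, \n, and \r\n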

Open, read, and close a file with open and file

>>> fo = open('/data/python/tmp/file/test.txt')
>>> fo.read()
'hello world\n'
>>> fo.close()
>>> fo.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: I/O operation on closed file
>>> fo1 = file('/data/python/tmp/file/test.txt')
>>> fo1.read()
'hello world\n'
>>> fo1.close()
>>> fo1.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: I/O operation on closed file

Write to a file with write

>>> fnew = open('new.txt', 'w')
>>> fnew.write("hello world\nMy name is arvon\n")
>>> fnew.close()
>>> rnew = open('new.txt')
>>> rnew.read()
'hello world\nMy name is arvon\n'
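
The notes always call close() by hand. As an addition that is not part of the course material, the with statement closes the file automatically, even when an error occurs. The sketch below assumes a temporary directory so that it leaves nothing behind; the file name mirrors new.txt from the notes.

```python
import os
import tempfile

# Work in a temporary directory so the sketch leaves nothing behind.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'new.txt')

with open(path, 'w') as fnew:          # closed automatically when the block ends
    fnew.write("hello world\nMy name is arvon\n")

with open(path, 'a') as fnew:          # 'a' appends instead of truncating
    fnew.write("go go go\n")

with open(path) as rnew:               # the default mode is 'r'
    content = rnew.read()

os.remove(path)
os.rmdir(tmp_dir)
```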

Lesson 33: File Object Methods

File object methods

FileObject.close() # close the file
String = FileObject.readline([size])
List = FileObject.readlines([size])
String = FileObject.read([size])
FileObject.next()
FileObject.write(string)
FileObject.writelines(List)
FileObject.seek(offset, whence)
FileObject.flush()

Iterate over the lines of a file with for

>>> for i in open('new.txt'):
...     print i
...
hello world

My name is arvon

go go go

>>>

Read one line with readline; get a list of lines with readlines

>>> f1 = open('new.txt')
>>> f1.readline()
'hello world\n'
>>> f1.readline()
'My name is arvon\n'
>>> f1.readline()
'go go go\n'
>>> f1.readline()
''
>>> f1 = open('new.txt')
>>> f1.readlines()
['hello world\n', 'My name is arvon\n', 'go go go\n']

Use next to return the current line and move the pointer to the next line

# At the end of the file readline returns an empty string, whereas next raises StopIteration
>>> f1 = open('new.txt')
>>> f1.next()
'hello world\n'
>>> f1.next()
'My name is arvon\n'
>>> f1.next()
'go go go\n'
>>> f1.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>

Use writelines to write several lines at once; it can be more efficient than calling write repeatedly

>>> l = ['one\n', 'two\n', 'three\n']
>>> f1 = open('new.txt', 'a')
>>> f1.writelines(l)
>>> f1.close()
>>> f2 = open('new.txt')
>>> f2.read()
'hello world\nMy name is arvon\ngo go go\none\ntwo\nthree\n'

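Simple file pointer operations with seek

# Usage: FileObject.seek(offset, whence)
# whence=0: position the pointer offset bytes from the start of the file
# whence=1: move the pointer offset bytes from its current position
# whence=2: position the pointer relative to the end of the file (use a negative offset to move backward)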
>>> f2 = open('new.txt')
>>> f2.read()
'hello world\nMy name is arvon\ngo go go\none\ntwo\nthree\n'
>>> f2.read()
''
>>> f2.seek(0,0)
>>> f2.seek(0,0)
>>> f2.read()
'hello world\nMy name is arvon\ngo go go\none\ntwo\nthree\n'
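
The session above only uses seek(0,0). The sketch below exercises all three whence values; it assumes a temporary file and binary mode, because Python 3 restricts non-zero relative seeks on text-mode files. The variable names are illustrative.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()          # scratch file for the sketch
os.close(fd)

with open(path, 'wb') as f:
    f.write(b'hello world\n')

with open(path, 'rb') as f:            # binary mode: offsets are plain byte counts
    f.seek(6, 0)                       # 6 bytes from the start
    from_start = f.read(5)
    f.seek(-6, 2)                      # 6 bytes before the end
    from_end = f.read(5)
    f.seek(0, 0)                       # back to the start
    f.seek(3, 1)                       # 3 bytes forward from the current position
    from_current = f.read(2)
    position = f.tell()                # current offset after the last read

os.remove(path)
```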

Use flush to commit buffered writes, so that what has been written can be inspected without calling close

>>> f1=open('new.txt','w')
>>> f1.writelines(l)
>>> f1.flush()

Count the occurrences of hello

import re
f1 = open('a.t')
print len(re.findall('hello',f1.read()))
f1.close()
#----two
>>> f1 = open('a.t')
>>> f1.read()
'hello world\nhello hello world\n'
>>> re1 = r'(hello) '
>>> import re
>>> f1.seek(0,0)
>>> re.findall(re1,f1.read())
['hello', 'hello', 'hello']
>>> re.findall(re1,f1.read())
[]
>>> f1.seek(0,0)
>>> len(re.findall(re1,f1.read()))
3

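Replace file contents

f1 = open('a.t')
f2 = open('a2.t', 'w')  # the output file must be opened for writing
for i in f1:
    f2.write(i.replace('hello', 'csvt'))
f1.close()
f2.close()
#---2
fp1 = file('a.t', 'r+')  # r+ keeps the existing content; w+ would empty the file before it is read
s = fp1.read()
fp1.seek(0,0)
fp1.write(s.replace("hello", "csvt"))
fp1.truncate()  # the new text is shorter, so cut off the leftover tail
fp1.close()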

Lesson 34: The os Module

Common functions in the os module

mkdir(path[,mode=0777])
makedirs(name,mode=511)
rmdir(path)
removedirs(path)
listdir(path)
getcwd()
chdir(path)
walk(top,topdown=True, onerror=None)

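Reference examples

#coding:utf8
import os
# create a single directory = mkdir
os.mkdir('./mydir')
# create nested directories = mkdir -p
os.makedirs('./a/b/c')
# remove an empty directory
os.rmdir('./mydir')
# remove nested empty directories
os.removedirs('./a/b/c')
# list the files in the current directory, without descending into subdirectories = ls
os.listdir('.')
# get the current working directory = pwd
os.getcwd()
# change directory = cd (the directory must exist)
os.chdir('./a')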
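
The calls listed above can be tried safely in a scratch directory. The sketch below is one way to do that; it assumes tempfile for the scratch location, and the names base, top_level, walked_files, and left_over are made up for the illustration. It runs under both Python 2 and Python 3.

```python
import os
import tempfile

base = tempfile.mkdtemp()          # scratch directory for the sketch
old_cwd = os.getcwd()              # like pwd
os.chdir(base)                     # like cd

os.makedirs(os.path.join('a', 'b', 'c'))            # like mkdir -p
open(os.path.join('a', 'note.txt'), 'w').close()    # create an empty file

top_level = os.listdir('.')        # like ls; does not descend into subdirectories
walked_files = []
for dirpath, dirnames, filenames in os.walk('.'):   # recursive traversal
    for name in filenames:
        walked_files.append(os.path.join(dirpath, name))

os.remove(os.path.join('a', 'note.txt'))
os.removedirs(os.path.join('a', 'b', 'c'))   # removes c, then b and a once they are empty
left_over = os.listdir('.')

os.chdir(old_cwd)
os.rmdir(base)                     # remove the now-empty scratch directory
```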

Lesson 35: Directory Traversal

import os
def dirList(path):
    filelist = os.listdir(path)
    fpath = os.getcwd()
    for filename in filelist:
        filepath = os.path.join(fpath,path,filename)
        if os.path.isdir(filepath):
            dirList(filepath)
        else:
            print filepath
dirList('testdir')

Recursive traversal with os.walk

import os
allDate = os.walk('testdir')
for dirpath,zidir,filenames in allDate:
    for eachfile in filenames:
        eachfilepath = os.path.join(dirpath,eachfile)
        print  eachfilepath

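Lesson 36: Exception Handling

Common Python exceptions

AssertionError # an assert statement failed
AttributeError    # tried to access an attribute that the object does not have
IOError    # input/output error, usually a file that cannot be opened
ImportError    # a module or package cannot be imported, usually a path problem
IndentationError   # syntax error: the code is not indented correctly
IndexError # the index is outside the bounds of the sequence
KeyError   # tried to access a key that is not in the dictionary
KeyboardInterrupt  # Ctrl-C was pressed
NameError  # used a variable that has not been bound to an object
SyntaxError    # the code is not valid Python syntax
TypeError  # the object passed in is not of the required type
UnboundLocalError  # tried to access a local variable before it was assigned, usually because a global variable of the same name exists
ValueError # an unexpected value was passed in, even though its type is correct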

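Exceptions and raising them: inside try, once a statement raises an error, the statements after it in the try block are not executed

#coding:utf8
filename = raw_input("File to operate on: ")
try:
   open(filename)
   print filename
except IOError,msg:
    print "The file does not exist"
except NameError,msg:
    pass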

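The finally clause: no matter which exception is caught, this code always runs. Typical uses are closing files, releasing locks, and returning database connections to the pool.

f = None
try:
    f = open(filename)
    print hello  # hello is undefined, so this raises NameError; finally still runs
except IOError,msg:
    pass
finally:
    if f:  # f is still None if open() failed
        f.close()
    print "ok"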
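
The order in which except and finally run can be made visible by recording events in a list. This is a minimal sketch written with syntax that works under both Python 2 and Python 3; the function name read_first_line, the events list, and the non-existent path are all made up for the illustration.

```python
events = []

def read_first_line(path):
    f = None
    try:
        f = open(path)
        return f.readline()
    except IOError:                  # in Python 3, IOError is an alias of OSError
        events.append('missing')
        return None
    finally:
        if f is not None:            # open() may have failed before f was bound
            f.close()
        events.append('cleanup')     # runs on success and on failure

result = read_first_line('/no/such/dir/no_such_file.txt')
print(result, events)
```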

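Use raise to throw an exception. The exception must be an existing exception class, either a built-in one or a class you have defined yourself by subclassing Exception; an arbitrary undefined name cannot be raised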

filename = raw_input("something: ")
if filename == "hello":
    raise TypeError("nothing!!!!")
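
Besides the built-in types, an exception class can be defined by subclassing Exception and then raised like any other. The sketch below assumes a hypothetical class name, InvalidFilenameError, and a hypothetical helper, check_filename; neither comes from the course.

```python
class InvalidFilenameError(Exception):
    """Hypothetical custom exception used only for this illustration."""

def check_filename(filename):
    if filename == "hello":
        raise InvalidFilenameError("nothing!!!!")
    return filename

try:
    check_filename("hello")
    message = None
except InvalidFilenameError as err:
    message = str(err)               # the text passed to the exception

print(message)
```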

Lesson 37: The MySQL Database Module

Install the MySQL-python module

yum install MySQL-python

Using the MySQLdb module in interactive mode

>>> import MySQLdb
>>> conn = MySQLdb.connect(user='root',passwd='admin',host='127.0.0.1')
>>> cur = conn.cursor()
>>> conn.select_db('test')
>>> cur.execute("insert into mytable(id,username) value(2,'mo');")
1L
>>> sqli = "insert into mytable(id,username) value(%s, %s);"
>>> cur.execute(sqli,(3,'lover'))
>>> sqlim = "insert into mytable(id,username) values(%s,%s);"
>>> cur.executemany(sqli,[(4,'haha'),(5,'papa'),(6,'dada')])
3L
>>> cur.execute('delete from mytable where id=4')
1L
>>> cur.execute("update mytable set username='gogo' where id=5")
1L
>>> cur.execute("select * from mytable")
6L
>>> cur.fetchone()
(1L, 'arvon')
>>> cur.execute("select * from mytable")
6L
>>> cur.fetchone()
(1L, 'arvon')
>>> cur.fetchone()
(1L, 'arvon')
>>> cur.scroll(0,'absolute')
>>> cur.fetchone()
(1L, 'arvon')
>>> cur.fetchmany(cur.execute("select* from mytable"))
((1L, 'arvon'), (2L, 'mo'), (3L, 'lover'), (5L, 'gogo'), (6L, 'dada'), (7L, 'dudu'))
>>> cur.close()
>>> conn.close()

An example for use in a script

#coding:utf-8
import MySQLdb
#mysql>create table mytable (id int , username char(20));
conn = MySQLdb.connect(user='root',passwd='admin',host='127.0.0.1')
# connect to the database server
cur = conn.cursor()
# create a cursor on the connection
conn.select_db('test')
# select the test database
cur.execute("insert into mytable(id,username) value(2,'mo');")
# insert one row
sqlim = "insert into mytable(id,username) values(%s,%s);"
cur.executemany(sqlim,[(4,'haha'),(5,'papa'),(6,'dada')])
# use a parameterized statement to insert several rows at once; the same approach works for update and delete
cur.execute('delete from mytable where id=4')
# delete one row
cur.execute("update mytable set username='gogo' where id=5")
# update one row
cur.execute("select * from mytable")
cur.fetchone()
cur.scroll(0,'absolute')
cur.fetchmany()
# query: execute the select first (it returns the number of rows), then fetch the rows one by one with fetchone; afterwards scroll can reposition the cursor (here back to the start), and fetchmany returns rows as a tuple of tuples
cur.fetchmany(cur.execute("select* from mytable"))
# passing the row count returned by execute to fetchmany fetches all rows in one call
cur.close()
# close the cursor
conn.close()
# close the database connection

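Lesson 38: Object-Oriented Programming: Classes and Objects

Python treats every type as an object

Classes and objects

Procedural and object-oriented programming
Procedural programming: function-based programs, C programs, and so on
Object-oriented programming: C++, Java, Python, and so on

Classes and objects are two key concepts in object orientation
Class: an abstraction of a thing, such as a car model
Object: an instance of a class, such as a sedan or a bus
Illustration: a car model abstracts the characteristics and behavior of cars, and can then be instantiated as an actual car.

Python class definition

Defining a Python class: use the class keyword to define a class; by convention the class name starts with a capital letter. When the type a programmer needs cannot be expressed with the simple types, a class has to be created. A class combines the variables and functions it needs; this bundling is also called encapsulation.

Structure of a Python class

class ClassName: … member variables … member functions (with at least one parameter, self) …

A simple example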

>>> class Test:
...     first = 123
...     second = 456
...     def f(self):
...         return 'test'
...
>>> dog = Test()
>>> dog.f()
'test'
>>> dog.first
123

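Creating objects: creating an object is called instantiation. Once an object has been created it has three aspects: its handle, its attributes, and its methods. The handle distinguishes one object from another; the object's attributes and methods correspond to the class's member variables and member functions.

A small example

if __name__ == "__main__": … myClass1 = MyClass()

OK, that wraps up the Python review; the next stage is advanced Docker.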
