Tuesday, November 8, 2016

Python_Note13

Python_pandas

Intro to Data Structures

http://pandas.pydata.org/pandas-docs/stable/dsintro.html#intro-to-data-structures

Series

Series is a one-dimensional labeled array capable of holding any data type (integers, strings, floating point numbers, Python objects, etc.). The axis labels are collectively referred to as the index. The basic method to create a Series is to call:
s = pd.Series(data, index=index)
Here, data can be many different things:
  • a Python dict
  • an ndarray
  • a scalar value (like 5)
In [3]: s = pd.Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'])

In [4]: s
Out[4]: 
a   -2.7828
b    0.4264
c   -0.6505
d    1.1465
e   -0.6631
dtype: float64

In [5]: s.index
Out[5]: Index([u'a', u'b', u'c', u'd', u'e'], dtype='object')

In [6]: pd.Series(np.random.randn(5))
Out[6]: 
0    0.2939
1   -0.4049
2    1.1665
3    0.8420
4    0.5398
dtype: float64
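
The sessions above cover the ndarray case. The dict and scalar cases from the list can be sketched as below. This is a minimal illustration, not taken from the original notes; the variable names (d, from_dict, reordered, from_scalar) are made up for the example.

```python
import numpy as np
import pandas as pd

# From a dict: the keys become the index labels.
d = {'a': 0.0, 'b': 1.0, 'c': 2.0}
from_dict = pd.Series(d)

# From a dict with an explicit index: values are looked up by label,
# and labels missing from the dict ('d' here) are filled with NaN.
reordered = pd.Series(d, index=['b', 'c', 'd', 'a'])

# From a scalar: an index is required, and the value is repeated
# once for every label in it.
from_scalar = pd.Series(5.0, index=['a', 'b', 'c'])

print(from_dict)
print(reordered)
print(from_scalar)
```

When data is a dict and an index is passed, the index decides both the order and which labels appear in the result.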
In[87]: s = pd.Series([1,2,3], index=['a','b','c'])

In[88]: s
Out[88]:
a    1
b    2
c    3
dtype: int64

In[85]: s = pd.Series(['a','b','c'], index=[1,2,3])
In[86]: s
Out[86]:
1    a
2    b
3    c
dtype: object

In[82]: s = pd.Series([1,2,3], index=[1,2,3])
In[83]: s
Out[83]:
1    1
2    2
3    3
dtype: int64
In [16]: s['a']
Out[16]: -2.7827595933769937

In [17]: s['e'] = 12.

In [18]: s
Out[18]: 
a    -2.7828
b     0.4264
c    -0.6505
d     1.1465
e    12.0000
dtype: float64

In [19]: 'e' in s
Out[19]: True

In [20]: 'f' in s
Out[20]: False

pandas.read_csv

Read a CSV (comma-separated values) file into a DataFrame.
It also supports optionally iterating over the file or breaking it into chunks.
Additional help can be found in the online docs for IO Tools.
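
A small sketch of both modes, reading everything at once and reading in chunks. The CSV text and the column names (name, score) are invented for the example, and io.StringIO stands in for a real file path so the snippet is self-contained.

```python
import io
import pandas as pd

# Hypothetical CSV content; in practice this would be a path on disk.
csv_text = "name,score\nann,1.5\nbob,2.0\ncid,3.5\n"

# Read the whole file into a single DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Read the same data in chunks of at most 2 rows each.
# With chunksize set, read_csv returns an iterator of DataFrames.
chunks = list(pd.read_csv(io.StringIO(csv_text), chunksize=2))
chunk_sizes = [len(chunk) for chunk in chunks]

print(df)
print(chunk_sizes)
```

Chunked reading is mainly useful when the file is too large to load in one go; each chunk is an ordinary DataFrame.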

pandas.DataFrame.dropna

DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)
Return an object with labels on the given axis omitted where any (how='any') or all (how='all') of the data are missing.
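
A minimal sketch of how the how and subset parameters change which rows are dropped. The frame and its column names (x, y) are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 0 is complete, rows 1 and 2 have one NaN,
# row 3 is entirely NaN.
df = pd.DataFrame({
    'x': [1.0, np.nan, 3.0, np.nan],
    'y': [4.0, 5.0, np.nan, np.nan],
})

# Default (axis=0, how='any'): drop rows containing at least one NaN.
any_missing = df.dropna()

# how='all': drop only rows where every value is NaN.
all_missing = df.dropna(how='all')

# subset: only look at the listed columns when deciding what to drop.
subset_x = df.dropna(subset=['x'])

print(any_missing)
print(all_missing)
print(subset_x)
```

None of these calls modify df, since inplace defaults to False; each returns a new DataFrame.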

pandas.DataFrame.iloc

Purely integer-location based indexing for selection by position.
.iloc[] is primarily integer position based (from 0 to length-1 of the axis), but may also be used with a boolean array.
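
A short sketch of position-based selection with .iloc. The frame, its string index, and the variable names are invented for the example. The string labels are there to show that .iloc ignores labels and counts positions from 0.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-integer index.
df = pd.DataFrame(
    {'x': [10, 20, 30, 40], 'y': [1, 2, 3, 4]},
    index=['a', 'b', 'c', 'd'],
)

# A single integer selects one row by position and returns a Series.
first_row = df.iloc[0]

# Slices are by position and exclude the end: rows 1 and 2 of column 0.
block = df.iloc[1:3, 0]

# A boolean array of the same length as the axis also works.
mask = np.array([True, False, True, False])
picked = df.iloc[mask]

print(first_row)
print(block)
print(picked)
```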