
pandas data structures: Series/DataFrame; Python functions: range/arange

Published: 2025/7/14


1. Series

A Series is an array-like data structure that also carries labels, i.e. an index.

1.1 The code below builds the simplest possible Series object. No index is passed, so the default index (0 to N-1) is used.

# Import Series and DataFrame
In [16]: from pandas import Series, DataFrame
In [17]: import pandas as pd
In [18]: ser1 = Series([1, 2, 3, 4])
In [19]: ser1
Out[19]:
0    1
1    2
2    3
3    4
dtype: int64

1.2 To create a Series with a specific index:

# Pass a list as the index
In [23]: ser2 = Series(range(4), index=["a", "b", "c", "d"])
In [24]: ser2
Out[24]:
a    0
b    1
c    2
d    3
dtype: int64

1.3 A Series can also be created from a dict.

In [45]: sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
In [46]: ser3 = Series(sdata)
# In this (older) pandas session the Series built from a dict comes out sorted by index.
# pandas 0.23+ on Python 3.6+ keeps the dict's insertion order instead.
In [47]: ser3
Out[47]:
Ohio      35000
Oregon    16000
Texas     71000
Utah       5000
dtype: int64
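Whether a Series built from a dict comes out sorted depends on the pandas version. A minimal sketch, assuming a recent pandas on Python 3.7+, that makes the ordering explicit with sort_index() rather than relying on the constructor:

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
ser3 = pd.Series(sdata)

# On recent pandas the index follows the dict's insertion order.
print(list(ser3.index))

# sort_index() returns a new Series ordered by label, whatever the version.
sorted_ser = ser3.sort_index()
print(list(sorted_ser.index))
```

Calling sort_index() explicitly keeps the result the same across pandas versions.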

When building a Series from a dict you can also pass an index. Any index label that has no matching key in the dict gets a missing value (NaN/NA). The functions pandas.isnull and pandas.notnull report which labels have no value.

In [48]: states = ['California', 'Ohio', 'Oregon', 'Texas']
In [49]: ser4 = Series(sdata, index=states)
In [50]: ser4
Out[50]:
California        NaN
Ohio          35000.0
Oregon        16000.0
Texas         71000.0
dtype: float64
# Check which values are missing
In [51]: pd.isnull(ser4)
Out[51]:
California     True
Ohio          False
Oregon        False
Texas         False
dtype: bool

In [52]: pd.notnull(ser4)
Out[52]:
California    False
Ohio           True
Oregon         True
Texas          True
dtype: bool
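Once the missing labels are known, the usual next step is to count, drop, or fill them. A small sketch of that follow-up; the variable names are illustrative only, and the fill value 0 is an arbitrary choice for the example:

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
states = ['California', 'Ohio', 'Oregon', 'Texas']
ser = pd.Series(sdata, index=states)  # 'California' has no entry, so it becomes NaN

missing_labels = ser[ser.isnull()].index.tolist()  # labels whose value is missing
n_missing = int(ser.isnull().sum())                # each True counts as 1
cleaned = ser.dropna()                             # new Series without missing entries
filled = ser.fillna(0)                             # new Series with NaN replaced by 0

print(missing_labels, n_missing)
print(cleaned)
print(filled)
```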

1.4 Accessing the elements and the index of a Series:

# Element with label "a"
In [25]: ser2["a"]
Out[25]: 0
# Elements with labels "a" and "c"
In [26]: ser2[["a", "c"]]
Out[26]:
a    0
c    2
dtype: int64
# All values
In [27]: ser2.values
Out[27]: array([0, 1, 2, 3])
# The index
In [28]: ser2.index
Out[28]: Index(['a', 'b', 'c', 'd'], dtype='object')
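The bracket syntax above looks elements up by label. A short sketch of the explicit accessors, loc for labels and iloc for positions, which avoid any ambiguity about which of the two is meant:

```python
import pandas as pd

ser2 = pd.Series(range(4), index=["a", "b", "c", "d"])

by_label = ser2.loc["a"]         # label-based lookup
by_position = ser2.iloc[0]       # position-based lookup
subset = ser2.loc[["a", "c"]]    # a list of labels returns a Series
label_slice = ser2.loc["a":"c"]  # label slices include the end label
pos_slice = ser2.iloc[0:2]       # positional slices exclude the end
values = ser2.to_numpy()         # the underlying NumPy array

print(by_label, by_position)
print(label_slice)
print(pos_slice)
```

Note the difference between the two slices: the label slice includes "c", while the positional slice stops before position 2.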

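1.5 Simple operations

A pandas Series supports NumPy array operations (filtering with a boolean array, scalar multiplication, and applying math functions), and the link between each index label and its value is preserved.

# The outputs below assume numpy has been imported as np and that ser2["a"]
# was set to 64 earlier in the session (that assignment is not shown).
In [34]: ser2[ser2 > 2]
Out[34]:
a    64
d     3
dtype: int64
In [35]: ser2 * 2
Out[35]:
a    128
b      2
c      4
d      6
dtype: int64
In [36]: np.exp(ser2)
Out[36]:
a    6.235149e+27
b    2.718282e+00
c    7.389056e+00
d    2.008554e+01
dtype: float64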

1.6 Automatic alignment of Series

An important feature of Series is automatic alignment, which is easiest to understand from an example. When different Series objects are combined in an operation, their values are matched by index label before the calculation.

# Contents of ser3
In [60]: ser3
Out[60]:
Ohio      35000
Oregon    16000
Texas     71000
Utah       5000
dtype: int64
# Contents of ser4
In [61]: ser4
Out[61]:
California        NaN
Ohio          35000.0
Oregon        16000.0
Texas         71000.0
dtype: float64
# Elements with the same index label are added together
In [62]: ser3 + ser4
Out[62]:
California         NaN
Ohio           70000.0
Oregon         32000.0
Texas         142000.0
Utah               NaN
dtype: float64
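In the alignment example, any label present in only one of the two Series ends up as NaN. A minimal sketch of one way to keep such values, using Series.add with fill_value; the choice of 0 as the fill value is an assumption made for illustration:

```python
import pandas as pd

sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
ser3 = pd.Series(sdata)
ser4 = pd.Series(sdata, index=['California', 'Ohio', 'Oregon', 'Texas'])

# Plain addition: a label missing on either side gives NaN.
total = ser3 + ser4

# add() with fill_value treats a value missing on ONE side as 0.
# A label missing on BOTH sides (California) still gives NaN.
total_filled = ser3.add(ser4, fill_value=0)

print(total)
print(total_filled)
```

Here Utah keeps its value of 5000 in total_filled, while California remains NaN because neither Series has a value for it.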

1.7 Naming

Both the Series object itself and its index have a name attribute.

In [64]: ser4.index.name = "state"
In [65]: ser4.name = "population"
In [66]: ser4
Out[66]:
state
California        NaN
Ohio          35000.0
Oregon        16000.0
Texas         71000.0
Name: population, dtype: float64

Reposted from: http://www.cnblogs.com/linux-wangkun/p/5903380.html

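DataFrame

Selecting rows or columns from a pandas DataFrame:

import numpy as np
import pandas as pd
from pandas import Series, DataFrame

ser = Series(np.arange(3.))
data = DataFrame(np.arange(16).reshape(4, 4), index=list('abcd'), columns=list('wxyz'))

data['w']           # select column 'w' dict-style; returns a Series
data.w              # select column 'w' as an attribute; returns a Series
data[['w']]         # select column 'w'; returns a DataFrame
data[['w', 'z']]    # select columns 'w' and 'z'
data[0:2]           # rows 1 to 2; half-open, the start is included and the end is not
data[1:2]           # row 2 (counting from 0) as a one-row DataFrame; a slice is required,
                    # data[1] raises an error
# data.ix[1:2]      # older third way to get row 2 as a DataFrame, same as data[1:2];
                    # .ix was removed in pandas 1.0, use data.iloc[1:2]
data['a':'b']       # slice by index labels; returns a DataFrame **closed on both ends**,
                    # i.e. the end label is included

# ---- Removed in newer pandas versions; use iloc instead ----
# data.irow(0)        # first row of data          (now data.iloc[0])
# data.icol(0)        # first column of data       (now data.iloc[:, 0])
# ser.iget_value(0)   # first element of ser       (now ser.iloc[0])
# ser.iget_value(-1)  # last element of ser        (now ser.iloc[-1]); a Series with an
                      # integer index cannot use ser[-1] for this because it is ambiguous
# ------------------------------------------------------------

data.head()         # first rows of data, five by default; data.head(10) for the first ten
data.tail()         # last rows of data, five by default; data.tail(10) for the last ten
data.iloc[-1]       # last row of the DataFrame; returns a Series
data.iloc[-1:]      # last row of the DataFrame; returns a DataFrame
data.loc['a', ['w', 'x']]   # row 'a', columns 'w' and 'x'; for known row and column labels
data.iat[1, 1]      # second row, second column; for known row and column positions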
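A runnable sketch of the selections listed above using only iloc, loc and iat. It assumes a pandas version where ix, irow, icol and iget_value are no longer available (ix was removed in pandas 1.0):

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape(4, 4),
                    index=list('abcd'), columns=list('wxyz'))
ser = pd.Series(np.arange(3.))

first_row = data.iloc[0]        # stands in for data.irow(0); a Series
first_col = data.iloc[:, 0]     # stands in for data.icol(0); a Series
second_row_df = data.iloc[1:2]  # stands in for data.ix[1:2]; a one-row DataFrame
label_rows = data.loc['a':'b']  # label slice, end label included
cell = data.iat[1, 1]           # scalar at row position 1, column position 1
last = ser.iloc[-1]             # stands in for ser.iget_value(-1)

print(first_row.tolist(), first_col.tolist())
print(second_row_df)
print(cell, last)
```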

Reposted from: https://blog.csdn.net/xiaodongxiexie/article/details/53108959

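DataFrame sorting

The old ways of sorting by column values, sort and sort_index with a by argument, are obsolete and raise an error when called. (sort_index itself still exists and sorts by index labels.)

The sort method is simply not found.

Use the sort_values method to sort by values instead.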
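A minimal sketch of sorting with sort_values. The DataFrame and its column names are made up for illustration, reusing the population figures from the Series examples:

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    'state': ['Ohio', 'Texas', 'Oregon', 'Utah'],
    'population': [35000, 71000, 16000, 5000],
})

by_pop = df.sort_values(by='population', ascending=False)  # largest first
by_state = df.sort_values(by=['state', 'population'])      # sort on several columns
restored = by_pop.sort_index()                              # back to index order

print(by_pop)
print(by_state)
print(restored)
```

sort_values returns a new DataFrame and leaves df unchanged unless inplace=True is passed.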

range in Python and the arange function in the NumPy package

range() function

Signature: range(start, stop[, step]) -> range object. Generates a sequence over the range given by start and stop, using the step size step.
Parameters: start: counting begins at start. The default is 0. For example, range(5) is equivalent to range(0, 5).
            stop: counting ends at stop, which is not included. For example, range(0, 5) yields 0, 1, 2, 3, 4 without 5.
            step: the gap between consecutive values. The default is 1. For example, range(0, 5) is equivalent to range(0, 5, 1).
The function returns a range object.
Example:

>>> range(0, 5)                        # produces a range object, not [0, 1, 2, 3, 4]
range(0, 5)
>>> c = [i for i in range(0, 5)]       # from 0 to 4, 5 not included, default step of 1
>>> c
[0, 1, 2, 3, 4]
>>> c = [i for i in range(0, 5, 2)]    # step set to 2
>>> c
[0, 2, 4]
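Because range returns a lazy range object rather than a list, it still supports the usual sequence operations without building the list first. A short sketch of that behaviour in Python 3:

```python
r = range(0, 10, 3)      # represents 0, 3, 6, 9 without storing them

as_list = list(r)        # materialize the values when a real list is needed
length = len(r)          # length is computed, not counted
third = r[2]             # indexing works like on a list
contains = 6 in r        # membership test for integers is computed arithmetically
countdown = list(range(5, 0, -1))  # a negative step counts downwards

print(as_list, length, third, contains)
print(countdown)
```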

If you need to generate [0.  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9]:

>>> range(0, 1, 0.1)    # the step of range cannot be a float
Traceback (most recent call last):
  File "<pyshell#5>", line 1, in <module>
    range(0, 1, 0.1)
TypeError: 'float' object cannot be interpreted as an integer
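Without NumPy, one common workaround is to keep range on integers and scale each value afterwards. A minimal sketch in plain Python:

```python
# Integer range, then divide: each value is computed independently,
# so rounding errors do not accumulate from one element to the next.
tenths = [i / 10 for i in range(10)]
print(tenths)

# For contrast, repeatedly adding 0.1 drifts away from the exact tenths.
acc = 0.0
for _ in range(3):
    acc += 0.1
print(acc == 0.3)
```

The second part shows why scaling an integer range is usually preferable to accumulating a float step.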

arange() function

Signature: arange([start,] stop[, step,], dtype=None). Generates an ndarray over the range given by start and stop, using the step size step.
dtype : dtype
        The type of the output array. If `dtype` is not given, infer the data
        type from the other input arguments.

>>> import numpy as np
>>> np.arange(3)
array([0, 1, 2])
>>> np.arange(3.0)
array([0., 1., 2.])
>>> np.arange(3, 7)
array([3, 4, 5, 6])
>>> np.arange(3, 7, 2)
array([3, 5])
>>> np.arange(0, 1, 0.1)
array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
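With a float step, the number of elements arange returns depends on floating-point rounding, so np.linspace, which takes the number of points instead of a step, is often suggested for such cases. A small sketch comparing the two:

```python
import numpy as np

a = np.arange(0, 1, 0.1)                   # step given, stop excluded
b = np.linspace(0, 1, 11)                  # 11 points, stop included
c = np.linspace(0, 1, 10, endpoint=False)  # 10 points, stop excluded like arange
d = np.arange(10) / 10                     # integer arange, scaled afterwards

print(a)
print(b)
print(np.allclose(a, c), np.allclose(c, d))
```

With linspace the length of the result is fixed by the caller, which avoids off-by-one surprises when the step does not divide the interval exactly.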

Reposted from: http://blog.csdn.net/qianwenhong/article/details/41414809


Reposted from: https://www.cnblogs.com/xianhan/p/9429849.html
