Python Data Processing: Basic Usage of pd.Series()

Author: 流年里不舍的执着

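A Series is a labeled one-dimensional array that can hold integers, floats, strings, Python objects, and other data types. Its axis labels are collectively called the index. This article introduces the basic usage of pd.Series() for data processing in Python.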

1. Introduction to Series

pandas provides two main data structures: 1. Series 2. DataFrame

A Series is a one-dimensional array built on NumPy's ndarray structure.

Series([data, index, dtype, name, copy, …])    
# One-dimensional ndarray with axis labels (including time series).
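As a minimal sketch of how the constructor arguments listed above fit together, the following passes data, index, dtype, and name explicitly. The labels and values are made up for illustration.

```python
import pandas as pd

# data: the values; index: one label per value;
# dtype: the element type to store; name: a label for the Series itself.
prices = pd.Series(
    data=[1.5, 2.0, 3.25],
    index=["apple", "pear", "plum"],  # hypothetical labels
    dtype="float64",
    name="price",
)

print(prices)
print(prices.name, prices.dtype)
```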

2. Creating a Series

import pandas as pd
import numpy as np

1.pd.Series([list],index=[list])

The data argument is a list. index is optional; if it is omitted, the index defaults to integers starting from 0.

obj = pd.Series([4, 7, -5, 3, 7, np.nan])
obj

Output:

0    4.0
1    7.0
2   -5.0
3    3.0
4    7.0
5    NaN
dtype: float64
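The sketch below contrasts the default integer index with an explicit one, and shows that the index must have one label per value. The letter labels are chosen only for illustration.

```python
import numpy as np
import pandas as pd

values = [4, 7, -5, 3, 7, np.nan]

# Without index, pandas assigns 0, 1, 2, ...
default_idx = pd.Series(values)

# With index, each value gets the label at the same position.
labeled = pd.Series(values, index=list("abcdef"))

print(default_idx.index)
print(labeled.index)
print(labeled.dtype)  # float64, because NaN is a float

# The index length must match the data length.
try:
    pd.Series([1, 2, 3], index=["a", "b"])
    length_error = False
except ValueError:
    length_error = True
print(length_error)
```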

2.pd.Series(np.arange())

arr = np.arange(6)
s = pd.Series(arr)
s

Output:

0    0
1    1
2    2
3    3
4    4
5    5
dtype: int32
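The integer dtype produced by np.arange depends on the platform and NumPy version, which is why the output above shows int32 while other environments show int64. A small sketch, assuming a fixed width is wanted, is to request the dtype explicitly:

```python
import numpy as np
import pandas as pd

# Ask NumPy for 64-bit integers so the result does not depend on the platform.
arr64 = np.arange(6, dtype=np.int64)
s64 = pd.Series(arr64)

print(s64.dtype)
```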

3.pd.Series({dict})

d = {'a':10,'b':20,'c':30,'d':40,'e':50}
s = pd.Series(d)
s

Output:

a    10
b    20
c    30
d    40
e    50
dtype: int64
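When a dict is combined with an explicit index, pandas looks each label up among the dict keys. The sketch below uses a shortened dict and a deliberately missing key 'z' to show the resulting NaN.

```python
import pandas as pd

d = {'a': 10, 'b': 20, 'c': 30}

# Labels are taken from index, in that order; keys not listed are dropped,
# and labels with no matching key ('z' here) become NaN.
picked = pd.Series(d, index=['c', 'a', 'z'])

print(picked)
```

Because NaN is a float, the values are stored as float64 rather than int64.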

A Series can also be obtained from a single row or a single column of a DataFrame.
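A minimal sketch of both cases, using a small made-up DataFrame (the column and row labels are hypothetical):

```python
import pandas as pd

df = pd.DataFrame(
    {"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]},
    index=["r1", "r2", "r3"],
)

# Selecting one column returns a Series named after the column.
col = df["x"]

# Selecting one row by label returns a Series indexed by the column names.
row = df.loc["r2"]

print(col)
print(row)
```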

3. Basic Series Attributes

obj.values
# array([ 4.,  7., -5.,  3.,  7., nan])
obj.index
# RangeIndex(start=0, stop=6, step=1)
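Beyond values and index, a few other attributes are often useful when inspecting a Series. This sketch rebuilds the same data with an illustrative name:

```python
import numpy as np
import pandas as pd

obj = pd.Series([4, 7, -5, 3, 7, np.nan], name="score")  # name is illustrative

print(obj.dtype)    # element type
print(obj.shape)    # one-element tuple holding the length
print(obj.size)     # number of elements, NaN included
print(obj.name)     # label of the Series itself
print(obj.hasnans)  # True if any value is NaN
print(obj.count())  # number of non-NaN values
```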

4. Indexing
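As a minimal sketch of common ways to index a Series (the data here is made up for illustration), .loc selects by label, .iloc selects by position, and a boolean Series filters values:

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=list("abcd"))

by_label = s.loc["b"]          # single value by label
by_position = s.iloc[0]        # single value by position

label_slice = s.loc["b":"c"]   # label slices include both endpoints
pos_slice = s.iloc[1:3]        # position slices exclude the end

filtered = s[s > 20]           # boolean mask keeps matching rows

print(by_label, by_position)
print(label_slice)
print(pos_slice)
print(filtered)
```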

5. Computation and Descriptive Statistics

Series.value_counts: returns a Series containing counts of unique values. NaN is excluded by default, and the result is sorted by count in descending order.

index = ['Bob', 'Steve', 'Jeff', 'Ryan', 'Jeff', 'Ryan'] 
obj = pd.Series([4, 7, -5, 3, 7, np.nan],index = index)
obj.value_counts()

Output:

 7.0    2
 3.0    1
-5.0    1
 4.0    1
dtype: int64
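Two commonly used options of value_counts are sketched below on the same values: dropna=False counts NaN as its own entry, and normalize=True returns proportions instead of counts.

```python
import numpy as np
import pandas as pd

obj = pd.Series([4, 7, -5, 3, 7, np.nan])

# Include NaN as a separate entry in the counts.
counts_with_nan = obj.value_counts(dropna=False)

# Relative frequencies among the non-NaN values.
shares = obj.value_counts(normalize=True)

print(counts_with_nan)
print(shares)
```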

6. Sorting

Series.sort_values

Series.sort_values(self, axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last')

Parameters:

Parameter: Description
axis: {0 or 'index'}, default 0. Axis to direct sorting. The value 'index' is accepted for compatibility with DataFrame.sort_values.
ascending: bool, default True. If True, sort values in ascending order, otherwise descending.
inplace: bool, default False. If True, perform operation in-place.
kind: {'quicksort', 'mergesort' or 'heapsort'}, default 'quicksort'. Choice of sorting algorithm. See also numpy.sort() for more information. 'mergesort' is the only stable algorithm.
na_position: {'first' or 'last'}, default 'last'. Argument 'first' puts NaNs at the beginning, 'last' puts NaNs at the end.

Returns:

Series: Series ordered by values.

obj.sort_values()

Output:

Jeff    -5.0
Ryan     3.0
Bob      4.0
Steve    7.0
Jeff     7.0
Ryan     NaN
dtype: float64
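The ascending and na_position parameters described above can be sketched on the same data as follows. sort_values returns a new Series and leaves obj unchanged unless inplace=True is passed.

```python
import numpy as np
import pandas as pd

index = ['Bob', 'Steve', 'Jeff', 'Ryan', 'Jeff', 'Ryan']
obj = pd.Series([4, 7, -5, 3, 7, np.nan], index=index)

# Largest values first; NaN still goes last by default.
desc = obj.sort_values(ascending=False)

# Ascending order, but with NaN moved to the front.
nan_first = obj.sort_values(na_position='first')

print(desc)
print(nan_first)
```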

Series.rank

Series.rank(self, axis=0, method='average', numeric_only=None, na_option='keep', ascending=True, pct=False)

Parameters:

Parameter: Description
axis: {0 or 'index', 1 or 'columns'}, default 0. Index to direct ranking.
method: {'average', 'min', 'max', 'first', 'dense'}, default 'average'. How to rank the group of records that have the same value (i.e. ties). average: average rank of the group; min: lowest rank in the group; max: highest rank in the group; first: ranks assigned in the order they appear in the array; dense: like 'min', but rank always increases by 1 between groups.
numeric_only: bool, optional. For DataFrame objects, rank only numeric columns if set to True.
na_option: {'keep', 'top', 'bottom'}, default 'keep'. How to rank NaN values. keep: assign NaN rank to NaN values; top: assign smallest rank to NaN values if ascending; bottom: assign highest rank to NaN values if ascending.
ascending: bool, default True. Whether or not the elements should be ranked in ascending order.
pct: bool, default False. Whether or not to display the returned rankings in percentile form.

Returns:

same type as caller: Return a Series or DataFrame with data ranks as values.

# obj.rank()            # ranks in ascending order (smallest value gets rank 1); NaN stays NaN
obj.rank(method='dense')
# obj.rank(method='min')
# obj.rank(method='max')
# obj.rank(method='first')

Output:

Bob      3.0
Steve    4.0
Jeff     1.0
Ryan     2.0
Jeff     4.0
Ryan     NaN
dtype: float64
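The methods differ only in how they treat ties, here the two 7s. As a sketch on the same data, the loop below collects the ranks for each method so they can be compared side by side; the last line shows ranking in descending order.

```python
import numpy as np
import pandas as pd

index = ['Bob', 'Steve', 'Jeff', 'Ryan', 'Jeff', 'Ryan']
obj = pd.Series([4, 7, -5, 3, 7, np.nan], index=index)

# Rank the same values with each tie-breaking method.
ranks = {}
for method in ['average', 'min', 'max', 'first', 'dense']:
    ranks[method] = obj.rank(method=method).tolist()
    print(method, ranks[method])

# Descending ranks: the largest value gets rank 1.
desc_ranks = obj.rank(ascending=False, method='min').tolist()
print(desc_ranks)
```

With the default na_option='keep', the NaN entry keeps a NaN rank under every method.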
