###### Date: 2021-06-08 10:09

School of Engineering

Laboratory Session 2

Course : Diploma in Engineering with Business

Module : EGM271 Statistics & Data Analytics

Title : The NumPy Library (Part 1)

What is NumPy?

NumPy is a fundamental package for scientific computing with Python, and especially for data analysis. It is the basis of many mathematical and scientific Python packages, among them the pandas library, as you will see later. The pandas library, which specializes in data analysis, is built entirely on the concepts introduced by NumPy. The built-in tools of the standard Python library are often too simple or inadequate for most data analysis calculations. Knowledge of NumPy is therefore important for using scientific Python packages, and in particular for using and understanding pandas.

Note that throughout the lab sheets in this module, we use Spyder:

1. Lines starting with “In:” are code you type in the console. Run them by pressing “Enter”.

2. Lines starting with “Out:” are the output shown in the console.

3. Lines without “In:” or “Out:” are code you type in the editor. Click the Run button to run it.

To start using NumPy, type this in your Spyder editor and run the program:

import numpy as np

Ndarray: The Heart of the Library

The NumPy library is based on one main object: ndarray (which stands for N-dimensional array). This object is a multidimensional homogeneous array with a predetermined number of items:

• Homogeneous because all the items in it are of the same type and the same size. The data type is specified by another NumPy object called dtype (data-type); each ndarray is associated with exactly one dtype.

• The number of dimensions and items in an array is defined by its shape, a tuple of N positive integers that specifies the size of each dimension. The dimensions are called axes, and the number of axes is called the rank.

• Another peculiarity of NumPy arrays is that their size is fixed: once you define the size at creation time, it remains unchanged. This behaviour differs from Python lists, which can grow or shrink.
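As a small illustration of the points above, the sketch below shows that a list mixing integers and a float becomes an array with a single dtype, and that an ndarray, unlike a list, has no append() method for growing in place. The variable names py_list and arr are illustrative and not part of the lab sheet.

```python
import numpy as np

# A Python list may mix types and can grow in place.
py_list = [1, 2.5, 3]
py_list.append(4)
print(len(py_list))            # 4

# Converting mixed numbers gives ONE common dtype for every item.
arr = np.array([1, 2.5, 3])
print(arr.dtype)               # float64

# An ndarray has a fixed size: there is no in-place append() method.
print(hasattr(arr, "append"))  # False
print(arr.shape)               # (3,)
```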

The easiest way to define a new ndarray is to use the array() function, passing a Python

list containing the elements to be included in it as an argument.

In: a = np.array([1, 2, 3])

In: a

Out: array([1, 2, 3])

You can easily check that a newly created object is an ndarray by passing the new

variable to the type() function.

In: type(a)

Out: numpy.ndarray

You can also refer to the Variable Explorer for information about your array:

The name is a and the data type is int32. The Size (3,) indicates that the array has a single axis (rank 1) containing 3 elements.

What you have just seen is the simplest case: a one-dimensional array. Arrays can easily be extended to several dimensions. For example, you can define a two-dimensional 2x2 array:

In: b = np.array([[1.3, 2.4],[0.3, 4.1]])

This array has rank 2, since it has two axes: 2 rows and 2 columns.
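The rank, shape and dtype described above can also be read directly from the array attributes ndim, shape, size and dtype. The following is a minimal sketch that reuses the arrays a and b from the examples above.

```python
import numpy as np

a = np.array([1, 2, 3])                  # one-dimensional
b = np.array([[1.3, 2.4], [0.3, 4.1]])   # two-dimensional

for name, arr in (("a", a), ("b", b)):
    # ndim  = number of axes (the rank)
    # shape = size along each axis
    # size  = total number of items
    print(name, arr.ndim, arr.shape, arr.size, arr.dtype)
```

Here a reports one axis with shape (3,), while b reports two axes with shape (2, 2).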

Types of Data

So far you have seen only simple integer and float numeric values, but NumPy arrays are

designed to contain a wide variety of data types. For example, you can use the data type

string:

In: g = np.array([['a', 'b'],['c', 'd']])

In: g

Out: array([['a', 'b'],['c', 'd']], dtype='<U1')

You can search the Internet for the full list of data types supported by NumPy, but in this module we mostly use integers, floats and strings.

The dtype Option

Each ndarray object is associated with a dtype object that uniquely defines the type of data occupying each item in the array. By default, the array() function chooses the most suitable type according to the values contained in the sequence of lists or tuples. You can also define the data type explicitly by passing the dtype option as an argument of the function.

For example, if you want to define an array with complex values, you can use the dtype option as follows:

In: f = np.array([[1, 2, 3],[4, 5, 6]], dtype=complex)

In: f

Out: array([[1.+0.j, 2.+0.j, 3.+0.j],

[4.+0.j, 5.+0.j, 6.+0.j]])
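To compare the inferred type with an explicitly requested one, the sketch below inspects the dtype of a few small arrays. The variable names are illustrative; the exact integer width (int32 or int64) depends on the platform, so only the kind of type is checked.

```python
import numpy as np

ints = np.array([1, 2, 3])                  # type inferred from the values
floats = np.array([1, 2, 3], dtype=float)   # same values, stored as floats
text = np.array(["a", "b"])                 # strings

print(ints.dtype.kind)    # 'i': signed integer (width depends on platform)
print(floats.dtype)       # float64
print(text.dtype.kind)    # 'U': Unicode string
```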

Intrinsic Creation of an Array

The NumPy library provides a set of functions that generate ndarrays with initial content,

created with different values depending on the function. They allow a single line of code

to generate large amounts of data.

The zeros() function, for example, creates an array full of zeros with dimensions defined by the shape argument. To create a two-dimensional 3x3 array, you can use:

In: np.zeros((3, 3))

Out: array([[0., 0., 0.],

[0., 0., 0.],

[0., 0., 0.]])

The ones() function creates an array full of ones in a very similar way.

In: np.ones((3, 3))

Out: array([[1., 1., 1.],

[1., 1., 1.],

[1., 1., 1.]])
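Both functions return floating-point values by default, as the trailing dots in the output show. A short sketch, assuming you want integers instead, passes the dtype option described earlier:

```python
import numpy as np

z = np.zeros((2, 3))             # default dtype is float64
o = np.ones((2, 3), dtype=int)   # ask for integers instead

print(z.dtype, z.shape)          # float64 (2, 3)
print(o)                         # a 2x3 array of integer ones
```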

A function that will be particularly useful is arange(). It generates NumPy arrays containing numerical sequences that are determined by the arguments passed. Some examples are shown below:

In: np.arange(0, 10)

Out: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In: np.arange(4, 10)

Out: array([4, 5, 6, 7, 8, 9])

In: np.arange(0, 12, 3)

Out: array([0, 3, 6, 9])

In: np.arange(0, 6, 0.6)

Out: array([0. , 0.6, 1.2, 1.8, 2.4, 3. , 3.6, 4.2, 4.8, 5.4])

Looking at the examples above, can you explain what each of the arguments in arange() means?

To generate two-dimensional arrays you can still use the arange() function, combined with the reshape() function. reshape() divides a linear array into parts in the manner specified by the shape argument. For example, to generate a 3x4 two-dimensional array:

In: np.arange(0, 12).reshape(3, 4)

Out: array([[ 0, 1, 2, 3],

[ 4, 5, 6, 7],

[ 8, 9, 10, 11]])
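reshape() only works when the new shape holds exactly the same number of items as the original array. The sketch below illustrates this, and also uses -1 as one of the sizes, which asks NumPy to work out that size itself. The variable names are illustrative.

```python
import numpy as np

seq = np.arange(0, 12)       # 12 items

m = seq.reshape(3, 4)        # 3 * 4 == 12, so this works
n = seq.reshape(2, -1)       # -1 lets NumPy infer the missing size (6)
print(m.shape, n.shape)      # (3, 4) (2, 6)

try:
    seq.reshape(5, 3)        # 5 * 3 != 12, so NumPy raises ValueError
except ValueError as err:
    print("reshape failed:", err)
```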

Another function very similar to arange() is linspace(). It still takes the initial and end values of the sequence as its first two arguments, but the third argument, instead of specifying the distance between one element and the next, defines the number of evenly spaced elements to generate. Unlike arange(), linspace() includes the end value by default.

In: np.linspace(0,10,5)

Out: array([ 0. , 2.5, 5. , 7.5, 10. ])
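To make the difference between the two functions concrete, the following sketch builds a similar sequence both ways. The names by_step and by_count are illustrative.

```python
import numpy as np

by_step = np.arange(0, 10, 2.5)    # step of 2.5, end value excluded
by_count = np.linspace(0, 10, 5)   # 5 points, end value included

print(len(by_step), by_step[-1])     # 4 values, the last one is 7.5
print(len(by_count), by_count[-1])   # 5 values, the last one is 10.0
```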

Finally, another way to obtain arrays that already contain values is to fill them with random values. This is possible using the random() function of the numpy.random module. This function generates an array with as many elements as specified in the argument.

In: np.random.random(3)

Out: array([0.63016916, 0.65787052, 0.15795233])

In: np.random.random((3,3))

Out:

array([[0.32913055, 0.84398299, 0.97548948],

[0.46833406, 0.63764554, 0.38420612],

[0.4752704 , 0.90156781, 0.57395037]])
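Because the values are random, your output will differ from the one shown above. If you need repeatable results, one option is to fix the seed with np.random.seed() before generating the values. The sketch below assumes an arbitrary seed of 42; any integer would do.

```python
import numpy as np

np.random.seed(42)                  # fix the seed so runs are repeatable
first = np.random.random((2, 3))

np.random.seed(42)                  # same seed again ...
second = np.random.random((2, 3))   # ... gives the same values

print(np.array_equal(first, second))            # True
print(first.min() >= 0.0, first.max() < 1.0)    # True True
```

The second line of output reflects that random() draws values from the interval from 0 (included) to 1 (excluded).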

Indexing

Array indexing always uses square brackets ([ ]), so that elements can be referred to individually for various uses, such as extracting a value, selecting items, or assigning a new value. When you create a new array, an appropriate index scale is also created automatically (see Fig. 1).

Fig. 1: Indexing a 1d ndarray

In order to access a single element of an array, you can refer to its index.

In: a = np.arange(10, 16)

In: a

Out: array([10, 11, 12, 13, 14, 15])

In: a[4]

Out: 14

In: a[-1]

Out: 15

In: a[0]

Out: 10

In: a[-6]

Out: 10

To select multiple items at once, you can pass an array of indexes in square brackets.

In: a[[1, 3, 4]]

Out: array([11, 13, 14])

Moving on to the two-dimensional case, matrices are represented as rectangular arrays consisting of rows and columns. Indexing in this case uses a pair of values: the first is the index of the row and the second is the index of the column. Therefore, to access values or select elements in the matrix, you still use square brackets, but this time with two values: [row index, column index] (see Fig. 2).

Fig. 2: Indexing a 2d array

In: A = np.arange(10, 19).reshape((3, 3))

In: A

Out: array([[10, 11, 12],

[13, 14, 15],

[16, 17, 18]])

In: A[1, 2]

Out: 15
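The same [row index, column index] syntax also works with negative indexes and for assigning a new value. A minimal sketch using the matrix A from the example above:

```python
import numpy as np

A = np.arange(10, 19).reshape((3, 3))

print(A[0, 0])     # 10: first row, first column
print(A[-1, -1])   # 18: last row, last column
print(A[2])        # [16 17 18]: a single index selects a whole row

A[1, 2] = 99       # assignment uses the same [row, column] syntax
print(A[1, 2])     # 99
```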

Slicing

Slicing allows you to extract portions of an array to generate new arrays. To do so you use the slice syntax: a sequence of numbers separated by colons (:) within square brackets. For example, to extract the portion that goes from the second to the fifth element, you insert the index of the starting element, that is 1, and the stop index, that is 5, separated by a colon (:). The element at the stop index is not included.

In: a = np.arange(10, 16)

In: a

Out: array([10, 11, 12, 13, 14, 15])

In: a[1:5]

Out: array([11, 12, 13, 14])

You can use a third number that defines the step between the selected elements. For example, with a value of 2, the slice takes every second element.

In: a[1:5:2]

Out: array([11, 13])

If you omit the first number, NumPy interprets it as 0 (i.e., the initial element of the array). If you omit the second number, it is interpreted as the length of the array, so the slice runs up to and including the last element. If you omit the third number, it is interpreted as 1, so every element in the range is taken.

In: a[::2]

Out: array([10, 12, 14])

In: a[:5:2]

Out: array([10, 12, 14])

In: a[:5:]

Out: array([10, 11, 12, 13, 14])
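The default values can be checked by comparing each short form with the fully written slice. This is only a sketch to confirm the rules stated above, using the same array a:

```python
import numpy as np

a = np.arange(10, 16)   # array([10, 11, 12, 13, 14, 15])
n = len(a)              # 6

# Each short form is equal to the fully written slice.
print(np.array_equal(a[::2], a[0:n:2]))    # True
print(np.array_equal(a[:5:2], a[0:5:2]))   # True
print(np.array_equal(a[:5:], a[0:5:1]))    # True
print(np.array_equal(a[2:], a[2:n:1]))     # True
```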

In the case of a two-dimensional array, the slicing syntax still applies, but it is separately

defined for the rows and columns. For example, if you want to extract only the first row:

In: A = np.arange(10, 19).reshape((3, 3))

In: A

Out: array([[10, 11, 12],

[13, 14, 15],

[16, 17, 18]])

In: A[0,:]

Out: array([10, 11, 12])

As you can see, if you leave only the colon in the second index without defining a number, you select all the columns. If instead you want to extract all the values of the first column, you swap the two: a colon for the rows and 0 for the column.

In: A[:,0]

Out: array([10, 13, 16])

If you want to extract a smaller matrix, you need to explicitly define the intervals for both the rows and the columns.

In: A[0:2, 0:2]

Out: array([[10, 11],

[13, 14]])

If the indexes of the rows or columns to be extracted are not contiguous, you can specify

an array of indexes.

In: A[[0,2], 0:2]

Out: array([[10, 11],

[16, 17]])
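The same idea applies to columns, and a slice with an omitted number can be combined with a single row index. The following sketch, with illustrative variable names, selects the first and last columns and then the tail of the last row:

```python
import numpy as np

A = np.arange(10, 19).reshape((3, 3))

first_last_cols = A[:, [0, 2]]   # all rows, columns 0 and 2
print(first_last_cols)

last_row_tail = A[2, 1:]         # row 2, from column 1 to the end
print(last_row_tail)
```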

Iterating an Array

To iterate over the elements of a 1d array, we just need to use the for construct:

import numpy as np

arr = np.array([1, 2, 3])

for x in arr:
    print(x)

To iterate over the elements of a 2d array, we use nested for constructs:

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

for x in arr:
    for y in x:
        print(y)
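In the nested loop above, the outer loop yields one row at a time. As a hedged alternative that is not part of the original lab sheet, the flat attribute of an ndarray visits every item in row order with a single loop:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Iterating over a 2d array yields one row (a 1d array) at a time.
rows = [row.tolist() for row in arr]
print(rows)    # [[1, 2, 3], [4, 5, 6]]

# arr.flat visits every item in row order without a nested loop.
items = [int(x) for x in arr.flat]
print(items)   # [1, 2, 3, 4, 5, 6]
```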

If you want to apply an aggregate function that returns a value calculated for every column or for every row, there is a convenient way that leaves the iteration to NumPy: the apply_along_axis() function. This function takes three arguments: the aggregate function, the axis along which to iterate, and the array. If axis equals 0, the function is evaluated column by column, whereas if axis equals 1 it is evaluated row by row. For example, you can calculate the average values first by column and then by row:

In: A = np.arange(10, 19).reshape((3, 3))

In: A

Out: array([[10, 11, 12],

[13, 14, 15],

[16, 17, 18]])

In: np.apply_along_axis(np.mean, axis=0, arr=A)

Out: array([13., 14., 15.])

In: np.apply_along_axis(np.mean, axis=1, arr=A)

Out: array([11., 14., 17.])
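For common aggregates such as the mean, the same result can usually be obtained by passing the axis argument directly to the function. The sketch below compares the two approaches on the same matrix A; col_means and row_means are illustrative names.

```python
import numpy as np

A = np.arange(10, 19).reshape((3, 3))

col_means = np.apply_along_axis(np.mean, axis=0, arr=A)
row_means = np.apply_along_axis(np.mean, axis=1, arr=A)

# Many aggregate functions also accept the axis argument directly.
print(np.array_equal(col_means, np.mean(A, axis=0)))   # True
print(np.array_equal(row_means, A.mean(axis=1)))       # True
```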

Besides built-in NumPy functions such as np.mean, the apply_along_axis() function also accepts user-defined functions.

In: def foo(x):
        return x/2

In: np.apply_along_axis(foo, axis=1, arr=A)

Out: array([[5. , 5.5, 6. ],

[6.5, 7. , 7.5],

[8. , 8.5, 9. ]])
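The function foo above returns an array of the same length as its input, so the result keeps the shape of A. A user-defined function can also return a single value per row or column. The sketch below uses a hypothetical helper, value_range, and a small example matrix chosen only for illustration.

```python
import numpy as np

def value_range(x):
    """Hypothetical helper: largest minus smallest value of a 1d array."""
    return x.max() - x.min()

A = np.array([[1, 5, 2],
              [7, 3, 9]])

per_row = np.apply_along_axis(value_range, axis=1, arr=A)
per_col = np.apply_along_axis(value_range, axis=0, arr=A)

print(per_row)   # [4 6]
print(per_col)   # [6 2 7]
```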

Exercise

a) Write a NumPy program to convert a list of numeric values into a one-dimensional NumPy array.

Expected Output:

Out: Original List: [12.23, 13.32, 100, 36.32]

Out: One-dimensional NumPy array: [ 12.23 13.32 100. 36.32]

b) Write a NumPy program to create a 3x3 matrix with values ranging from 2 to 10.

c) Write a NumPy program to create a null vector of size 10 and update the sixth value to 11.

d) Write a NumPy program to create an array with values ranging from 12 to 37, then

reverse the array (first element becomes last).

Expected Output:

Original array:

[12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32

33 34 35 36 37]

Reverse array:

[37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17

16 15 14 13 12]

e) Write a NumPy program to create a 5x5 array with 1 on the border and 0 inside.

Expected Output:

Original array:

[[1. 1. 1. 1. 1.]

[1. 1. 1. 1. 1.]

[1. 1. 1. 1. 1.]

[1. 1. 1. 1. 1.]

[1. 1. 1. 1. 1.]]

1 on the border and 0 inside in the array

[[1. 1. 1. 1. 1.]

[1. 0. 0. 0. 1.]

[1. 0. 0. 0. 1.]

[1. 0. 0. 0. 1.]

[1. 1. 1. 1. 1.]]

f) Write a NumPy program to append values to the end of an array.

Expected Output:

Original array:

[10, 20, 30]

After appending values to the end of the array:

[10 20 30 40 50 60 70 80 90]

g) Create a 5x2 integer array from the range 100 to 200 such that the difference between consecutive elements is 10.

Expected Output:

Creating 5X2 array using np.arange

[[100 110]

[120 130]

[140 150]

[160 170]

[180 190]]

h) Print the array of items in the third column from all rows of the input array.

Expected Output:

Printing Input Array

[[11 22 33]

[44 55 66]

[77 88 99]]

Printing array of items in the third column from all rows

[33 66 99]

i) Return the array of odd rows and even columns from the NumPy array below.

Expected Output:

Printing Input Array

[[ 3 6 9 12]

[15 18 21 24]

[27 30 33 36]

[39 42 45 48]

[51 54 57 60]]

Printing array of odd rows and even columns

[[ 6 12]

[30 36]

[54 60]]

j) Create an 8x3 integer array from the range 10 to 34 such that the difference between consecutive elements is 1, and then split the array into four equal-sized sub-arrays.

(Hint: use the np.split function.)

Expected Output:

Creating 8X3 array using numpy.arange

[[10 11 12]

[13 14 15]

[16 17 18]

[19 20 21]

[22 23 24]

[25 26 27]

[28 29 30]

[31 32 33]]

Dividing 8X3 array into 4 sub array

[array([[10, 11, 12],[13, 14, 15]]),

array([[16, 17, 18],[19, 20, 21]]),

array([[22, 23, 24],[25, 26, 27]]),

array([[28, 29, 30],[31, 32, 33]])]

End of Lab 2