Python arrays. The numpy library

Алан-э-Дейл       29.03.2024

NumPy – Using Matplotlib

NumPy has a numpy.histogram() function that computes the frequency distribution of data. A histogram is the graphical representation of that distribution: rectangles of equal width, each corresponding to a class interval called a bin, with heights corresponding to the frequencies.

numpy.histogram()

The numpy.histogram() function takes the input array and bins as two parameters. The successive elements in the bins array act as the boundaries of each bin.
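As a minimal sketch of how the bin edges partition the data, the snippet below runs numpy.histogram() on made-up values. The names values and edges and the numbers themselves are assumptions chosen for illustration, not the data of the original example.

```python
import numpy as np

# Made-up sample data, not taken from the original example
values = np.array([1, 2, 2, 3, 5, 7, 8, 8, 9])

# Four edges define three bins: [0, 3), [3, 6) and [6, 10]
edges = [0, 3, 6, 10]

hist, bins = np.histogram(values, bins=edges)

print(hist)  # how many values fall into each bin
print(bins)  # the bin edges, returned as an array
```

Every bin is half-open except the last one, which also includes its right edge, so the counts always add up to the number of values that lie inside the overall range.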

import numpy as np 

a = np.array() 

np.histogram(a,bins = ) 

hist,bins = np.histogram(a,bins = ) 

print(hist)

print(bins)

It will produce the following output −

plt.hist()

Matplotlib can convert this numeric representation of a histogram into a graph. The hist() function of the pyplot submodule takes the array containing the data and the bin array as parameters and draws a histogram.

from matplotlib import pyplot as plt 

import numpy as np  

a = np.array() 

plt.hist(a, bins = ) 

plt.title("histogram")

plt.show()

It should produce the following output –

I/O with NumPy

The ndarray objects can be saved to and loaded from the disk files. The IO functions available are −

  • load() and save() functions handle NumPy binary files (with the .npy extension)
  • loadtxt() and savetxt() functions handle normal text files

NumPy introduces a simple file format for ndarray objects. This .npy file stores data, shape, dtype and other information required to reconstruct the ndarray in a disk file such that the array is correctly retrieved even if the file is on another machine with different architecture.
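The two pairs of functions can be compared with a short round trip. This is a sketch under stated assumptions: the file names and the temporary directory are invented for the example, and the array is arbitrary.

```python
import os
import tempfile

import numpy as np

arr = np.arange(6, dtype=np.int32).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "arr.npy")  # hypothetical file name
    txt_path = os.path.join(tmp, "arr.txt")  # hypothetical file name

    # Binary .npy format: shape and dtype are stored in the file
    np.save(npy_path, arr)
    from_npy = np.load(npy_path)

    # Plain text format: only the printed numbers are stored
    np.savetxt(txt_path, arr)
    from_txt = np.loadtxt(txt_path)

print(from_npy.dtype, from_npy.shape)
print(from_txt.dtype, from_txt.shape)
```

The .npy round trip restores the original dtype, while loadtxt() falls back to its default float type because a text file carries no type information.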

N-Dimensional NumPy Arrays

This doesn’t happen extremely often, but there are cases when you’ll want to work with higher-dimensional arrays. One way to think of this is as a list of lists of lists. Let’s say we want to store the monthly earnings of a store, but we want to be able to quickly look up the results for a quarter, and for a year. The earnings for one year might look like this:

Each entry is what the store earned in one month: January first, then February, and so on. We can split up these earnings by quarter into a list of lists:

We can retrieve the earnings from January by indexing the first quarter and then the first month. If we want the results for a whole quarter, we can index the outer list alone. We now have a 2-dimensional array, or matrix. But what if we now want to add the results from another year? We have to add a third dimension:

We can retrieve the earnings from January of the first year by indexing the year, then the quarter, then the month. We now need three indexes to retrieve a single element. A three-dimensional array in NumPy is much the same. In fact, we can convert the nested list to an array and then get the earnings for January of the first year:

We can also find the shape of the array:

Indexing and slicing work the exact same way with a 3-dimensional array, but now we have an extra axis to pass in. If we wanted to get the earnings for January of all years, we could do this:

If we wanted to get first quarter earnings from both years, we could do this:

Adding more dimensions can make it much easier to query your data if it’s organized in a certain way. As we go from 3-dimensional arrays to 4-dimensional and larger arrays, the same properties apply, and they can be indexed and sliced in the same ways.
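The indexing described in this section can be sketched with a small 3-dimensional array. The earnings figures below are hypothetical placeholders and the name earnings is an assumption; the axes are ordered as year, quarter, month.

```python
import numpy as np

# Hypothetical earnings, laid out as [year][quarter][month]
earnings = np.array([
    [[100, 110, 120], [130, 140, 150], [160, 170, 180], [190, 200, 210]],
    [[200, 210, 220], [230, 240, 250], [260, 270, 280], [290, 300, 310]],
])

print(earnings.shape)     # 2 years, 4 quarters, 3 months per quarter
print(earnings[0, 0, 0])  # January of the first year
print(earnings[:, 0, 0])  # January of every year
print(earnings[:, 0, :])  # first quarter of both years
```

A colon in a position selects everything along that axis, which is why the last two lookups return one value per year and one row per year respectively.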

NumPy Data Types

As we mentioned earlier, each NumPy array can store elements of a single data type. For example, an array of floats contains only float values. NumPy stores values using its own data types, which are distinct from Python’s built-in types. This is because the core of NumPy is written in C, which stores data differently than the Python data types. NumPy data types map between Python and C, allowing us to use NumPy arrays without any conversion hitches.

You can find the data type of a NumPy array by accessing the dtype property:

NumPy has several different data types, which mostly map to Python data types. The NumPy documentation has a full listing of data types, but here are a few important ones:

  • float: numeric floating point data.
  • int: integer data.
  • string: character data.
  • object: Python objects.

Data types additionally end with a suffix that indicates how many bits of memory they take up. So int32 is a 32-bit integer data type, and float64 is a 64-bit float data type.
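A quick way to see the bit suffix in practice is to inspect dtype and itemsize. The arrays below are invented for the illustration.

```python
import numpy as np

floats = np.array([1.5, 2.0, 3.25])               # default float type
small_ints = np.array([1, 2, 3], dtype=np.int32)  # explicitly 32-bit

# itemsize is the size of one element in bytes (bits / 8)
print(floats.dtype, floats.dtype.itemsize)
print(small_ints.dtype, small_ints.dtype.itemsize)
```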

Converting Data Types

You can use the numpy.ndarray.astype method to convert an array to a different type. The method copies the array and returns a new array with the specified data type. For instance, we can convert an array of floats to the int data type:

As you can see above, all of the items in the resulting array are integers. Note that we used the Python int type instead of a NumPy data type when converting. This is because several built-in Python types can be used with NumPy, and are automatically converted to NumPy data types.

We can check the dtype of the resulting array to see what data type NumPy mapped it to:

The array has been converted to a 64-bit integer data type. This allows for very long integer values, but takes up more space in memory than storing the values as 32-bit integers.

If you want more control over how the array is stored in memory, you can directly create NumPy dtype objects like numpy.int32:

You can use these directly to convert between types:
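The conversions described in this section can be sketched as follows. The array prices is a made-up example; note that the width of the integer type produced by the plain Python int depends on the platform and NumPy version, so the sketch does not assume a particular one.

```python
import numpy as np

prices = np.array([7.4, 0.7, 3.51])  # hypothetical float data

# Python's int maps to NumPy's default integer type (width is platform dependent)
as_int = prices.astype(int)

# An explicit NumPy type gives full control over the storage size
as_int32 = prices.astype(np.int32)

print(as_int, as_int.dtype)
print(as_int32, as_int32.dtype)
print(prices.dtype)  # the original array is left untouched
```

Converting floats to integers drops the fractional part rather than rounding to the nearest integer.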

NumPy Array Operations

NumPy makes it simple to perform mathematical operations on arrays. This is one of the primary advantages of NumPy, and makes it quite easy to do computations.

Single Array Math

If you do any of the basic mathematical operations (+, -, *, /, **) with an array and a value, it will apply the operation to each of the elements in the array.

Let’s say we want to add a few points to each quality score because we’re drunk and feeling generous. Here’s how we’d do that:

Note that the above operation won’t change the array; it will return a new 1-dimensional array where the value has been added to each element in the quality column of wines.

If we instead used the += operator, we’d modify the array in place:

All the other operations work the same way. For example, if we want to multiply each quality score by a constant, we could do it like this:
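Here is a minimal sketch of these single-array operations. The small wines array is a hypothetical stand-in for the real data set, with columns assumed to be alcohol and quality, and the constants 10 and 2 are arbitrary.

```python
import numpy as np

# Hypothetical stand-in for the wines data: columns are [alcohol, quality]
wines = np.array([[9.4, 5.0], [9.8, 5.0], [10.5, 7.0]])

bumped = wines[:, 1] + 10           # returns a new array
unchanged = wines[:, 1].copy()      # wines itself still holds the old scores

wines[:, 1] += 10                   # modifies the quality column in place
doubled = wines[:, 1] * 2           # other operators work the same way

print(bumped)
print(unchanged)
print(wines[:, 1])
print(doubled)
```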

Multiple Array Math

It’s also possible to do mathematical operations between arrays. This will apply the operation to pairs of elements. For example, if we add the quality column to itself, here’s what we get:

Note that this is equivalent to multiplying the column by 2, because NumPy adds each pair of elements. The first element in the first array is added to the first element in the second array, the second to the second, and so on.

We can also use this to multiply arrays. Let’s say we want to pick a wine that maximizes alcohol content and quality (we want to get drunk, but we’re classy). We’d multiply the alcohol column by the quality column, and select the wine with the highest score:

All of the common operations (+, -, *, /, **) will work between arrays.
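The pairwise behaviour can be sketched like this, again with a hypothetical three-row wines array whose columns are assumed to be alcohol and quality.

```python
import numpy as np

# Hypothetical stand-in for the wines data: columns are [alcohol, quality]
wines = np.array([[9.4, 5.0], [9.8, 5.0], [10.5, 7.0]])
alcohol = wines[:, 0]
quality = wines[:, 1]

summed = quality + quality   # pairs of elements are added
score = alcohol * quality    # pairs of elements are multiplied
best = int(np.argmax(score)) # row index of the highest score

print(summed)
print(score)
print(best)
```

np.argmax returns the position of the largest value, which is one way to select the winning row once the combined score has been computed.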

Broadcasting

Unless the arrays that you’re operating on are the exact same size, it’s not possible to do elementwise operations. In cases like this, NumPy performs broadcasting to try to match up elements. Essentially, broadcasting involves a few steps:

  • The last dimension of each array is compared.
    • If the dimension lengths are equal, or one of the dimensions is of length 1, then we keep going.
    • If the dimension lengths aren’t equal, and none of the dimensions have length 1, then there’s an error.
  • Continue checking dimensions until the shortest array is out of dimensions.

For example, the following two shapes are compatible:

This is because the trailing dimensions of the two arrays have the same length, so that dimension is okay. The array with fewer dimensions is then out of dimensions, so we’re okay, and the arrays are compatible for mathematical operations.

The following two shapes are also compatible:

The last dimension matches, and one of the arrays is of length 1 in the first dimension.

These two arrays don’t match:

The lengths of the dimensions aren’t equal, and neither array has either dimension length equal to 1.

The NumPy documentation has a detailed explanation of broadcasting, but we’ll go through a few examples to illustrate the principle:

The above example didn’t work because the two arrays don’t have a matching trailing dimension. Here’s an example where the last dimension does match:

As you can see, the smaller array has been broadcast across each row of the larger one. Here’s an example with our data:

Elements of the smaller array are broadcast over each row of the larger one, so the first column has the first value added to it, and so on.
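The compatibility rules can be checked directly by trying a few shapes. The shapes below are assumptions picked for the illustration, not the ones from the original examples.

```python
import numpy as np

grid = np.zeros((3, 4))
row = np.arange(4)                   # shape (4,): trailing dimension matches
column = np.arange(3).reshape(3, 1)  # shape (3, 1): length 1 in the last dimension
mismatch = np.arange(3)              # shape (3,): trailing 3 vs 4, neither is 1

print((grid + row).shape)
print((grid + column).shape)

try:
    grid + mismatch
    failed = False
except ValueError as err:
    failed = True
    print("cannot broadcast:", err)
```

The first two additions succeed because each trailing dimension is either equal or of length 1; the third raises a ValueError because 3 and 4 are neither.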

Numpy dot() product

The dot product multiplies corresponding elements of two arrays and sums the results. In general mathematical terms, the dot product of two vectors equals the product of their magnitudes and the cosine of the angle between them. So, if a and b are two vectors separated by an angle Θ, then

a · b = |a| |b| cos Θ   # general equation of the dot product for two vectors

The dot() function in NumPy does not take the angle Θ as an argument. We just need to give it two matrices or arrays as parameters, and it computes the result from their elements. Thus, we shall implement this in code:

import numpy as np

var_1, var_2 = 34, 45 # for scalar values
dot_product_1 = np.dot(var_1, var_2)
dot_product_1

# for matrices
a = np.array(, , ])
b = np.array(, , ])


dot_product_2 = np.dot(a, b)
dot_product_2

Output:

Output for the mathematical calculations

Code explanation:

  1. Import the NumPy module.
  2. Declare two variables, var_1 and var_2.
  3. Call the np.dot() function with those variables as arguments. Store the result in the dot_product_1 variable.
  4. Then print it on the screen.
  5. For multidimensional arrays, create the arrays using the array() method of NumPy. Then, following the same procedure as above, call dot() and print the result on the screen.
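As a sketch that ties np.dot() back to the geometric formula, the example below computes a dot product, recovers the angle from it, and then multiplies two small matrices. All vectors and matrices here are invented for the illustration.

```python
import numpy as np

# Scalars: np.dot is plain multiplication
scalar_product = np.dot(34, 45)

# Vectors: sum of elementwise products, here 1*1 + 0*1
u = np.array([1.0, 0.0])
v = np.array([1.0, 1.0])
dot_uv = np.dot(u, v)

# Recover the angle from a . b = |a| |b| cos(theta)
cos_theta = dot_uv / (np.linalg.norm(u) * np.linalg.norm(v))
theta_deg = np.degrees(np.arccos(cos_theta))

# 2-D arrays: np.dot performs matrix multiplication
m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[5, 6], [7, 8]])
matrix_product = np.dot(m1, m2)

print(scalar_product, dot_uv, theta_deg)
print(matrix_product)
```

For 2-dimensional inputs the result is the matrix product, not an elementwise product, so the order of the arguments matters.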

NumPy – Mathematical Functions

Quite understandably, NumPy contains a large number of various mathematical operations. NumPy provides standard trigonometric functions, functions for arithmetic operations, handling complex numbers, etc.

Trigonometric Functions

NumPy has standard trigonometric functions which return trigonometric ratios for a given angle in radians.

Example

import numpy as np

# Angles in degrees
a = np.array([0, 30, 45, 60, 90])

print('Sine of different angles:')
# Convert to radians by multiplying with pi/180
print(np.sin(a * np.pi / 180))
print('\n')

print('Cosine values for angles in array:')
print(np.cos(a * np.pi / 180))
print('\n')

print('Tangent values for given angles:')
print(np.tan(a * np.pi / 180))

Here is its output −

Sine of different angles:

Cosine values for angles in array:
[  1.00000000e+00   8.66025404e-01   7.07106781e-01   5.00000000e-01
   6.12323400e-17]

Tangent values for given angles:
[  0.00000000e+00   5.77350269e-01   1.00000000e+00   1.73205081e+00
   1.63312394e+16]

The arcsin, arccos, and arctan functions return the trigonometric inverse of sin, cos, and tan of the given angle. The result of these functions can be verified with the numpy.degrees() function, which converts radians to degrees.
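A minimal round trip shows the verification described above. The angles are an assumed sample; np.radians() is used here as an equivalent of multiplying by pi/180.

```python
import numpy as np

angles = np.array([0, 30, 45, 60, 90])  # degrees

sines = np.sin(np.radians(angles))
inverse = np.arcsin(sines)              # result is in radians
recovered = np.degrees(inverse)         # back to degrees

print(recovered)
```

The recovered values match the original angles up to floating-point rounding, so np.allclose() is the appropriate comparison rather than exact equality.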

Functions for Rounding

numpy.around()

This is a function that returns the value rounded to the desired precision. The function takes the following parameters.

numpy.around(a,decimals)

Where, 

Sr.No.   Parameter   Description
1        a           Input data
2        decimals    The number of decimals to round to. Default is 0. If negative, the value is rounded to a position to the left of the decimal point
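The effect of the decimals parameter can be sketched with a few made-up values:

```python
import numpy as np

a = np.array([1.234, 5.678, 123.456])

rounded = np.around(a)                # decimals defaults to 0
one_place = np.around(a, decimals=1)  # one digit after the decimal point
tens = np.around(a, decimals=-1)      # round to the nearest ten

print(rounded)
print(one_place)
print(tens)
```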

NumPy – Statistical Functions

NumPy has quite a few useful statistical functions for finding the minimum, maximum, percentile, standard deviation, variance, etc. of the elements in an array. The functions are explained as follows −

numpy.amin() and numpy.amax()

These functions return the minimum and the maximum from the elements in the given array along the specified axis.

Example

import numpy as np 

a = np.array(,,]) 

print('Our array is:')
print(a)
print('\n')

print('Applying amin() function:')
print(np.amin(a, 1))
print('\n')

print('Applying amin() function again:')
print(np.amin(a, 0))
print('\n')

print('Applying amax() function:')
print(np.amax(a))
print('\n')

print('Applying amax() function again:')
print(np.amax(a, axis=0))

It will produce the following output −

Our array is:

]

Applying amin() function:

Applying amin() function again:

Applying amax() function:

9

Applying amax() function again:
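The role of the axis argument is easier to see on a concrete array. The 3×3 array below is an assumption made for this sketch, not the array from the example above.

```python
import numpy as np

a = np.array([[4, 9, 2],
              [7, 1, 8],
              [3, 6, 5]])  # hypothetical data

row_min = np.amin(a, 1)       # minimum of each row
col_min = np.amin(a, 0)       # minimum of each column
overall_max = np.amax(a)      # maximum of the whole array
col_max = np.amax(a, axis=0)  # maximum of each column

print(row_min, col_min, overall_max, col_max)
```

Axis 1 collapses the columns and leaves one result per row; axis 0 collapses the rows and leaves one result per column.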

numpy.ptp()

The numpy.ptp() function returns the range (maximum-minimum) of values along an axis.

import numpy as np 

a = np.array(,,]) 

print('Our array is:')
print(a)
print('\n')

print('Applying ptp() function:')
print(np.ptp(a))
print('\n')

print('Applying ptp() function along axis 1:')
print(np.ptp(a, axis=1))
print('\n')

print('Applying ptp() function along axis 0:')
print(np.ptp(a, axis=0))
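A short sketch on a hypothetical 3×3 array shows that ptp() is simply the maximum minus the minimum along the chosen axis:

```python
import numpy as np

a = np.array([[4, 9, 2],
              [7, 1, 8],
              [3, 6, 5]])  # hypothetical data

overall = np.ptp(a)           # 9 - 1 over the whole array
per_row = np.ptp(a, axis=1)   # range of each row
per_col = np.ptp(a, axis=0)   # range of each column

print(overall, per_row, per_col)
```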

numpy.percentile()

A percentile (or centile) is a measure used in statistics indicating the value below which a given percentage of observations in a group of observations fall. The function numpy.percentile() takes the following arguments.

numpy.percentile(a, q, axis)

Where,

Sr.No.   Argument   Description
1        a          Input array
2        q          The percentile to compute; must be between 0 and 100
3        axis       The axis along which the percentile is to be calculated
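The three arguments can be tried on a small made-up array. With q set to 50 the function returns the median.

```python
import numpy as np

a = np.array([[10, 20, 30],
              [40, 50, 60]])  # hypothetical data

median_all = np.percentile(a, 50)           # over the flattened array
median_cols = np.percentile(a, 50, axis=0)  # one value per column
median_rows = np.percentile(a, 50, axis=1)  # one value per row

print(median_all, median_cols, median_rows)
```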

A variety of sorting-related functions are available in NumPy. These sorting functions implement different sorting algorithms, each characterized by its speed of execution, worst-case performance, required workspace, and stability. The following table compares three sorting algorithms.

kind          speed   worst case     work space   stable
‘quicksort’   1       O(n^2)         0            no
‘mergesort’   2       O(n*log(n))    ~n/2         yes
‘heapsort’    3       O(n*log(n))    0            no

numpy.sort()

The sort() function returns a sorted copy of the input array. It has the following parameters −

numpy.sort(a, axis, kind, order)

Where,

Sr.No.   Parameter   Description
1        a           Array to be sorted
2        axis        The axis along which the array is to be sorted. If None, the array is flattened before sorting; the default sorts along the last axis
3        kind        Default is quicksort
4        order       If the array contains fields, the order of fields to be sorted
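The parameters of numpy.sort() can be sketched as follows. The arrays, the field names name and age, and the records themselves are assumptions invented for the illustration.

```python
import numpy as np

a = np.array([[3, 7],
              [9, 1]])

by_last_axis = np.sort(a)          # default: sort each row
by_first_axis = np.sort(a, axis=0) # sort each column
flattened = np.sort(a, axis=None)  # flatten, then sort
stable = np.sort(a, kind='mergesort')

# A structured array with named fields, sorted by one field
dt = np.dtype([('name', 'U10'), ('age', int)])
people = np.array([('carol', 31), ('alice', 25), ('bob', 19)], dtype=dt)
by_age = np.sort(people, order='age')

print(by_last_axis)
print(by_first_axis)
print(flattened)
print(by_age['name'])
```

np.sort() always returns a sorted copy, so the input arrays keep their original order.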

Preface

Python is becoming more and more popular. Many people start learning the language and run into the NumPy library along the way (short for Numeric Python; it is impossible to learn Python without knowing about this library). It contains a great many functions and submodules, so remembering everything is simply impossible. Today we will write down examples of various basic constructs built around NumPy’s main class, np.ndarray.

NumPy arrays are used in many places: SciPy, matplotlib, Pandas (and other libraries that make up a data scientist’s core toolkit), so being fluent with them is very important for a Python developer.

Jupyter Notebook is used as the development environment.

NumPy – Broadcasting

The term broadcasting refers to the ability of NumPy to treat arrays of different shapes during arithmetic operations. Arithmetic operations on arrays are usually done on corresponding elements. If two arrays are of exactly the same shape, then these operations are smoothly performed.

Example 1

import numpy as np 

a = np.array() 

b = np.array() 

c = a * b 

print(c)

Its output is as follows −

If the dimensions of the two arrays are dissimilar, element-to-element operations are not possible. However, operations on arrays of non-similar shapes are still possible in NumPy because of its broadcasting capability. The smaller array is broadcast to the size of the larger array so that they have compatible shapes.

Broadcasting is possible if the following rules are satisfied −

  • Array with smaller ndim than the other is prepended with ‘1’ in its shape.
  • Size in each dimension of the output shape is maximum of the input sizes in that dimension.
  • An input can be used in calculation if its size in a particular dimension matches the output size or its value is exactly 1.
  • If an input has a dimension size of 1, the first data entry in that dimension is used for all calculations along that dimension.

A set of arrays is said to be broadcastable if the above rules produce a valid result and one of the following is true −

  • Arrays have exactly the same shape.
  • Arrays have the same number of dimensions and the length of each dimension is either a common length or 1.
  • Array having too few dimensions can have its shape prepended with a dimension of length 1, so that the above stated property is true.

The following figure demonstrates how array b is broadcast to become compatible with a.
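In place of the figure, the stretching of b can be sketched in code. The arrays below are hypothetical; np.broadcast_to() is used only to make the implicit step visible, since the addition performs it automatically.

```python
import numpy as np

a = np.array([[0, 0, 0],
              [10, 10, 10],
              [20, 20, 20],
              [30, 30, 30]])  # shape (4, 3)
b = np.array([1, 2, 3])       # shape (3,), treated as (1, 3)

# What broadcasting does conceptually: repeat b along the first axis
stretched = np.broadcast_to(b, a.shape)

total = a + b

print(stretched)
print(total)
```

No data is actually copied when b is broadcast; NumPy reuses the same three values for every row.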

Welcome to NumPy!

NumPy (Numerical Python) is an open-source Python library that is used in almost every field of science and engineering. It is the universal standard for working with numerical data in Python, and it is at the core of the scientific Python and PyData ecosystems. NumPy users include everyone from beginning coders to experienced researchers doing state-of-the-art scientific and industrial research and development. The NumPy API is used extensively in Pandas, SciPy, Matplotlib, scikit-learn, scikit-image and most other data science and scientific Python packages.

The NumPy library contains multidimensional array and matrix data structures (you will find more information about this in later sections). It provides ndarray, a homogeneous n-dimensional array object, with methods to operate on it efficiently. NumPy can be used to perform a wide variety of mathematical operations on arrays. It adds powerful data structures to Python that guarantee efficient calculations with arrays and matrices, and it supplies an enormous library of high-level mathematical functions that operate on these arrays and matrices.

Learn more about NumPy here!


Installing NumPy

To install NumPy, I strongly recommend using a scientific Python distribution. If you need the full instructions for installing NumPy on your operating system, you can find all the details here.

If you already have Python, you can install NumPy with

conda install numpy

or

pip install numpy

If you don’t have Python yet, you might want to consider using Anaconda. It is the easiest way to get started. The good thing about this distribution is that you don’t need to worry too much about separately installing NumPy or any of the major packages you will be using for data analysis, such as pandas, Scikit-Learn, etc.

If you need more detailed instructions for installation, you can find all of the installation information at scipy.org.


If you run into problems installing Anaconda, you can consult this article:

How to import NumPy

Any time you want to use a package or library in your code, you first need to make it accessible.

In order to start using NumPy and all of the functions available in NumPy, you need to import it. This can be easily done with this import statement:

import numpy as np

(We shorten "numpy" to "np" in order to save time and also to keep code standardized, so that anyone working with your code can easily understand and run it.)

What is the difference between a Python list and a NumPy array?

NumPy gives you an enormous range of fast and efficient numerical options. While a Python list can contain different data types within a single list, all of the elements in a NumPy array should be homogeneous. The mathematical operations that are meant to be performed on arrays would not be possible if the arrays were not homogeneous.
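A small sketch of the difference, using a made-up list: converting it to an array forces every element to a single type, and arithmetic then applies elementwise instead of repeating the list.

```python
import numpy as np

mixed = [1, 2.5, 3]    # a Python list may mix int and float
arr = np.array(mixed)  # the array stores everything as one type

print(arr.dtype)       # a single dtype for all elements
print(arr * 2)         # elementwise multiplication
print(mixed * 2)       # list repetition, not arithmetic
```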

Why use NumPy?


NumPy arrays are faster and more compact than Python lists. An array consumes less memory and is far more convenient to use. NumPy uses much less memory to store data, and it provides a mechanism for specifying data types, which allows the code to be optimized even further.

What is an array?

An array is the central data structure of the NumPy library. It is a grid of values, and it contains information about the raw data, how to locate an element, and how to interpret an element. It has a grid of elements that can be indexed in various ways. The elements are all of the same type, referred to as the array dtype (data type).

An array can be indexed by a tuple of nonnegative integers, by booleans, by another array, or by integers. The rank of the array is the number of dimensions. The shape of the array is a tuple of integers giving the size of the array along each dimension.
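The terms rank, shape and dtype, along with the indexing styles listed above, can be sketched on a hypothetical 3×4 array:

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])  # hypothetical data

print(a.ndim)          # rank: the number of dimensions
print(a.shape)         # size along each dimension
print(a.dtype)         # one data type for every element

first_row = a[0]           # a single integer index
element = a[0, 1]          # a tuple of nonnegative integers
large = a[a > 10]          # a boolean mask
picked = a[[0, 2]]         # another array (or list) of indices

print(first_row, element, large, picked)
```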

One way to initialize NumPy arrays is from nested Python lists.

a = np.array(, , ])

We can access the elements in the array using square brackets. When you access elements, remember that indexing in NumPy starts at 0. That means that if you want to access the first element in your array, you will be accessing element "0".

print(a)

Output:

Some Other Child Modules Error

Numpy has many other child libraries which can be installed externally. All of these libraries look like a part of numpy, but they need to be installed separately. Following are some of the examples –

No module named numpy.core._multiarray_umath

This error can be resolved by upgrading your numpy version with the pip command. Other libraries like TensorFlow and scikit-learn depend on newer APIs inside the module, which is why your installation needs to be updated.

No module named numpy.testing.nosetester

Run the following commands in your terminal to resolve this error –

pip install numpy==1.18
pip install scipy==1.1.0
pip install scikit-learn==0.21.3

No module named numpy.distutils._msvccompiler

Use Python version 3.7 to solve this error. The newer versions 3.8 and 3.9 are currently unsupported in some of the numpy methods.

1.4.1.4. Basic visualization

Now that we have our first data arrays, we are going to visualize them.

Start by launching IPython:

$ ipython # or ipython3 depending on your install

Or the notebook:

$ jupyter notebook

Once IPython has started, enable interactive plots:

>>> %matplotlib  

Or, from the notebook, enable plots in the notebook:

>>> %matplotlib inline 

The inline is important for the notebook, so that plots are displayed in the notebook and not in a new window.

Matplotlib is a 2D plotting package. We can import its functions as below:

>>> import matplotlib.pyplot as plt  # the tidy way

And then use them (note that you have to call plt.show() explicitly if you have not enabled interactive plots with %matplotlib):

>>> plt.plot(x, y)       # line plot    
>>> plt.show()           # <-- shows the plot (not needed with interactive plots) 

Or, if you have enabled interactive plots with %matplotlib:

>>> plt.plot(x, y)       # line plot    

1D plotting:

>>> x = np.linspace(, 3, 20)
>>> y = np.linspace(, 9, 20)
>>> plt.plot(x, y)       # line plot    

>>> plt.plot(x, y, 'o')  # dot plot    

2D arrays (such as images):

>>> image = np.random.rand(30, 30)
>>> plt.imshow(image, cmap=plt.cm.hot)    
<matplotlib.image.AxesImage object at ...>
>>> plt.colorbar()    
<matplotlib.colorbar.Colorbar object at ...>


NumPy exponential FAQ

Let’s quickly cover some frequently asked questions about the NumPy exponential function.

Frequently asked questions:

np.exp calculates e^x for each x in an array of input values, where e is Euler’s number: 2.71828 …


What’s the difference between math.exp and numpy.exp?

Essentially, the math.exp function only works on scalar values, whereas np.exp can operate on arrays of values.

So you can use math.exp on a single number:

#THIS WORKS!
import math
math.exp(2)

But you can not use math.exp on an array-like object:

#THIS THROWS AN ERROR
import math
math.exp()

And as you saw earlier in this tutorial, the np.exp function works with both scalars and arrays.

Essentially, np.exp is more flexible than math.exp.
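The contrast can be run end to end in one short sketch. The input list is an arbitrary example.

```python
import math

import numpy as np

print(math.exp(2))          # scalar in, scalar out

try:
    math.exp([0, 1, 2])     # math.exp rejects a list
    raised = False
except TypeError as err:
    raised = True
    print("math.exp failed:", err)

powers = np.exp([0, 1, 2])  # e**x for every element
print(powers)
print(np.exp(2))            # np.exp accepts scalars as well
```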

WTF is Euler’s Number?

The short answer: Euler’s number (AKA, e) is an interesting and important mathematical constant. It’s a number!

The value of e is roughly 2.7182818284590452353602874713527.

The long answer: it’s complicated.

How exactly we arrive at this constant and what it’s good for is sort of a long answer, and beyond the scope of this blog post.

At a high level though, e is a very important number in mathematics. It shows up all over the place in math, physics, engineering, economics, and just about any place that deals with exponential growth, compounded growth, and calculus.

For more info, check out this Youtube video.

Further Reading

You should now have a good grasp of NumPy, and how to apply it to a data set.

If you want to dive into more depth, here are some resources that may be helpful:

  • NumPy Quickstart — has good code examples and covers most basic NumPy functionality.
  • — a great tutorial on NumPy and other Python libraries.
  • Visual NumPy Introduction — a guide that uses the game of life to illustrate NumPy concepts.

In our next tutorial, we dive more into Pandas, a library that builds on NumPy and makes data analysis even easier. It solves two of the biggest pain points which are that:

  • You can’t mix multiple data types in an array.
  • You have to remember what type of data each column contains.


Forum guest
From: admin

This thread is closed to new replies.