Reading Excel spreadsheets with Python and xlrd

Contents

Openpyxl images
Pandas read_excel Function
- Important Pandas read_excel Options
Python Reading Excel Files Tutorial
Introduction
Openpyxl read multiple cells
Pivot Table in pandas
Excel Object Model
Exporting the results to Excel
Sheets
Inserting and Deleting rows and columns
Exploring the data
Using pyexcel
Openpyxl read cell
How to Read Multiple Excel (xlsx) Files in Python
Read Cells from Multiple Rows or Columns
Hidden feature: partial read
- Why did I not see the above benefit?

Openpyxl images

In the following example, we show how to insert an image into a sheet.

write_image.py

#!/usr/bin/env python

from openpyxl import Workbook
from openpyxl.drawing.image import Image

book = Workbook()
sheet = book.active

img = Image("icesid.png")
sheet['A1'] = 'This is Sid'

sheet.add_image(img, 'B2')

book.save("sheet_image.xlsx")

In the example, we write an image into a sheet.

from openpyxl.drawing.image import Image

We work with the Image class from the openpyxl.drawing.image module.

img = Image("icesid.png")

A new Image object is created. The
image file is located in the current working directory.

sheet.add_image(img, 'B2')

We add the image to the sheet with the add_image() method, anchoring it at cell B2.

Pandas read_excel Function

The Pandas read_excel function contains numerous options for importing data. We're going to address the essential options for converting Excel files to Pandas DataFrames. These options are very common and important to understand, since they can drastically change how the read_excel function behaves.

Here are a few of the options you'll want to understand when working with the read_excel function in Pandas. They are passed to the function as keyword arguments.

Important Pandas read_excel Options

Argument	Description
io	A string containing the pathname of the given Excel file.
sheet_name	The Excel sheet name, or sheet number, of the data you want to import. The sheet number is an integer where 0 is the first sheet, 1 is the second, etc. If a list of sheet names/numbers is given, then the output will be a dictionary of DataFrames. The default is 0, which reads the first sheet; pass None to read all the sheets and get a dictionary of DataFrames.
header	Row number to use for the list of column labels. The default is 0, indicating that the first row is assumed to contain the column labels. If the data does not have a row of column labels, None should be used.
names	A separate input of column names. This option is None by default. This option is the equivalent of assigning a list of column names to the columns attribute of the output DataFrame.
index_col	Specifies which column should be used for row indices. The default option is None, meaning that all columns are included in the data, and a range of numbers is used as the row indices.
usecols	An integer, list of integers, or string that specifies the columns to be imported into the DataFrame. The default is to import all columns. If a string is given, then Pandas uses the standard Excel format to select columns (e.g. "A:C,F,G" will import columns A, B, C, F, and G).
skiprows	The number of rows to skip at the top of the Excel sheet. Default is 0. This option is useful for skipping rows in Excel that contain explanatory information about the data below it.
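The options in the table can be combined in one call. The following is a minimal sketch, assuming openpyxl is installed so that pandas can write and read .xlsx files; the file name, sheet name and column names are made up for the illustration.

```python
import os
import tempfile

import pandas as pd

# Build a small workbook to read back. File, sheet and column names
# are hypothetical and exist only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "books_demo.xlsx")
raw = pd.DataFrame(
    [
        ["Report generated for demo purposes", None, None],  # explanatory row
        ["Title", "Author", "Price"],                         # real header row
        ["Python for Beginners", "Hilary", 25],
        ["Learning Pandas", "Sam", 30],
    ]
)
raw.to_excel(path, sheet_name="Books", index=False, header=False)

df = pd.read_excel(
    path,
    sheet_name="Books",  # could also be 0 for the first sheet
    skiprows=1,          # skip the explanatory row at the top
    header=0,            # the first remaining row holds the column labels
    usecols="A,C",       # Excel-style selection: Title and Price only
    index_col=0,         # use the first selected column (Title) as the index
)
print(df)
```

Note that index_col is counted within the columns selected by usecols, not within the full sheet.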

Python Reading Excel Files Tutorial

Now we will see how to read Excel files in Python. You might think that reading Excel files is arduous, but it is not difficult at all. So let's start.

First of all, create a new project and create a Python file inside it.

Creating an Excel File

Now we need an Excel file. If you already have one, you can use it; otherwise create a new one.

For example, I have created a file that stores information about books. You can use any example you like.

Installing Library

Now we have to install a library for reading Excel files in Python. Several libraries are available for this, but here I am using the pandas library.

The Pandas library is built on NumPy and provides easy-to-use data structures and data analysis tools for the Python programming language.
This is a very powerful and flexible library and used frequently by (aspiring) data scientists to get their data into data structures that are highly expressive for their analyses.

To install the pandas library, go to the terminal and run the following command.

pip install pandas


Now pandas is successfully installed.

Installing xlrd

Now we have to install one more library, xlrd. To do so, run the following command.

pip install xlrd


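xlrd is a library for developers to extract data from Microsoft Excel spreadsheet files. Since version 2.0, xlrd reads only the older .xls format; .xlsx files need a library such as openpyxl.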

Read Excel File

Now we will start reading the Excel file in Python. For this we write the following code.

import pandas as pd

file = "Books.xls"
data = pd.read_excel(file) #reading file

print(data)


What We Did

First, we imported the pandas module.
Then we initialized a variable file that stores the name of the Excel file. Notice that I do not provide a path, because I kept the file in the same directory as the script; if you keep it in another directory, you have to provide the proper path to the file.
The read_excel() function is used to read the Excel file; you pass file to it as an argument.
print(data) simply prints the data from the Excel file.

Running the code above prints the contents of the Excel file as a DataFrame.

Conversion of Cell Contents

Sometimes you want to convert the contents of cells while reading them from Excel. Here is how that works.

For example, say you want to change the author name of the book Python for Beginners. The author name is Hilary, and you want to convert it to visly. Let's see what needs to be done.

import pandas as pd

file = "Books.xls"

def convert_author_cell(cell):
    # replace one author name and leave every other cell unchanged
    if cell == "Hilary":
        return 'visly'
    return cell

data = pd.read_excel(file, converters={'Author': convert_author_cell})

print(data)


What We Did

First of all, you define a function.
Inside this function, you check whether the cell is equal to Hilary; if so, you return visly, otherwise you return the cell unchanged.
Then, in the read_excel() call, you supply the converters argument.
The converters argument takes a Python dictionary that maps the name of a column to the function used to convert it.
While reading the Author column, pandas calls convert_author_cell for every single cell in that column.

Now run the code and see what happens.

You can see that the author Hilary has been replaced by visly. In this way you can convert the contents of cells.

Introduction

Xlsx files are among the most widely used documents in the technology field. Data scientists work with spreadsheets constantly, and obviously they don't process them manually.

We will need a module called openpyxl, which is used to read, create and work with .xlsx files in Python. There are other modules such as xlsxwriter, xlrd and xlwt, but they don't have methods for performing all the operations on Excel files. To install the openpyxl module, run the following command in the command line:

pip install openpyxl

Let's see which operations we can perform with the openpyxl module. Importing it into our code is simple:

import openpyxl

Once we have imported the module in our code, all we have to do is use its various methods to read, write and create .xlsx files.

Openpyxl read multiple cells

We have the following data sheet:

Figure: Items

We read the data using a range operator.

read_cells2.py

#!/usr/bin/env python

import openpyxl

book = openpyxl.load_workbook('items.xlsx')

sheet = book.active

cells = sheet['A1':'B6']

for c1, c2 in cells:
    print("{0:8} {1:8}".format(c1.value, c2.value))

In the example, we read data from two columns using a range operation.

cells = sheet['A1':'B6']

In this line, we read data from cells A1 through B6.

for c1, c2 in cells:
    print("{0:8} {1:8}".format(c1.value, c2.value))

The format() function is used for neat output of data
on the console.

$ ./read_cells2.py 
Items    Quantity
coins          23
chairs          3
pencils         5
bottles         8
books          30

Pivot Table in pandas

Advanced Excel users also often use pivot tables. A pivot table summarizes the data of another table by grouping the data on an index and applying operations such as sorting, summing, or averaging. You can use this feature in pandas too.

We need to first identify the column or columns that will serve as the index, and the column(s) on which the summarizing formula will be applied. Let’s start small, by choosing Year as the index column and Gross Earnings as the summarization column and creating a separate DataFrame from this data.

	Year	Gross Earnings
0	1916.0	NaN
1	1920.0	3000000.0
2	1925.0	NaN
3	1927.0	26435.0
4	1929.0	9950.0

We now call pivot_table() on this subset of data. The method takes a parameter, index. As mentioned, we want to use Year as the index.

	Gross Earnings
Year
1916.0	NaN
1920.0	3000000.0
1925.0	NaN
1927.0	26435.0
1929.0	1408975.0

This gave us a pivot table grouped on Year and summarized on Gross Earnings. Note that pivot_table() aggregates with the mean by default; pass aggfunc="sum" if you want a sum instead. We didn't need to specify the Gross Earnings column explicitly, as pandas automatically identified it as the values on which the summarization should be applied.
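As a minimal sketch of pivoting on a single index column, the snippet below uses a small made-up DataFrame in place of the movies subset (the name movies_subset and the numbers are assumptions for illustration). It shows the default aggregation next to an explicit sum.

```python
import pandas as pd

# Hypothetical stand-in for the movies subset; the values are made up.
movies_subset = pd.DataFrame(
    {
        "Year": [1927, 1927, 1929, 1929, 1929],
        "Gross Earnings": [100.0, 300.0, 50.0, 150.0, 400.0],
    }
)

# With no aggfunc given, every remaining numeric column is averaged.
mean_by_year = movies_subset.pivot_table(index=["Year"])

# Pass aggfunc explicitly to get a total per year instead.
sum_by_year = movies_subset.pivot_table(index=["Year"], aggfunc="sum")

print(mean_by_year)
print(sum_by_year)
```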

We can use this pivot table to create some data visualizations. We can call the plot() method on the DataFrame to create a line plot and call show() from matplotlib.pyplot to display the plot in the notebook.

We saw how to pivot with a single column as the index. Things will get more interesting if we can use multiple columns. Let’s create another DataFrame subset but this time we will choose the columns, Country, Language and Gross Earnings.

	Country	Language	Gross Earnings
0	USA	NaN	NaN
1	USA	NaN	3000000.0
2	USA	NaN	NaN
3	Germany	German	26435.0
4	Germany	German	9950.0

We will use the columns Country and Language as the index for the pivot table. We will use Gross Earnings as the summarization column; however, we do not need to specify this explicitly, as we saw earlier.

		Gross Earnings
Country	Language
Afghanistan	Dari	1.127331e+06
Argentina	Spanish	7.230936e+06
Aruba	English	1.007614e+07
Australia	Aboriginal	6.165429e+06
Dzongkha	5.052950e+05

Let’s visualize this pivot table with a bar plot. Since there are still few hundred records in this pivot table, we will plot just a few of them.
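A minimal sketch of pivoting on two index columns and keeping only the first few groups for plotting. The DataFrame is a made-up stand-in for the movies subset, and the plotting call is left as a comment because it needs matplotlib.

```python
import pandas as pd

# Hypothetical stand-in for the Country/Language/Gross Earnings subset.
movies_subset = pd.DataFrame(
    {
        "Country": ["USA", "USA", "Germany", "Germany", "Germany"],
        "Language": ["English", "English", "German", "German", "English"],
        "Gross Earnings": [100.0, 300.0, 50.0, 150.0, 400.0],
    }
)

# Two index columns produce a MultiIndex of (Country, Language) pairs.
earnings = movies_subset.pivot_table(index=["Country", "Language"])

# Keep only the first few groups so that a bar plot stays readable.
first_rows = earnings.head(2)
print(first_rows)

# first_rows.plot(kind="bar")  # uncomment when matplotlib is installed
```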

Excel Object Model

The objects, properties, methods and events are contained in the Excel object model (I do not work with events for now, so in this article, I will be focusing on the objects, properties and methods only). The relationship between them is illustrated in the flowchart below.

Image Created by Author

One of the objects I frequently use is the Range object. Range refers to the cell(s) in Excel Worksheet. The Range object contains Methods and Properties as shown in the figure below.

Screenshot from Microsoft Docs

The documentation contains the basic syntax in Excel VBA. Fortunately, it is easy to translate the syntax in Excel VBA into Python. Before we go into examples, there are five items we need to understand:

Exporting the results to Excel

If you're going to be working with colleagues who use Excel, saving Excel files out of pandas is important. You can export or write a pandas DataFrame to an Excel file using the to_excel() method. Pandas relies on a separate writer module, such as openpyxl or XlsxWriter, to write the file. The method is called on the DataFrame we want to export. We also need to pass the filename to which this DataFrame will be written.

By default, the index is also saved to the output file. However, sometimes the index doesn’t provide any useful information. For example, the DataFrame has a numeric auto-increment index, that was not part of the original Excel data.

	Title	Year	Genres	Language	Country	Content Rating	Duration	Aspect Ratio	Budget	Gross Earnings	…	Facebook Likes – Actor 2	Facebook Likes – Actor 3	Facebook Likes – cast Total	Facebook likes – Movie	Facenumber in posters	User Votes	Reviews by Users	Reviews by Crtiics	IMDB Score	Net Earnings
Intolerance: Love’s Struggle Throughout the Ages	1916.0	Drama\|History\|War	NaN	USA	Not Rated	123.0	1.33	385907.0	NaN	…	22.0	9.0	481	691	1.0	10718	88.0	69.0	8.0	NaN
1	Over the Hill to the Poorhouse	1920.0	Crime\|Drama	NaN	USA	NaN	110.0	1.33	100000.0	3000000.0	…	2.0	0.0	4	1.0	5	1.0	1.0	4.8	2900000.0
2	The Big Parade	1925.0	Drama\|Romance\|War	NaN	USA	Not Rated	151.0	1.33	245000.0	NaN	…	12.0	6.0	108	226	0.0	4849	45.0	48.0	8.3	NaN
3	Metropolis	1927.0	Drama\|Sci-Fi	German	Germany	Not Rated	145.0	1.33	6000000.0	26435.0	…	23.0	18.0	203	12000	1.0	111841	413.0	260.0	8.3	-5973565.0
4	Pandora’s Box	1929.0	Crime\|Drama\|Romance	German	Germany	Not Rated	110.0	1.33	NaN	9950.0	…	20.0	3.0	455	926	1.0	7431	84.0	71.0	8.0	NaN

5 rows × 26 columns

You can choose to skip the index by passing index=False.
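A minimal sketch of exporting without the index, assuming openpyxl is installed as the .xlsx engine. The DataFrame and the output file name are made up for the illustration.

```python
import os
import tempfile

import pandas as pd

# Small made-up DataFrame standing in for the movies data.
movies = pd.DataFrame({"Title": ["Metropolis", "Wings"], "Year": [1927, 1927]})

# Hypothetical output location in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "output.xlsx")

# index=False leaves the auto-generated 0, 1, 2, ... index out of the file.
movies.to_excel(path, index=False)

# Reading the file back shows only the original columns.
round_trip = pd.read_excel(path)
print(round_trip.columns.tolist())
```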

We need to be able to make our output files look nice before we send them out to our co-workers. We can use the pandas ExcelWriter class along with the XlsxWriter Python module to apply the formatting.

We can use these advanced output options by creating an ExcelWriter object and using this object to write to the Excel file.

We can apply customizations by calling add_format() on the workbook we are writing to. Here we are setting the header format to bold.

Finally, we save the output file by calling the close() method on the writer object (older pandas versions used save()).

As an example, we saved the data with the column headers set to bold. The saved file looks like the image below.

In the same way, one can use XlsxWriter to apply various formatting to the output Excel file.
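The formatting steps can be sketched as follows, assuming the XlsxWriter package is installed. The DataFrame, sheet name and file name are made up, and the header is written by hand so that the bold format is the only one applied to it.

```python
import os
import tempfile

import pandas as pd

movies = pd.DataFrame({"Title": ["Metropolis", "Wings"], "Year": [1927, 1927]})
path = os.path.join(tempfile.mkdtemp(), "formatted.xlsx")  # hypothetical name

# The context manager closes (and therefore saves) the file on exit.
with pd.ExcelWriter(path, engine="xlsxwriter") as writer:
    # Write the data without pandas' own header, starting one row down.
    movies.to_excel(writer, sheet_name="Sheet1", index=False,
                    header=False, startrow=1)

    workbook = writer.book
    worksheet = writer.sheets["Sheet1"]

    # Create a bold format and use it to write the header row ourselves.
    header_format = workbook.add_format({"bold": True})
    for col_num, name in enumerate(movies.columns):
        worksheet.write(0, col_num, name, header_format)

print(os.path.exists(path))
```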

Sheets

Each workbook can have multiple sheets.

Figure: Sheets

Let’s have a workbook with these three sheets.

sheets.py

#!/usr/bin/env python

import openpyxl

book = openpyxl.load_workbook('sheets.xlsx')

print(book.get_sheet_names())

active_sheet = book.active
print(type(active_sheet))

sheet = book.get_sheet_by_name("March")
print(sheet.title)

The program works with Excel sheets.

print(book.get_sheet_names())

The get_sheet_names() method returns the names of
available sheets in a workbook. Current openpyxl releases deprecate it in favour of the sheetnames attribute.

active_sheet = book.active
print(type(active_sheet))

We get the active sheet and print its type to the terminal.

sheet = book.get_sheet_by_name("March")

We get a reference to a sheet with the get_sheet_by_name() method. Current openpyxl releases prefer the indexing form book["March"].

print(sheet.title)

The title of the retrieved sheet is printed to the terminal.

$ ./sheets.py 

<class 'openpyxl.worksheet.worksheet.Worksheet'>
March

sheets2.py

#!/usr/bin/env python

import openpyxl

book = openpyxl.load_workbook('sheets.xlsx')

book.create_sheet("April")

print(book.sheetnames)

sheet1 = book.get_sheet_by_name("January")
book.remove_sheet(sheet1)

print(book.sheetnames)

book.create_sheet("January", 0)
print(book.sheetnames)

book.save('sheets2.xlsx')

In this example, we create a new sheet.

book.create_sheet("April")

A new sheet is created with the create_sheet() method.

print(book.sheetnames)

The sheet names can be shown with the sheetnames attribute as well.

book.remove_sheet(sheet1)

A sheet can be removed with the remove_sheet() method. Current openpyxl releases prefer remove().

book.create_sheet("January", 0)

A new sheet can be created at the specified position; in our case,
we create a new sheet at position with index 0.

$ ./sheets2.py
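The same sheet operations can be sketched with the non-deprecated openpyxl API. This example builds the workbook in memory instead of loading sheets.xlsx, so the sheet names are set up by hand.

```python
from openpyxl import Workbook

# A new workbook starts with a single sheet; rename it and add two more.
book = Workbook()
book.active.title = "January"
book.create_sheet("February")
book.create_sheet("March")

print(book.sheetnames)           # attribute instead of get_sheet_names()

march = book["March"]            # indexing instead of get_sheet_by_name()
print(march.title)

book.remove(book["January"])     # remove() instead of remove_sheet()
book.create_sheet("January", 0)  # re-create the sheet at position 0
print(book.sheetnames)
```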

It is possible to change the background colour of a worksheet.

sheets3.py

#!/usr/bin/env python

import openpyxl

book = openpyxl.load_workbook('sheets.xlsx')

sheet = book.get_sheet_by_name("March")
sheet.sheet_properties.tabColor = "0072BA"

book.save('sheets3.xlsx')

The example modifies the background colour of the sheet titled
"March".

sheet.sheet_properties.tabColor = "0072BA"

We change the tabColor property to a new colour.

Figure: Background colour of a worksheet

The background colour of the third worksheet has been changed to some blue
colour.

Inserting and Deleting rows and columns

openpyxl provides a set of methods on the worksheet class that help to add and delete rows and columns in an Excel sheet. I'm going to load the workbook, grab the active sheet and perform the add/delete operations.

How To Install openpyxl Library

This module does not come built-in with Python 3. You can install the package by running the following command in the terminal.

pip install openpyxl

I am just extending the previous tutorial and adding functionality to insert and delete rows with columns.

How To Insert a Row into Excel File

You can insert rows into an Excel file using the insert_rows() worksheet method. By default, one row is inserted. The syntax is as follows:

insert_rows(idx, amount=1)

The first parameter is the row number before which the new rows are inserted, and the second parameter is the number of rows to insert.

Sample Python code for inserting rows into an Excel file:

import openpyxl

path = r"C:\employee.xlsx"
wb_obj = openpyxl.load_workbook(path)
sheet_obj = wb_obj.active
print("Maximum rows before inserting:", sheet_obj.max_row)
# insert one row before row 3
sheet_obj.insert_rows(idx=3)

# insert multiple rows at once:
# insert 3 rows before row 6
sheet_obj.insert_rows(6, 3)

print("Maximum rows after inserting:", sheet_obj.max_row)

# save the workbook (not the sheet) to the path
path = './employee.xlsx'
wb_obj.save(path)

How To Insert a Column into Excel File

You can insert columns into the Excel file using the insert_cols() worksheet method. By default, one column is inserted. The syntax is as follows:

insert_cols(idx, amount=1)

The first parameter is the column number before which the new columns are inserted, and the second parameter is the number of columns to add.

Sample Python code for inserting a column into an Excel file:

import openpyxl

path = r"C:\employee.xlsx"
wb_obj = openpyxl.load_workbook(path)
sheet_obj = wb_obj.active
print("Maximum column before inserting:", sheet_obj.max_column)
# insert a column before the first column, A
sheet_obj.insert_cols(idx=1)

print("Maximum column after inserting:", sheet_obj.max_column)

# save the workbook (not the sheet) to the path
path = './employee.xlsx'
wb_obj.save(path)
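Deleting works the same way through the delete_rows() and delete_cols() worksheet methods, which take the starting index and an optional amount. The following is a minimal sketch on an in-memory workbook with made-up cell values, so no employee.xlsx file is needed.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

# Fill six rows and three columns with labels such as "r1c1".
for row in range(1, 7):
    ws.append([f"r{row}c1", f"r{row}c2", f"r{row}c3"])

# Delete three rows starting at row 2 (rows 2, 3 and 4).
ws.delete_rows(2, 3)

# Delete one column starting at column 1 (column A).
ws.delete_cols(1)

print("Rows left:", ws.max_row)
print(ws["A1"].value, ws["A2"].value)
```

The rows and columns that follow the deleted range shift up or left, so the cell addresses of the remaining data change.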

Exploring the data

Now that we have read in the movies data set from our Excel file, we can start exploring it using pandas. A pandas DataFrame stores the data in a tabular format, just like the way Excel displays the data in a sheet. Pandas has a lot of built-in methods to explore the DataFrame we created from the Excel file we just read in.

We already introduced the head() method in the previous section; it displays a few rows from the top of the DataFrame. Let's look at a few more methods that come in handy while exploring the data set.

We can use the shape attribute to find out the number of rows and columns of the DataFrame.

This tells us our Excel file has 5042 records and 25 columns or observations. This can be useful in reporting the number of records and columns and comparing that with the source data set.

We can use the tail() method to view the bottom rows. If no parameter is passed, only the bottom five rows are returned.

	Title	Year	Genres	Language	Country	Content Rating	Duration	Aspect Ratio	Budget	Gross Earnings	…	Facebook Likes – Actor 1	Facebook Likes – Actor 2	Facebook Likes – Actor 3	Facebook Likes – cast Total	Facebook likes – Movie	Facenumber in posters	User Votes	Reviews by Users	Reviews by Crtiics	IMDB Score
1599	War & Peace	NaN	Drama\|History\|Romance\|War	English	UK	TV-14	NaN	16.00	NaN	NaN	…	1000.0	888.0	502.0	4528	11000	1.0	9277	44.0	10.0	8.2
1600	Wings	NaN	Comedy\|Drama	English	USA	NaN	30.0	1.33	NaN	NaN	…	685.0	511.0	424.0	1884	1000	5.0	7646	56.0	19.0	7.3
1601	Wolf Creek	NaN	Drama\|Horror\|Thriller	English	Australia	NaN	NaN	2.00	NaN	NaN	…	511.0	457.0	206.0	1617	954	0.0	726	6.0	2.0	7.1
1602	Wuthering Heights	NaN	Drama\|Romance	English	UK	NaN	142.0	NaN	NaN	NaN	…	27000.0	698.0	427.0	29196	2.0	6053	33.0	9.0	7.7
1603	Yu-Gi-Oh! Duel Monsters	NaN	Action\|Adventure\|Animation\|Family\|Fantasy	Japanese	Japan	NaN	24.0	NaN	NaN	NaN	…	0.0	NaN	NaN	124	0.0	12417	51.0	6.0	7.0

5 rows × 25 columns

In Excel, you're able to sort a sheet based on the values in one or more columns. In pandas, you can do the same thing with the sort_values() method. For example, let's sort our movies DataFrame based on the Gross Earnings column.

Since we have the data sorted by the values in a column, we can do a few interesting things with it. For example, we can display the top 10 movies by Gross Earnings.
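The exploration steps above can be sketched on a small made-up DataFrame (the titles and numbers are placeholders, not the real movies data):

```python
import pandas as pd

# Hypothetical stand-in for the movies DataFrame.
movies = pd.DataFrame(
    {
        "Title": ["A", "B", "C", "D", "E"],
        "Gross Earnings": [10.0, 50.0, 30.0, None, 40.0],
    }
)

# shape is an attribute: (number of rows, number of columns).
print(movies.shape)

# tail(n) returns the last n rows; the default is five.
print(movies.tail(2))

# Sort descending by earnings; missing values are placed last by default.
sorted_by_gross = movies.sort_values(["Gross Earnings"], ascending=False)

# The first rows of the sorted frame are the top earners.
top_three = sorted_by_gross.head(3)
print(top_three)
```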

We can also create a plot for the top 10 movies by Gross Earnings. Pandas makes it easy to visualize your data with plots and charts through matplotlib, a popular data visualization library. With a couple of lines of code, you can start plotting. Moreover, matplotlib plots work well inside Jupyter Notebooks, since you can display the plots right under the code.

First, we import the matplotlib module and set matplotlib to display the plots right in the Jupyter Notebook.

We will draw a bar plot where each bar represents one of the top 10 movies. We can do this by calling the plot() method and setting the kind argument to "barh". This tells matplotlib to draw a horizontal bar plot.

Let's create a histogram of IMDB Scores to check the distribution of IMDB Scores across all movies. Histograms are a good way to visualize the distribution of a data set. We use the plot() method on the IMDB Scores series from our movies DataFrame and pass it the argument kind="hist".

This data visualization suggests that most of the IMDB Scores fall between six and eight.

Используя pyexcel

You can easily export your arrays back to a spreadsheet with the save_as() function, passing the array and the name of the target file in the dest_file_name argument.

It also lets us specify a delimiter through the dest_delimiter argument. You pass the character you want to use as the delimiter as a quoted string.

Code (this listing writes a column of values with the xlsxwriter module):

 
# import the xlsxwriter module
import xlsxwriter

book = xlsxwriter.Workbook('Example2.xlsx')
sheet = book.add_worksheet()

# Rows and columns are zero indexed.
row = 0
column = 0

# placeholder values; the original list was lost in extraction
content = ["item 1", "item 2", "item 3"]

# iterate through the content list
for item in content:

    # write each item to the next row of the first column
    sheet.write(row, column, item)

    # increment the row index by one on each iteration
    row += 1

book.close()


Openpyxl read cell

In the following example, we read the previously written data from the
file.

read_cells.py

#!/usr/bin/env python

import openpyxl

book = openpyxl.load_workbook('sample.xlsx')

sheet = book.active

a1 = sheet['A1']
a2 = sheet['A2']
a3 = sheet.cell(row=3, column=1)

print(a1.value)
print(a2.value) 
print(a3.value)

The example loads an existing xlsx file and reads three cells.

book = openpyxl.load_workbook('sample.xlsx')

The file is opened with the load_workbook() method.

a1 = sheet['A1']
a2 = sheet['A2']
a3 = sheet.cell(row=3, column=1)

We read the contents of the A1, A2, and A3 cells. In the third line,
we use the cell() method to get the A3 cell.

$ ./read_cells.py 
56
43
10/26/16

How to Read Multiple Excel (xlsx) Files in Python

In this section, we will learn how to read multiple xlsx files in Python using openpyxl. In addition to openpyxl and Path, we are also going to work with the glob module.

In the first step, we are going to import the modules Path, glob, and openpyxl:

2. Read all xlsx Files in the Directory to a List

Second, we are going to read all the .xlsx files in a subdirectory into a list. Now, we use the glob module together with Path:

3. Create Workbook Objects (i.e., read the xlsx files)

Third, we can now read all the xlsx files using Python. Again, we will use the load_workbook method. However, this time we will loop through each file we found in the subdirectory.

Now, in the code examples above, we are using Python list comprehension (twice, in both step 2 and 3). First, we create a list of all the xlsx files in the “XLSX_FILES” directory. Second, we loop through this list and create a list of workbooks. Of course, we could add this to the first line of code above.
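Steps 1 to 3 can be sketched as follows. This is an illustration under assumptions: the helper name load_workbooks and the demo file names are made up, Path.glob stands in for the glob module, and the demo creates its own small files in a temporary directory instead of relying on an existing XLSX_FILES folder.

```python
import tempfile
from pathlib import Path

from openpyxl import Workbook, load_workbook


def load_workbooks(directory):
    """Return one Workbook object per .xlsx file found in directory."""
    # Step 2: collect all .xlsx files (sorted for a stable order).
    xlsx_files = sorted(Path(directory).glob("*.xlsx"))
    # Step 3: load each file with load_workbook.
    return [load_workbook(str(path)) for path in xlsx_files]


# Demo set-up: write two tiny workbooks into a temporary directory.
demo_dir = Path(tempfile.mkdtemp())
for file_name, sheet_title in (("a.xlsx", "First"), ("b.xlsx", "Second")):
    wb = Workbook()
    wb.active.title = sheet_title
    wb.save(str(demo_dir / file_name))

wbs = load_workbooks(demo_dir)
print(len(wbs))
print(wbs[0].sheetnames)
```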

4. Work with the Imported Excel Files

In the fourth step, we can now work with the imported Excel files. For example, we can get the first file by indexing the list of workbooks with [0]. If we want to know the sheet names of this file, we read its sheetnames attribute. That is, many of the things we did in the previous example on reading xlsx files in Python can also be done once we have read multiple Excel files.

This is one example of what the language is good for; another is renaming files with Python.

Read Cells from Multiple Rows or Columns

There are two methods that OpenPyXL's worksheet objects give you for iterating over rows and columns. These are the two methods:

iter_rows()
iter_cols()

Both methods accept the following parameters:

min_col (int) – smallest column index (1-based index)
min_row (int) – smallest row index (1-based index)
max_col (int) – largest column index (1-based index)
max_row (int) – largest row index (1-based index)
values_only (bool) – whether only cell values should be returned

You use the min and max row and column parameters to tell OpenPyXL which rows and columns to iterate over. You can have OpenPyXL return the data from the cells by setting values_only to True. If you set it to False, iter_rows() and iter_cols() will return cell objects instead.

It's always good to see how this works with actual code. With that in mind, create a new file named iterating_over_cells_in_rows.py and add this code to it:

# iterating_over_cells_in_rows.py

from openpyxl import load_workbook


def iterating_over_values(path, sheet_name):
    workbook = load_workbook(filename=path)
    if sheet_name not in workbook.sheetnames:
        print(f"'{sheet_name}' not found. Quitting.")
        return

    sheet = workbook[sheet_name]
    for value in sheet.iter_rows(
        min_row=1, max_row=3, min_col=1, max_col=3,
        values_only=True):
        print(value)


if __name__ == "__main__":
    iterating_over_values("books.xlsx", sheet_name="Sheet 1 - Books")

Here you load up the workbook as you have in the previous examples. You get the sheet that you want to extract data from and then use iter_rows() to get the rows of data. In this example, you set the minimum row to 1 and the maximum row to 3. That means that you will grab the first three rows in the Excel sheet you have specified.

Then you also set the columns to be 1 (minimum) to 3 (maximum). Finally, you set values_only to True.

When you run this code, your program will print out the first three columns of the first three rows in your Excel spreadsheet. Your program prints the rows as tuples with three items in them. You are using iter_rows() as a quick way to iterate over rows and columns in an Excel spreadsheet using Python.
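For comparison, here is a minimal sketch of iter_cols(), which takes the same parameters but yields one tuple per column. It uses an in-memory workbook with made-up values instead of books.xlsx.

```python
from openpyxl import Workbook

# Build a small in-memory sheet; the values are made up.
wb = Workbook()
ws = wb.active
ws.append(["Title", "Author", "Year"])
ws.append(["Book A", "Ann", 2001])
ws.append(["Book B", "Bob", 2002])

# Iterate column by column over rows 1-3 of columns 1-2.
columns = list(
    ws.iter_cols(min_row=1, max_row=3, min_col=1, max_col=2,
                 values_only=True)
)
for column in columns:
    print(column)
```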

Now you’re ready to learn how to read cells in a specific range.

Hidden feature: partial read

When you are dealing with a huge amount of data, e.g. 64 GB, you obviously do not
want to fill up your memory with it. What you may want to do is read
data from the Nth line, take M records and stop, using memory only
for those M records and not for the beginning or the tail of the file.

The partial read feature was developed for this: it reads only part of the data into memory for
processing.

You can paginate by row, by column or by both, so you dictate what portion of the
data to read back. But remember that only the row limit helps you save memory. Say
you use this feature to read data from the Nth column, take M columns and skip
the rest: that will not reduce your memory footprint.

Why did I not see the above benefit?

This feature depends heavily on the implementation details of the underlying reader.
Some readers have to load the whole file before they can return any rows.

Hence, while the partial data is being returned, the memory consumption won't
differ from reading the whole data back. Only after the partial
data is returned does the memory consumption curve drop off the cliff. In that case the pagination
code only limits the data returned to your program.

In addition, pyexcel’s csv readers can read partial data into memory too.

Let’s assume the following file is a huge csv file:

>>> import datetime
>>> import pyexcel as pe
>>> data = [
...     [1, 21, 31],
...     [2, 22, 32],
...     [3, 23, 33],
...     [4, 24, 34],
...     [5, 25, 35],
...     [6, 26, 36]
... ]
>>> pe.save_as(array=data, dest_file_name="your_file.csv")

And let’s pretend to read partial data:

>>> pe.get_sheet(file_name="your_file.csv", start_row=2, row_limit=3)
your_file.csv
+---+----+----+
| 3 | 23 | 33 |
+---+----+----+
| 4 | 24 | 34 |
+---+----+----+
| 5 | 25 | 35 |
+---+----+----+

And you could as well do the same for columns:

>>> pe.get_sheet(file_name="your_file.csv", start_column=1, column_limit=2)
your_file.csv
+----+----+
| 21 | 31 |
+----+----+
| 22 | 32 |
+----+----+
| 23 | 33 |
+----+----+
| 24 | 34 |
+----+----+
| 25 | 35 |
+----+----+
| 26 | 36 |
+----+----+

Obviously, you could do both at the same time:

>>> pe.get_sheet(file_name="your_file.csv",
...     start_row=2, row_limit=3,
...     start_column=1, column_limit=2)
your_file.csv
+----+----+
| 23 | 33 |
+----+----+
| 24 | 34 |
+----+----+
| 25 | 35 |
+----+----+

The pagination support is available across all pyexcel plugins.

Note

No column pagination support for query sets as data source.
