How to Export Data to XLSX Files
2019-02-15
A while ago I wrote an article about exporting data to different spreadsheet formats. As recently I was reimplementing export to Excel for the 1st things 1st project, I noticed that the API changed a little, so it's time to blog about that again.
For Excel export I am using the XLSX file format which is a zipped XML-based format for spreadsheets with formatting support. XLSX files can be opened with Microsoft Excel, Apache OpenOffice, Apple Numbers, LibreOffice, Google Drive, and a handful of other applications. For building the XLSX file I am using openpyxl library.
Installing openpyxl
You can install openpyxl to your virtual environment the usual way with pip:
(venv) pip install openpyxl==2.6.0
Simplest Export View
To create a function exporting data from a QuerySet to XLSX file, you would need to create a view that returns a response with a special content type and file content as an attachment. Plug that view to URL rules and then link it from an export button in a template.
Probably the simplest view that generates XLSX file out of Django QuerySet would be this:
# movies/views.py
from datetime import datetime
from datetime import timedelta
from openpyxl import Workbook
from django.http import HttpResponse
from .models import MovieCategory, Movie
def export_movies_to_xlsx(request):
"""
Downloads all movies as Excel file with a single worksheet
"""
movie_queryset = Movie.objects.all()
response = HttpResponse(
content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
)
response['Content-Disposition'] = 'attachment; filename={date}-movies.xlsx'.format(
date=datetime.now().strftime('%Y-%m-%d'),
)
workbook = Workbook()
# Get active worksheet/tab
worksheet = workbook.active
worksheet.title = 'Movies'
# Define the titles for columns
columns = [
'ID',
'Title',
'Description',
'Length',
'Rating',
'Price',
]
row_num = 1
# Assign the titles for each cell of the header
for col_num, column_title in enumerate(columns, 1):
cell = worksheet.cell(row=row_num, column=col_num)
cell.value = column_title
# Iterate through all movies
for movie in movie_queryset:
row_num += 1
# Define the data for each cell in the row
row = [
movie.pk,
movie.title,
movie.description,
movie.length_in_minutes,
movie.rating,
movie.price,
]
# Assign the data for each cell of the row
for col_num, cell_value in enumerate(row, 1):
cell = worksheet.cell(row=row_num, column=col_num)
cell.value = cell_value
workbook.save(response)
return response
If you try this, you will notice, that there is no special formatting in it, all columns are of the same width, the value types are barely recognized, the header is displayed the same as the content. This is enough for further data export to CSV or manipulation with pandas. But if you want to present the data for the user in a friendly way, you need to add some magic.
Creating More Worksheets
By default, each Excel file has one worksheet represented as a tab. You can access it with:
worksheet = workbook.active
worksheet.title = 'The New Tab Title'
If you want to create tabs dynamically with data from the database of Python structures, you can at first delete the current tab and add the others with:
workbook.remove(workbook.active)
for index, category in enumerate(category_queryset):
worksheet = workbook.create_sheet(
title=category.title,
index=index,
)
Although not all spreadsheet applications support this, you can set the background color of the worksheet tab with:
worksheet.sheet_properties.tabColor = 'f7f7f9'
Working with Cells
Each cell can be accessed by its 1-based indexes for the rows and for the columns:
top_left_cell = worksheet.cell(row=1, column=1)
top_left_cell.value = "This is good!"
Styles and formatting are applied to individual cells instead of rows or columns. There are several styling categories with multiple configurations for each of them. You can find some available options from the documentation, but even more by exploring the source code.
from openpyxl.styles import Font, Alignment, Border, Side, PatternFill
top_left_cell.font = Font(name='Calibri', bold=True)
top_left_cell.alignment = Alignment(horizontal='center')
top_left_cell.border = Border(
bottom=Side(border_style='medium', color='FF000000'),
)
top_left_cell.fill = PatternFill(
start_color='f7f7f9',
end_color='f7f7f9',
fill_type='solid',
)
If you are planning to have multiple styled elements, instantiate the font, alignment, border, fill options upfront and then assign the instances to the cell attributes. Otherwise, you can get into memory issues when you have a lot of data entries.
Setting Column Widths
If you want to have some wider or narrower width for some of your columns, you can do this by modifying column dimensions. They are accessed by column letter which can be retrieved using a utility function:
from openpyxl.utils import get_column_letter
column_letter = get_column_letter(col_num)
column_dimensions = worksheet.column_dimensions[column_letter]
column_dimensions.width = 40
The units here are some relative points depending on the width of the letters in the specified font. I would suggest playing around with the width value until you find what works for you.
When defining column width is not enough, you might want to wrap text into multiple lines so that everything can be read by people without problems. This can be done with the alignment setting for the cell as follows:
from openpyxl.styles import Alignment
wrapped_alignment = Alignment(vertical='top', wrap_text=True)
cell.alignment = wrapped_alignment
Data Formatting
Excel automatically detects text or number types and aligns text to the left and numbers to the right. If necessary that can be overwritten.
There are some gotchas on how to format cells when you need a percentage, prices, or time durations.
Percentage
For percentage, you have to pass the number in float format from 0.0 till 1.0 and style should be 'Percent' as follows:
cell.value = 0.75
cell.style = 'Percent'
Currency
For currency, you need values of Decimal
format, the style should be 'Currency', and you will need a special number format for currency other than American dollars, for example:
from decimal import Decimal
cell.value = Decimal('14.99')
cell.style = 'Currency'
cell.number_format = '#,##0.00 €'
Durations
For time duration, you have to pass timedelta as the value and define special number format:
from datetime import timedelta
cell.value = timedelta(minutes=90)
cell.number_format = '[h]:mm;@'
This number format ensures that your duration can be greater than '23:59', for example, '140:00'.
Freezing Rows and Columns
In Excel, you can freeze rows and columns so that they stay fixed when you scroll the content vertically or horizontally. That's similar to position: fixed
in CSS.
To freeze the rows and columns, locate the top-left cell that is below the row that you want to freeze and is on the right from the column that you want to freeze. For example, if you want to freeze one row and one column, the cell would be 'B2'. Then run this:
worksheet.freeze_panes = worksheet['B2']
Fully Customized Export View
So having the knowledge of this article now we can build a view that creates separate sheets. for each movie category. Each sheet would list movies of the category with titles, descriptions, length in hours and minutes, rating in percent, and price in Euros. The tabs, as well as the headers, can have different background colors for each movie category. Cells would be well formatted. Titles and descriptions would use multiple lines to fully fit into the cells.
# movies/views.py
from datetime import datetime
from datetime import timedelta
from openpyxl import Workbook
from openpyxl.styles import Font, Alignment, Border, Side, PatternFill
from openpyxl.utils import get_column_letter
from django.http import HttpResponse
from .models import MovieCategory, Movie
def export_movies_to_xlsx(request):
"""
Downloads all movies as Excel file with a worksheet for each movie category
"""
category_queryset = MovieCategory.objects.all()
response = HttpResponse(
content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
)
response['Content-Disposition'] = 'attachment; filename={date}-movies.xlsx'.format(
date=datetime.now().strftime('%Y-%m-%d'),
)
workbook = Workbook()
# Delete the default worksheet
workbook.remove(workbook.active)
# Define some styles and formatting that will be later used for cells
header_font = Font(name='Calibri', bold=True)
centered_alignment = Alignment(horizontal='center')
border_bottom = Border(
bottom=Side(border_style='medium', color='FF000000'),
)
wrapped_alignment = Alignment(
vertical='top',
wrap_text=True
)
# Define the column titles and widths
columns = [
('ID', 8),
('Title', 40),
('Description', 80),
('Length', 15),
('Rating', 15),
('Price', 15),
]
# Iterate through movie categories
for category_index, category in enumerate(category_queryset):
# Create a worksheet/tab with the title of the category
worksheet = workbook.create_sheet(
title=category.title,
index=category_index,
)
# Define the background color of the header cells
fill = PatternFill(
start_color=category.html_color,
end_color=category.html_color,
fill_type='solid',
)
row_num = 1
# Assign values, styles, and formatting for each cell in the header
for col_num, (column_title, column_width) in enumerate(columns, 1):
cell = worksheet.cell(row=row_num, column=col_num)
cell.value = column_title
cell.font = header_font
cell.border = border_bottom
cell.alignment = centered_alignment
cell.fill = fill
# set column width
column_letter = get_column_letter(col_num)
column_dimensions = worksheet.column_dimensions[column_letter]
column_dimensions.width = column_width
# Iterate through all movies of a category
for movie in category.movie_set.all():
row_num += 1
# Define data and formats for each cell in the row
row = [
(movie.pk, 'Normal'),
(movie.title, 'Normal'),
(movie.description, 'Normal'),
(timedelta(minutes=movie.length_in_minutes), 'Normal'),
(movie.rating / 100, 'Percent'),
(movie.price, 'Currency'),
]
# Assign values, styles, and formatting for each cell in the row
for col_num, (cell_value, cell_format) in enumerate(row, 1):
cell = worksheet.cell(row=row_num, column=col_num)
cell.value = cell_value
cell.style = cell_format
if cell_format == 'Currency':
cell.number_format = '#,##0.00 €'
if col_num == 4:
cell.number_format = '[h]:mm;@'
cell.alignment = wrapped_alignment
# freeze the first row
worksheet.freeze_panes = worksheet['A2']
# set tab color
worksheet.sheet_properties.tabColor = category.html_color
workbook.save(response)
return response
The Takeaways
- Spreadsheet data can be used for further mathematical processing with pandas.
- XLSX file format allows quite a bunch of formatting options that can make your spreadsheet data more presentable and user-friendly.
- To see Excel export in action, go to 1st things 1st, log in as a demo user, and navigate to project results where you can export them as XLSX. Feedback is always welcome.
Cover photo by Tim Evans.
Also by me
Django Paddle Subscriptions app
For Django-based SaaS projects.
Django GDPR Cookie Consent app
For Django websites that use cookies.