Apache POI is an open-source Java library for creating and manipulating file formats based on Microsoft Office. With POI you can create, modify, and read files in these formats. Java has no built-in support for working with Excel files, for example, so an open-source API such as POI is needed for the job.
Apache POI provides a Java API for file formats based on Microsoft's Office Open XML (OOXML) and OLE2 standards. Apache POI releases are available under the Apache License, Version 2.0.
Important features:
- Apache POI provides stream-based processing, which suits large files and requires less memory.
- Apache POI handles both the XLS and XLSX spreadsheet formats.
- Apache POI contains the HSSF implementation for the Excel ’97(-2007) file format, i.e. XLS.
- The Apache POI XSSF implementation should be used for the Excel 2007 OOXML (.xlsx) file format.
- The HSSF and XSSF APIs provide mechanisms to read, write, and modify Excel spreadsheets.
- Apache POI also provides the SXSSF API, an extension of XSSF for writing very large Excel sheets.
- The SXSSF API requires less memory because it keeps only a sliding window of rows in memory, so it is suitable when spreadsheets are very large and heap memory is limited.
- There are two models to choose from: the event model and the user model. The event model requires less memory because the Excel file is read as a stream of tokens that your code must process as they arrive. The user model is more object-oriented and easier to use.
- Apache POI provides excellent support for additional Excel features such as formulas, cell styles (fill colors, borders, fonts), headers and footers, data validations, images, and hyperlinks.
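As a rough illustration of the user model described above, the following sketch builds a small XLS workbook in memory with HSSF and reads it back. It assumes only the `poi` artifact is on the classpath; the class name `PoiRoundTrip`, the sheet name `Demo`, and the cell values are made up for this example.

```java
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical demo class, not part of Apache POI.
class PoiRoundTrip {

    /** Builds a one-sheet XLS workbook in memory and returns its bytes. */
    static byte[] writeWorkbook() throws IOException {
        Workbook workbook = new HSSFWorkbook();      // HSSF is the .xls implementation
        Sheet sheet = workbook.createSheet("Demo");
        Row row = sheet.createRow(0);                // row and cell indexes are 0-based
        row.createCell(0).setCellValue("Item");      // text cell
        row.createCell(1).setCellValue(42.5);        // numeric cell

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        workbook.write(out);                         // serialize the whole workbook
        return out.toByteArray();
    }

    /** Parses the bytes again and returns the first row as "text=number". */
    static String readBack(byte[] xlsBytes) throws IOException {
        Workbook workbook = new HSSFWorkbook(new ByteArrayInputStream(xlsBytes));
        Row row = workbook.getSheet("Demo").getRow(0);
        return row.getCell(0).getStringCellValue()
                + "=" + row.getCell(1).getNumericCellValue();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readBack(writeWorkbook())); // prints "Item=42.5"
    }
}
```

Because the code talks to the `Workbook`, `Sheet`, and `Row` interfaces, swapping `HSSFWorkbook` for `XSSFWorkbook` should be the main change needed to produce XLSX instead, provided the `poi-ooxml` artifact is also available. Newer POI releases additionally let you close a workbook when you are done with it; that step is left out here to keep the sketch short.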
Commonly used components of Apache POI:
- HSSF (Horrible Spreadsheet Format) : It is used to read and write the XLS format of MS Excel files.
- XSSF (XML Spreadsheet Format) : It is used to read and write the XLSX format of MS Excel files.
- POIFS (Poor Obfuscation Implementation File System) : This component is the foundation of the other POI components. It reads and writes the OLE2 container in which the binary Office formats are stored.
- HWPF (Horrible Word Processor Format) : It is used to read and write .doc files of MS Word.
- HSLF (Horrible Slide Layout Format) : It is used to read, create, and edit PowerPoint presentations.
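To make the role of POIFS more concrete, here is a minimal sketch that treats an OLE2 container as a small file system: it stores one named stream, serializes the container, and reads the stream back. The class name `PoifsRoundTrip` and the stream name `Notes` are invented for illustration, and the calls are the ones I believe `POIFSFileSystem` offers in the `poi` artifact; check them against the version you use.

```java
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Apache POI.
class PoifsRoundTrip {

    /** Creates an OLE2 container holding a single named stream. */
    static byte[] writeContainer(String streamName, String text) throws IOException {
        POIFSFileSystem fs = new POIFSFileSystem();
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        fs.createDocument(new ByteArrayInputStream(payload), streamName);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        fs.writeFilesystem(out);                     // serialize the container
        return out.toByteArray();
    }

    /** Opens the container again and returns the named stream as text. */
    static String readStream(byte[] container, String streamName) throws IOException {
        POIFSFileSystem fs = new POIFSFileSystem(new ByteArrayInputStream(container));
        InputStream in = fs.createDocumentInputStream(streamName);
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {     // copy until end of stream
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] container = writeContainer("Notes", "hello from POIFS");
        System.out.println(readStream(container, "Notes"));
    }
}
```

HSSF, HWPF, and HSLF sit on top of this layer: each stores its own streams inside such a container, which is why POIFS is described as the base component.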
Environment
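Apache POI runtime dependencies: The `poi` artifact covers POIFS and HSSF; XSSF and SXSSF live in the separate `poi-ooxml` artifact, and HWPF and HSLF in `poi-scratchpad`, all under the same groupId. If you are working on a Maven project, you can include the POI dependency in the pom.xml file like this: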
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.9</version>
</dependency>
To add this in Eclipse, go to:
Window -> Show View -> Other -> Maven -> Maven Repositories